How to Import an Excel worksheet into Python using Pandas












Worksheet

A worksheet is a collection of cells where you keep and manipulate the data. Each Excel workbook can contain multiple worksheets.


One of the ways to read an Excel file is loading the file, in our case linear_regresion_data.xlsx, by using pd.ExcelFile, and after that call each of the sheets that you have in your spreadsheet  or workbook.

# Load Excel File
xlsx = pd.ExcelFile(r'C:\linear_regresion_data.xlsx')
df1 = pd.read_excel(xlsx, 'Set 1')
df2 = pd.read_excel(xlsx, 'Set 2')
df3 = pd.read_excel(xlsx, 'Set 3')
df4 = pd.read_excel(xlsx, 'Set 4')
df5 = pd.read_excel(xlsx, 'Set 5')
df6 = pd.read_excel(xlsx, 'Set 6')

# Preview datasets
display('Set 1', df1.head())
display('Set 2', df2.head())
display('Set 3', df3.head())
display('Set 4', df4.head())
display('Set 5', df5.head())
display('Set 6', df6.head())

As you can see in the example above, this is an inefficient way to read an excel file, especially if you have many sheets in your document. One way to handle this is by creating a function that use a list to keep track of the sheets.

def multi_excel_loader(filepath):
    # Load Excel file using Pandas
    f = pd.ExcelFile(filepath)
    # Empty list to store individual DataFrames
    list_of_dfs = []
    # Iterate through each worksheet
    for sheet in f.sheet_names:
        df = f.parse(sheet)
        list_of_dfs.append(df)
    return list_of_dfs

# Load the assignment dataframe
filepath = 'C:\linear_regresion_data.xlsx'
worksheet = multi_excel_loader(filepath)
# load set_1
df_set1 = worksheet[0]
df_set1.head()

In the last snip of code a spreadsheet  is load into a worksheet list, this allowed you to call each sheet with a position list.



Comments