How to Import an Excel worksheet into Python using Pandas
Worksheet
A worksheet is a collection of cells where you keep and manipulate the data. Each Excel workbook can contain multiple worksheets.
One of the ways to read an Excel file is loading the file, in our case linear_regresion_data.xlsx, by using pd.ExcelFile, and after that call each of the sheets that you have in your spreadsheet or workbook.
# Load Excel File
xlsx = pd.ExcelFile(r'C:\linear_regresion_data.xlsx')
df1 = pd.read_excel(xlsx, 'Set 1')
df2 = pd.read_excel(xlsx, 'Set 2')
df3 = pd.read_excel(xlsx, 'Set 3')
df4 = pd.read_excel(xlsx, 'Set 4')
df5 = pd.read_excel(xlsx, 'Set 5')
df6 = pd.read_excel(xlsx, 'Set 6')
# Preview datasets
display('Set 1', df1.head())
display('Set 2', df2.head())
display('Set 3', df3.head())
display('Set 4', df4.head())
display('Set 5', df5.head())
display('Set 6', df6.head())
As you can see in the example above, this is an inefficient way to read an excel file, especially if you have many sheets in your document. One way to handle this is by creating a function that use a list to keep track of the sheets.
def multi_excel_loader(filepath):
# Load Excel file using Pandas
f = pd.ExcelFile(filepath)
# Empty list to store individual DataFrames
list_of_dfs = []
# Iterate through each worksheet
for sheet in f.sheet_names:
df = f.parse(sheet)
list_of_dfs.append(df)
return list_of_dfs
# Load the assignment dataframe
filepath = 'C:\linear_regresion_data.xlsx'
worksheet = multi_excel_loader(filepath)
# load set_1
df_set1 = worksheet[0]
df_set1.head()
In the last snip of code a spreadsheet is load into a worksheet list, this allowed you to call each sheet with a position list.
Comments
Post a Comment