How to loop in folder through all excel files and all sheets using pandas?

Joined
Dec 1, 2022
Messages
1
Reaction score
0
Using python 3. I need to loop through a folder that contains excel files and each file has multiple sheets. How do I loop through all the files and all the sheets and extract to a dataframe?

What I was able to accomplish only returns one excel file and all the worksheets for that file but I need for all files. Please help.

This is what I have so far:

from xlsxwriter import Workbook
import pandas as pd
import openpyxl
import glob
import os

path = 'filestoimport/*.xlsx'

for filepath in glob.glob(path):
xl = pd.ExcelFile(filepath)


# Define an empty list to store individual DataFrames
list_of_dfs = []
list_of_dferror= []
for sheet_name in xl.sheet_names:
df = xl.parse(sheet_name, usecols='A,D,N,B,C,E,F,G,H,I,J,K,L,M', header=0)
df.columns = df.columns.str.replace(' ', '')

df['sheetname'] = sheet_name # this adds `sheet_name` into the column

# using basename function from os
# module to print file name
file_name = os.path.basename(filepath)
df['sourcefilename'] = file_name

# only add sheets containing columns ['Status', 'ProjectID']
column_names = ['Status', 'ProjectID']
if set(column_names).issubset(df.columns):
df['Status'].fillna('', inplace=True)
df['Addedby'].fillna('', inplace=True)
# And append it to the list
list_of_dfs.append(df)

# Combine all DataFrames into one
data = pd.concat(list_of_dfs, ignore_index=True)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,060
Latest member
BuyKetozenseACV

Latest Threads

Top