While creating the report or collating the results from multiple excel file it is often tedious to do this repeated task of copying the data from each file and pasting into the single main file.
Here we are providing code snippet in Python to merge the data from all excel sheet present in the folder and create a new file with all the data in it.
Merge Excel Files |
Libraries : Pandas, Openpyxl
Here we are running the code in the folder which contain multiple excel files.
In the current working directory automation will generate a new file which contains all the data from multiple excels in the folder.In the new version of pandas library while reading file we need to use 'engine' parameter as 'openpyxl'.
Below is the code to install libraries in command prompt.
- 1. Install pandas using pip
pip install pandas
pip install openpyxl
Below is the code to merge the file:
'''This code we need to run in the folder containing exel files.'''
#Import liraries
import os
import pandas as pd
import openpyxl
#Get current working directory
cwd = os.path.abspath('')
files = os.listdir(cwd)
#Create a dataframe using pandas
df = pd.DataFrame()
#Read all the files in the folder using pandas
for file in files:
if file.endswith('.xlsx'):
df = df.append(pd.read_excel(file, engine='openpyxl'), ignore_index=True)
#Create a new file with all the data in it.You can change the name of file.
df.to_excel('MergedFile.xlsx')
After running the code, check for MergerFile.xlsx in the current working folder.
Very helpful
ReplyDelete