How To Read Multiple Excel Files In Python

We discussed how to read data from a single Excel file. Next we'll learn how to read multiple Excel files into Python using the pandas library.

I personally use the following two methods, and depending on the situation I choose one over the other.

Method 1: Get Files From Folder – PowerQuery style

Excel PowerQuery has a feature, "Get Data From Folder", that allows us to load all files from a specific folder. We can do the same thing easily in Python. The workflow goes like this:

  • Given a folder, find all files within it.
  • Narrow down the file selection: which files do I need to load?
  • Load data from the selected files, one by one.

To accomplish the above workflow, we'll need the os and pandas libraries. The os library provides ways to interact with your computer's operating system, such as finding out what files exist in a folder. os.listdir() returns a list of all file names (strings) within a specific folder. Once we have the list of file names, we can iterate through them and load data into Python.

    import os
    import pandas as pd

    folder = r'C:\Users\JZ\Desktop\PythonInOffice\python_excel_series_read_multiple_excel_files'
    files = os.listdir(folder)  # file names only, not full paths

    for file in files:
        if file.endswith('.xlsx'):
            # df holds the current file's data and is overwritten on each pass
            df = pd.read_excel(os.path.join(folder, file))

    >>> files
    ['information.pdf', 'File_1.xlsx', 'File_2.xlsx', 'File_3.xlsx', 'header-groundwork.PNG', 'multiple files.py']

Our working folder contains various file types (PDF, Excel, image, and Python files), but the file.endswith('.xlsx') check makes sure that we read only the Excel files into Python.
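As a small sketch of the filtering step, the check can be pulled into a helper. The function name select_excel_files is hypothetical, and two details go beyond the original loop: the match is case-insensitive, and names starting with "~$" (the temporary lock files Excel creates while a workbook is open) are skipped. The '~$File_1.xlsx' entry in the sample list is an invented example added to show that behaviour.

```python
def select_excel_files(names):
    """Return the names that look like real .xlsx workbooks."""
    selected = []
    for name in names:
        is_xlsx = name.lower().endswith(".xlsx")  # also matches .XLSX
        is_lock_file = name.startswith("~$")      # Excel's temporary lock file
        if is_xlsx and not is_lock_file:
            selected.append(name)
    return selected


# File names from the folder listing above, plus one hypothetical lock file.
files = ['information.pdf', 'File_1.xlsx', 'File_2.xlsx', 'File_3.xlsx',
         'header-groundwork.PNG', 'multiple files.py', '~$File_1.xlsx']

excel_files = select_excel_files(files)
print(excel_files)  # ['File_1.xlsx', 'File_2.xlsx', 'File_3.xlsx']
```

Whether you need the extra checks depends on your folder; the plain endswith('.xlsx') test is enough when the file names are consistent and no workbook is open.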

os.path.join() provides a reliable way to build a file path, because it inserts the correct separator for the operating system. It should be used wherever possible, instead of folder + "\" + file .
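To illustrate the difference, here is a minimal comparison. The folder name "reports" is a made-up placeholder, not a path from this article.

```python
import os

folder = "reports"       # hypothetical folder name
file = "File_1.xlsx"

# os.path.join picks the separator for the operating system running the code
joined = os.path.join(folder, file)

# Manual concatenation hard-codes the Windows separator
manual = folder + "\\" + file

print(joined)  # reports\File_1.xlsx on Windows, reports/File_1.xlsx elsewhere
print(manual)  # always reports\File_1.xlsx
```

On Windows both strings are the same, but on macOS or Linux only the os.path.join() version points to a file inside the folder.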

Method 2: Using an Excel input file

The second method requires a separate Excel file that acts as an "input file". It contains links to the individual files that we intend to read into Python. To replicate the example we just walked through, we need to create an Excel file that is essentially just a column with links to the other files.

I like this method a lot, because:

  • I can organize and store information (file names, links, etc.) in an environment (a spreadsheet) I'm familiar with.
  • If I need to update or add new files to be read, I just need to update the input file. No code change is required.

The workflow is similar to the previous method. First we need to let Python know the file paths, which can be obtained from the input file.

    df_files = pd.read_excel('Excel_input.xlsx')

    >>> df_files
                                               File path
    0  C:\Users\JZ\Desktop\PythonInOffice\python_exce...
    1  C:\Users\JZ\Desktop\PythonInOffice\python_exce...
    2  C:\Users\JZ\Desktop\PythonInOffice\python_exce...

This is basically a simple dataframe with only one column, which contains the file links. Now we can iterate through that column and read the Excel files.

    for file in df_files['File path']:
        df = pd.read_excel(file)
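Note that the loop reuses the name df, so only the last file's data remains once it finishes. One possible way to keep everything is sketched below. The helper names read_all and combine are hypothetical, the reader parameter is an addition that lets the same helper work with pd.read_csv, and combine assumes all files share the same columns.

```python
import pandas as pd


def read_all(paths, reader=pd.read_excel):
    """Read every path and return a dict of {path: DataFrame}."""
    # A dict keeps each file's data separate and remembers where it came from
    return {path: reader(path) for path in paths}


def combine(frames):
    """Stack the DataFrames from read_all() into a single DataFrame."""
    # ignore_index=True renumbers the rows 0..n-1 across all files
    return pd.concat(list(frames.values()), ignore_index=True)
```

With the input file from above, frames = read_all(df_files['File path']) would hold one dataframe per file, and combine(frames) would stack them into a single table.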

When to use Get Files From Folder vs Excel Input File

I ask two simple questions when deciding which method to use.

  1. Does the source folder contain extra files that I don't need?
    • For example, if a folder contains 20 csv files and I need only 10 of them, it's probably easier to use the Excel Input File method. Editing an Excel input file is much easier and faster than writing code to handle different scenarios in Python.
    • However, if the folder contains 50 files, of which 20 are csv, and I need them all, then I'll use the Get Files From Folder method, because we can easily select all the .csv files from the list of files.
  2. Do all the files live inside the same folder?
    • If files are in different folders, it makes more sense to use an Excel Input File to store the file paths.
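For the case where every .csv file in a folder is needed, here is a rough sketch of the selection step. It uses the standard-library glob module rather than the os.listdir() plus endswith() approach shown earlier, and the function name list_csv_files is hypothetical.

```python
import glob
import os


def list_csv_files(folder):
    """Return the full paths of all .csv files directly inside folder, sorted."""
    # glob.escape guards against special characters such as [ ] in the folder name
    pattern = os.path.join(glob.escape(folder), "*.csv")
    return sorted(glob.glob(pattern))
```

Each returned path can be passed straight to pd.read_csv(). Unlike os.listdir(), glob returns paths that already include the folder, so no os.path.join() call is needed inside the loop.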

Source: https://pythoninoffice.com/read-multiple-excel-files-into-python/

Posted by: daltonthisharm.blogspot.com
