How to Use pandas and openpyxl Together
So far, you have learned that pandas
and openpyxl
are specialized in handling tabular data.
pandas is excellent for processing and analyzing large datasets, while openpyxl allows for precise formatting of Excel files.
In this lesson, we will explore how to handle Excel files using these two libraries together.
Using pandas and openpyxl Together
The following code example demonstrates how to process data with pandas, save it to an Excel file using openpyxl, and apply additional formatting.
Data processing and Excel formatting with pandas and openpyxl
import pandas as pd
from openpyxl import load_workbook
# Create a pandas DataFrame
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35]
})
# Save the DataFrame to an Excel file using pandas (create the file with ExcelWriter)
file_name = 'output_file.xlsx'
with pd.ExcelWriter(file_name, engine='openpyxl') as writer:
df.to_excel(writer, index=False, sheet_name='Sheet1')
# Additional modifications using openpyxl
wb = load_workbook(file_name) # Load the Excel file
ws = wb['Sheet1'] # Select the sheet to be modified
# For example, add a new value to cell A1
ws['A1'] = 'Updated Name'
# Save the Excel file
wb.save(file_name)
print("The task is completed")
The code above works as follows:
Detailed Explanation of Code
1. Create a DataFrame with pandas
Create a pandas DataFrame
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35]
})
- A
DataFrame
is created using thepandas
library. - There are two columns,
Name
andAge
, which store the names and ages respectively.
2. Save DataFrame to an Excel File Using pandas
Save the DataFrame to an Excel file using pandas
file_name = 'output_file.xlsx'
with pd.ExcelWriter(file_name, engine='openpyxl') as writer:
df.to_excel(writer, index=False, sheet_name='Sheet1')
- Use
ExcelWriter
to save theDataFrame
to an Excel file. file_name
is the name of the Excel file, set tooutput_file.xlsx
.engine='openpyxl'
: specifiesopenpyxl
as the engine for handling the Excel file.- The method
to_excel()
converts theDataFrame
to an Excel file. index=False
: excludes the index when saving.sheet_name='Sheet1'
: specifies "Sheet1" as the sheet name where the data will be saved.
3. Open Excel File and Select Sheet Using openpyxl
Additional modifications using openpyxl
wb = load_workbook(file_name) # Load the Excel file
ws = wb['Sheet1'] # Select the sheet to be modified
- Load the previously created Excel file (
output_file.xlsx
) usingopenpyxl
'sload_workbook()
method. wb
refers to the Workbook object that represents the Excel file.ws = wb['Sheet1']
loads and selects the "Sheet1" sheet.ws
refers to the Worksheet object representing the selected sheet.
4. Modify Cell Value
Add new value to cell A1
ws['A1'] = 'Updated Name'
ws['A1']
refers to cell A1 in the sheet.- It changes the value of cell A1 to
'Updated Name'
. - This code overwrites any existing value in cell A1 (e.g., 'Name').
5. Save the Excel File
Save the Excel file using wb.save()
wb.save(file_name)
- Save the modified Excel file using the
save()
method. - It updates the changes in the specified
file_name
, 'output_file.xlsx'.
Want to learn more?
Join CodeFriends Plus membership or enroll in a course to start your journey.