Saving Collected Data as a CSV File
Saving data collected through web scraping as a CSV file allows you to easily view the data in Excel or any text editor. In this lesson, we'll learn how to save data collected using BeautifulSoup into a CSV file.
Code for Saving a CSV File
# Import the standard-library modules for CSV writing and in-memory text buffers
import csv
import io
# Placeholder data standing in for the results collected with BeautifulSoup (a list of dictionaries)
company_info = [{'CompanyName': 'ExampleCorp', 'Founder': 'Jane Doe', 'YearEstablished': 1999}]
# Create a StringIO object, which acts like a file but lives in memory (it handles strings)
output = io.StringIO()
# Declare the list of field names for the CSV file
fieldnames = ['CompanyName', 'Founder', 'YearEstablished']
# Create a CSV writer object, specifying the field names so rows can be written as dictionaries
writer = csv.DictWriter(output, fieldnames=fieldnames)
# Write the field names as the first line of the CSV file (the header)
writer.writeheader()
# Write multiple rows of company data at once (company_info is a list of dictionaries)
writer.writerows(company_info)
# Output the result in CSV format
print(output.getvalue())
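With the placeholder company_info above, running this code prints the header row followed by one data row:

CompanyName,Founder,YearEstablished
ExampleCorp,Jane Doe,1999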
Code Explanation
1. output = io.StringIO()
- io.StringIO() is a class provided by Python's io module; it creates an object that acts like a file in memory.
- Typically, CSV data is saved to the file system, but this example uses StringIO() to handle the strings in memory without creating a file.
- The output variable acts as a "memory buffer", temporarily holding the CSV data to be written.
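For example, a StringIO buffer supports the same write and read operations as a file object (a minimal sketch, unrelated to the scraping data):

import io
buffer = io.StringIO()       # an in-memory text "file"
buffer.write('hello, ')      # works just like file.write()
buffer.write('world')
print(buffer.getvalue())     # hello, world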
2. writer = csv.DictWriter(output, fieldnames=fieldnames)
- csv.DictWriter() is a class provided by Python's csv module; it helps write dictionary data in CSV format.
- csv.DictWriter(output, fieldnames=fieldnames) creates a CSV writer object that writes into the output StringIO object.
- fieldnames is a list specifying the field names to be written as the first line of the CSV file.
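DictWriter matches each dictionary's keys against fieldnames, so the column order always follows fieldnames rather than the order of keys in the dictionary; a small sketch using the same placeholder values as above:

import csv, io
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['CompanyName', 'Founder', 'YearEstablished'])
writer.writeheader()
# The keys appear in a different order here, but the columns still follow fieldnames
writer.writerow({'Founder': 'Jane Doe', 'YearEstablished': 1999, 'CompanyName': 'ExampleCorp'})
print(buffer.getvalue())
# CompanyName,Founder,YearEstablished
# ExampleCorp,Jane Doe,1999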
3. writer.writeheader()
- writer.writeheader() is a method that writes the field names as the first line (header) of the CSV file.
- Calling it writes the names listed in fieldnames, in order, as the header row.
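For reference, right after writeheader() the buffer contains only the field names joined by commas (the csv module ends each row with \r\n by default):

import csv, io
header_only = io.StringIO()
csv.DictWriter(header_only, fieldnames=['CompanyName', 'Founder', 'YearEstablished']).writeheader()
print(repr(header_only.getvalue()))  # 'CompanyName,Founder,YearEstablished\r\n'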
4. writer.writerows(company_info)
- writer.writerows() is a method that writes multiple rows of data to the CSV file at once.
- company_info is a list of dictionaries, with each dictionary representing one row in the CSV file.
- Calling writer.writerows(company_info) writes all the data stored in company_info to the CSV output.
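Conceptually, writerows() just calls writerow() once per dictionary, so the call above is equivalent to the following loop (using the writer and company_info from the code above):

# Equivalent to writer.writerows(company_info)
for row in company_info:
    writer.writerow(row)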
5. print(output.getvalue())
- output.getvalue() is a method that returns the data stored in the StringIO object as a single string.
- Calling print(output.getvalue()) prints that string, letting you see the results in CSV format.
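Since the goal is ultimately a CSV file you can open in Excel, here is a minimal sketch of the same steps writing to a file on disk instead of an in-memory buffer (the file name companies.csv is just an example; newline='' is what the csv module documentation recommends for file objects, and encoding='utf-8-sig' helps Excel detect UTF-8):

import csv
# company_info: the list of dictionaries collected with BeautifulSoup, as in the code above
fieldnames = ['CompanyName', 'Founder', 'YearEstablished']
with open('companies.csv', 'w', newline='', encoding='utf-8-sig') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(company_info)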
We have learned how to save the data collected through web scraping as a CSV file.
Utilizing web scraping effectively can make complicated and repetitive data collection tasks much more efficient.
In the next chapter, we will explore how to apply email automation in practical scenarios.