Python CSV: How to Read and Write CSV Files in Python

Article Summary

Working with Python CSV files means learning to read, write, and parse tabular data using Python's built-in `csv` module or pandas. This article covers both tools, including reader, DictReader, writer, and DictWriter methods. You'll gain practical skills to handle any CSV format confidently.

To retrieve and store the information your application uses, it takes more than just your programming language, keyboard, and console. If you are using Python, you have a few options, including the Python CSV module that will allow you to store the data your app uses and processes as a CSV (Comma Separated Values) file. While many applications use a database, using a CSV file might be better at times, like when you are generating data for another application that is expecting this format. In this article, you’ll learn how to read and write the data in a CSV file with Python in two different ways: using the Python CSV module and using the third-party Pandas library.

What is a CSV file?

A CSV file is a plain text file used for storing tabular data. Because it is plain text, you can open a CSV file in a text editor to edit it. You can also open CSV files in Microsoft Excel and other spreadsheet applications to see the data displayed in a tabular format.

The name of a CSV file hints at what it looks like (i.e., Comma Separated Values). Here is the contents of an example CSV file:

name,city,state,birthday month
Bob,Kansas City,MO,March
Jane,St. Louis,MO,January
Rick,New York,NY,April

Recommended Udemy course

The Complete Python Bootcamp From Zero to Hero in Python

Jose Portilla, Pierian Training

And here is how it looks when you open up this file in Excel:

Notice that a line break separates each data set, and the first line contains the headings for each column. Commas then separate the data on each line.

In standard CSV format, the character to separate each value, also known as the delimiter, is a comma, but not all CSV files use a comma as a delimiter. Sometimes, a tab, semi-colon, or colon is used, especially if commas are in the data.

The data values can also be surrounded by single quotes or double quotes, which allows you to use commas in the data value itself. Here is an example CSV line where this would be useful:

“Bob Smith”,”Kansas City”, MO”

Sometimes CSV files use escape characters, which basically tell the code processing the file that the character after the escape character should be processed as normal text and not as a delimiter or a quote surrounding a value. Here is an example of using escape characters in the line of a CSV file:

“Bob”, “I say, “Let’s go now!””

Here, a backslash () is used as an escape character, which means the parser will understand to ignore the double-quote right after it and consider it as text. Here is how the second value in this line will look when processed:

I say, “Let’s go now!”

Note that the comma does not need an escape character in this example because double quotes surround the value.

All types of applications use CSV files, including spreadsheets and databases, for ingesting data and transferring data from one application to another. Since they are in plain text, they are easy to manipulate programmatically, and any programming language with text processing capabilities can process them.

Handling CSV files with Python CSV

The Python programming language ships with a CSV module that allows for reading and writing CSV files without installing a third-party library. By default, it is designed to handle the CSV file format generated by Microsoft Excel, but it can be configured to read and write any type of CSV file.

Reading CSV files using the CSV module

To parse CSV data, you can use the CSV module’s reader object. Let’s say we have the first CSV example with the name, city, state, and birthday month values in a file called names.csv. Here is a Python code example we can use to parse the file:

# Import the csv module
import csv

# Use Python’s built-in open function which opens file objects
with open(‘names.csv’) as csv_file:
# Create a CSV Reader object
reader = csv.reader(csv_file)
# Set a line count so we know which line has the headings
line_count = 0
for row in reader:
# Print out the headings from the first line of the file
if line_count == 0:
print(f’The column headings are {“, “.join(row)}’)
line_count += 1
# Print out the data from the remaining lines
else:
print(f'{row[0]} lives in {row[1]},{row[2]}, and was born in {row[3]}.’)
line_count += 1
print(f’CSV File contains {line_count} lines.’)

When this Python script runs, this is what it prints out:

The column headings are name, city, state, birthday month
Bob lives in Kansas City,MO, and was born in March.
Jane lives in St. Louis,MO, and was born in January.
Rick lives in New York,NY, and was born in April.
CSV File contains 4 lines.

The reader object returns rows, which are a list of string items, and since the first row contains the heading for each column of data, it’s processed differently. The comments in the example code explain more of what is happening in detail.

Reading CSV files into a dictionary with the CSV module

The last method we used for parsing the CSV file worked, but it required dealing with lists of strings. A cleaner way of parsing the CSV is reading the data into Python dictionaries. We will use the same CSV data that we processed in the last script, but this time will use the CSV module’s DictReader object. Here is the code:

# Import the csv module
import csv

# Open the csv file with Python’s built in open() method
with open(‘names.csv’) as csv_file:
# Create a CSV DictReader object
reader = csv.DictReader(csv_file)
# Get the heading names with reader.fieldnames
print(f’The column headings are {“, “.join(reader.fieldnames)}’)
# Set a line counter
line_count = 0
# Print out the lines of the file
for row in reader:
print(f'{row[“name”]} lives in {row[“city”]},{row[“state”]}, and was born in {row[“birthday month”]}.’)
line_count += 1
print(f’CSV File contains {line_count} lines.’)

This prints out the same results as the last code:

But when you use the DictReader method, it automatically parses the headings to create the dictionaries. So instead of printing out the first line to get the headings, we must use the fieldnames attribute of the DictReader object. When we access the values in each row, we can access them by the heading name now instead of numerical indexes. Also, note that the DictReader only iterates the rows of data, not the heading line.

Reading custom CSV files with Python CSV

The CSV file we are currently using has the standard CSV format that the Python CSV module expects, so we can just pass the file to the reader object, and it works. But if you have a CSV file that uses a different delimiter, wraps the data in quotes, or uses an escape character, you must add extra parameters to either the reader or DictReader method calls in order to process the file correctly.

If we have a CSV file that uses tab delimiters (often called a TSV file), wraps the values in double-quotes, and uses a backslash as an escape character, we will have to set these extra parameters like this:

# Using reader
with open(‘names.csv’) as csv_file:
reader = csv.reader(csv_file, delimiter=’t’, quotechar='”‘, escapechar=””)

# Using DictReader
with open(‘names.csv’) as csv_file:
reader = csv.DictReader(csv_file, delimiter=’t’, quotechar='”‘, escapechar=””)

Writing CSV files using Python CSV

To write data to a CSV file using the Python CSV module, you can use one of the writer objects: writer or DictWriter. We will start by using writer, which uses Python lists to create the rows of the CSV file. The writer method also accepts the same parameters as the reader method: delimiter, quotechar, and escapechar. This allows you to generate any CSV format you choose. It also accepts a quoting parameter, which we can see in action in the example below:

# Open the file for writing with the Python open method
with open(‘example.csv’, mode=’w’) as example_file:
writer = csv.writer(example_file, delimiter=’,’, quotechar='”‘, quoting=csv.QUOTE_MINIMAL)
# Write the header row with the writerow method
writer.writerow([‘name’, ‘city’, ‘state’, ‘birthday month’])
# Write the data rows with the writerow method
writer.writerow([‘Bob’, ‘Kansas City’, ‘MO’, ‘March’])
writer.writerow([‘Jane’, ‘St. Louis’, ‘MO’, ‘January’])
writer.writerow([‘Rick’, ‘New York’, ‘NY’, ‘April’])

The code above will generate a file that looks the same as the one we have been using. There are no quotes in it, even though we set the quotechar to double quotes. This is because the quoting parameter is set to csv.QUOTE_MINIMAL. This parameter tells the writer how to use quotes and has four possible values:

csv.QUOTE_MINIMAL. This is the default setting, and the writer will only apply quotes if a field contains the delimiter or quotechar.
csv.QUOTE_ALL. The writer will apply quotes to every field.
csv.QUOTE_NONNUMERIC. The writer will quote all text fields and convert any numeric data to the float data type.
csv.QUOTE_NONE. The writer will escape delimiters instead of using quotes. An escapechar parameter is required if you use this value.

Writing CSV files into a dictionary with CSV

Just like we can use the DictReader method instead of the reader method to use dictionaries for our data, we can use the DictWriter method instead of the writer method. Here is an example writing the same CSV file:

# Import the csv module
import csv

# Open the file for writing with the Python open method
with open(‘example.csv’, mode=’w’) as example_file:
# List of the headings
fieldnames = [‘name’, ‘city’, ‘state’, ‘birthday month’]
# Create the DictWriter object
writer = csv.DictWriter(example_file, fieldnames=fieldnames, delimiter=’,’, quotechar='”‘, quoting=csv.QUOTE_MINIMAL)
# Write the header row with the writeheader method
writer.writeheader()
# Write the data rows with the writerow method
writer.writerow({‘name’: ‘Bob’, ‘city’: ‘Kansas City’, ‘state’: ‘MO’, ‘birthday month’: ‘March’})
writer.writerow({‘name’: ‘Jane’, ‘city’: ‘St. Louis’, ‘state’: ‘MO’, ‘birthday month’: ‘January’})
writer.writerow({‘name’: ‘Rick’, ‘city’: ‘New York’, ‘state’: ‘NY’, ‘birthday month’: ‘April’})

The DictWriter method accepts all the parameters that the writer method does, but it has another required parameter: fieldnames. This parameter is a list of the names of the headings in your CSV file that also are the keys in the dictionaries you will be using to create the rows of data. To print the header row, we call the writeheader method instead of manually creating a row like we did with the writer method.

Handling CSV files with pandas

Another option for more advanced CSV file handling is using the Python pandas module. This module provides advanced analysis and manipulation functionality for tabular data and is often used in combination with machine learning, data analysis, and data visualization libraries. It is also open source and free to use, but you will have to install it separately with your Python package manager.

If you are using pip, you can install it with this command:

pip install pandas

And if you are using Anaconda, you can install it by running this command:

conda install pandas

Reading CSV files with pandas

Reading a CSV file with pandas is really easy. Just make sure you have the pandas module installed, import the module, call the pandas read_csv method on the file, and print the results. You don’t have to use the Python open method or loop through the lines in the file. It’s all done for you. Here is an example:

# Import the pandas module
import pandas

# Read the csv file into a DataFrame
dataframe = pandas.read_csv(‘names.csv’)

print(dataframe)

And here is the result:

name city state birthday month
0 Bob Kansas City MO March
1 Jane St. Louis MO January
2 Rick New York NY April

When pandas reads a CSV file, it puts the data in a dataframe. A dataframe is a two-dimensional labeled data structure, kind of like a spreadsheet or database that is in memory. To learn more about dataframes, read our article on creating dataframes with pandas.

Writing CSV files with pandas

You can also create CSV files in pandas from a dataframe. Here is an example:

# Import the pandas module
import pandas

# Create a dataframe
dataframe = pandas.DataFrame([
[‘Bob’, ‘Kansas City’, ‘MO’, ‘March’],
[‘Jane’, ‘St. Louis’, ‘MO’, ‘January’],
[‘Rick’, ‘New York’, ‘NY’, ‘April’]],
columns=[‘name’, ‘city’, ‘state’, ‘birthday month’])

# Write dataframe to a CSV file
dataframe.to_csv(‘example.csv’, index=False)

This code will recreate the same CSV we have used for this whole article. The first parameter in pandas.DataFrame contains the data we want to write to the CSV as a list of lists. Each of the nested lists will become a data row. By setting the named columns parameter, we define the columns in the dataframe that will become the headers of the CSV file.

Then we call the to_csv method on the dataframe we created to generate the CSV file. The first parameter is the path of the file we want to create, and the named index parameter that is set to false tells pandas to not include the indexes it created for the dataframe. Without this, our CSV file’s first column will have an empty header and an index number for each row.

To learn more about using pandas with CSV files, see our guide to exporting pandas to CSV.

Conclusion

The CSV file format is a handy universal format for transferring data from one application to another. You can simply open the file in Excel or a text editor to examine the data you’re working with instead of needing a proprietary application. If you want to simply read and write CSV files with Python, that functionality comes built-in and is simple to use. If you plan on doing data analysis, however, the Python pandas module may be the best choice because of all the data manipulation features it has. To learn more about using the Python programming language, see our article on command line arguments in Python, and if you are ready to build real-world skills using Python, our Python courses will help you build them.