Alexander Hagmann

Knowing how to export Pandas DataFrames to a CSV file is an essential skill in every data scientist’s toolkit. Pandas is a Python-based data manipulation tool that is popular in data science. Data specialists use DataFrames, a common Pandas object that represents a table, to merge, manipulate, and analyze tabular data.

At the end of a Pandas coding session, you will need to save your data and progress. The most common way to do this is to write DataFrames to a CSV file, which is nothing more than a simple text file and the easiest way to store and exchange tabular data. The CSV file format is so popular because it’s widely supported by other applications, including Excel, OpenOffice, and Tableau.

Some typical use cases for exporting DataFrames to CSV include:

Basics of exporting Pandas DataFrames to CSV files

To see how the export works, we first need a DataFrame df. As a first step, we have to import the Pandas library with import pandas as pd:

import pandas as pd 

With pd.DataFrame() we can create a simple DataFrame object.  

df = pd.DataFrame(data = {"Name": ["Lionel Messi", "Cristiano Ronaldo",
                                   "Neymar Junior", "Kylian Mbappe",
                                   "Manuel Neuer"],
                          "Country": ["Argentina", "Portugal", "Brazil",
                                      "France", "Germany"],
                          "Height_m": [1.70, 1.87, 1.75, 1.78, 1.93]
                         })
df

A DataFrame is a 2-dimensional labeled data structure. In our example, df has five rows and three columns. Each row represents a soccer player and each column contains information on the players. The ‘column’ on the left side isn’t a column. It’s the index of the DataFrame. The index labels the rows. If not specified, DataFrames have a RangeIndex with ascending integers. At the top of the DataFrame are the column headers.

To write DataFrames to CSV files we can use the DataFrame method to_csv(). A straightforward example is: 

df.to_csv("players.csv")

This creates the CSV file players.csv. When opening the file, we can see the following structure:

,Name,Country,Height_m
0,Lionel Messi,Argentina,1.7
1,Cristiano Ronaldo,Portugal,1.87
2,Neymar Junior,Brazil,1.75
3,Kylian Mbappe,France,1.78
4,Manuel Neuer,Germany,1.93

A CSV file is a delimited text file that uses a comma to separate values. You can still see the tabular data structure. Each line of the file is a data record – the soccer player. Each record consists of one or more values — player information — separated by commas.
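
To inspect this structure without opening a file, you can call to_csv() without a path; in that case the method returns the CSV text as a string. The following is a minimal sketch that rebuilds the example DataFrame from above (the variable names csv_text and lines are only for illustration):

```python
import pandas as pd

# Same example data as above.
df = pd.DataFrame({
    "Name": ["Lionel Messi", "Cristiano Ronaldo", "Neymar Junior",
             "Kylian Mbappe", "Manuel Neuer"],
    "Country": ["Argentina", "Portugal", "Brazil", "France", "Germany"],
    "Height_m": [1.70, 1.87, 1.75, 1.78, 1.93],
})

# Without a path, to_csv() returns the CSV text instead of writing a file.
csv_text = df.to_csv()
lines = csv_text.splitlines()

print(lines[0])  # header line; the leading comma belongs to the unnamed index
print(lines[1])  # first data record
```

The first line holds the column headers, and every following line holds one record.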

Depending on the use case, we can customize the export. The method to_csv() provides several options (parameters) to fine-tune the final output.

5 ways to customize Pandas to CSV

  1. Define file name and location

The first and most important parameter is path_or_buf. Here you can define the filename, the file type, and the location of the file.

players is an appropriate filename for our example, but you may choose a different one. Avoid whitespace (football players) and special characters. Use underscores if your filename contains two or more words (football_players).

Use the CSV filetype (.csv) if not specified otherwise. Alternatively, you may write to TXT files by using the .txt extension. 

Saving in current working directory

If you do not specify a location with a full path, Pandas saves the file in your current working directory (CWD):

df.to_csv(path_or_buf = "players.csv")

This saves players.csv in your CWD. Note that you can omit “path_or_buf =” because it is the first parameter of to_csv().
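
If you are unsure where your CWD is, you can ask Python. This is a small sketch using only the standard library; the name target is just an illustration of the path that a relative filename would resolve to:

```python
import os
from pathlib import Path

# Both calls report the directory that a relative filename is resolved against.
print(os.getcwd())
print(Path.cwd())

# The absolute location that to_csv("players.csv") would write to.
target = Path.cwd() / "players.csv"
print(target)
```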

Saving in a specified location

The CWD can vary and depends on your system and your Python installation. Therefore, you may define a specific location by using the full file path. To save players.csv on a Windows desktop, prepend the path C:\Users\alex\desktop\ to players.csv.

The full filename on Windows is: C:\Users\alex\desktop\players.csv

The full filename on macOS and Linux is: /Users/alex/desktop/players.csv

Please note that Windows uses the backslash (“\”) instead of the slash (“/”) as its path separator. Since the backslash is the escape character in Python strings (here, \U is read as the start of a Unicode escape), the following code raises an error:

df.to_csv("C:\Users\alex\desktop\players.csv") 

There are two ways to fix this issue: use forward slashes, or prefix the string with r to make it a raw string:

df.to_csv("C:/Users/alex/desktop/players.csv")
df.to_csv(r"C:\Users\alex\desktop\players.csv")

On macOS and Linux, paths already use forward slashes, so no special handling is needed:

df.to_csv("/Users/alex/desktop/players.csv")
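
Another option that avoids the slash question altogether is to build the path with pathlib from the standard library, since to_csv() also accepts path objects. The sketch below is an illustration only: a temporary directory stands in for a real folder such as the desktop, and the small two-row DataFrame is a shortened version of the example data.

```python
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({"Name": ["Lionel Messi", "Manuel Neuer"],
                   "Height_m": [1.70, 1.93]})

# A temporary directory stands in for a real folder such as the desktop.
with tempfile.TemporaryDirectory() as tmp_dir:
    # Path joins the parts with the correct separator for the current system.
    target = Path(tmp_dir) / "players.csv"
    df.to_csv(target, index=False)
    written = target.read_text()

print(written)
```

The same code runs unchanged on Windows, macOS, and Linux because the separator is never typed by hand.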

  2. Exporting the index

The to_csv() method by default exports the index. You can drop the index by adding index = False.

df.to_csv("players.csv", index = False)

Let’s have a look inside the CSV file:

Name,Country,Height_m
Lionel Messi,Argentina,1.7
Cristiano Ronaldo,Portugal,1.87
Neymar Junior,Brazil,1.75
Kylian Mbappe,France,1.78
Manuel Neuer,Germany,1.93

A simple rule: If your DataFrame has a default RangeIndex, don’t export the index, as it doesn’t contain any valuable information. If you export it anyway and reimport the dataset with pd.read_csv(), the old index comes back as an additional column (named Unnamed: 0) next to a new RangeIndex, so it is effectively listed twice.
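
The effect is easy to reproduce. The following sketch uses a shortened version of the example data and io.StringIO as an in-memory stand-in for a file; the variable names are illustrative:

```python
import io

import pandas as pd

df = pd.DataFrame({"Name": ["Lionel Messi", "Manuel Neuer"],
                   "Country": ["Argentina", "Germany"]})

with_index = df.to_csv()                 # RangeIndex exported
without_index = df.to_csv(index=False)   # RangeIndex dropped

# The exported index comes back as an extra column.
cols_with = pd.read_csv(io.StringIO(with_index)).columns.tolist()
# Without the index, the columns match the original DataFrame.
cols_without = pd.read_csv(io.StringIO(without_index)).columns.tolist()
# index_col=0 tells read_csv() to use the first column as the index instead.
cols_fixed = pd.read_csv(io.StringIO(with_index), index_col=0).columns.tolist()

print(cols_with)     # ['Unnamed: 0', 'Name', 'Country']
print(cols_without)  # ['Name', 'Country']
print(cols_fixed)    # ['Name', 'Country']
```

If you receive a CSV file that already contains an exported RangeIndex, index_col=0 is a convenient way to get rid of the duplicate on import.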

When should you export the index? In cases where you have important information in the index. The following DataFrame stocks contains stock prices for Microsoft (MSFT) and Apple (AAPL):

This DataFrame has an index with datetime information, which is a DatetimeIndex. In this example, you shouldn’t drop the index. 
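
The sketch below shows one way such a DataFrame could be built and round-tripped. It is a hypothetical reconstruction: the prices are copied from the stocks.csv listing in this section, and io.StringIO again stands in for a file.

```python
import io

import pandas as pd

# Hypothetical reconstruction of stocks from the prices listed in stocks.csv.
stocks = pd.DataFrame(
    {"AAPL": [293.16, 297.56, 300.63, 303.74, 310.13],
     "MSFT": [178.84, 180.76, 182.54, 183.60, 184.68]},
    index=pd.DatetimeIndex(
        ["2020-05-04", "2020-05-05", "2020-05-06", "2020-05-07", "2020-05-08"],
        name="Date"),
)

# The index is written as the first column, labeled "Date".
csv_text = stocks.to_csv()

# index_col and parse_dates restore the DatetimeIndex when reading back.
restored = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)
print(type(restored.index).__name__)
```

Without index_col and parse_dates, the dates would come back as an ordinary text column.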

stocks.to_csv("stocks.csv")

The CSV file stocks.csv still contains the datetime information:

Date,AAPL,MSFT
2020-05-04,293.16,178.84
2020-05-05,297.56,180.76
2020-05-06,300.63,182.54
2020-05-07,303.74,183.6
2020-05-08,310.13,184.68

  3. Selecting columns

If not specified, to_csv() writes all columns of a DataFrame to CSV. You may select one or many columns and omit all other columns. 

Create a list (my_list) with those columns that you wish to export (e.g. Name and Country).

my_list = ["Name", "Country"]

Pass my_list to the columns parameter:

df.to_csv(..., columns = my_list)
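
Here is a complete, runnable version of this step. It is a sketch that uses a shortened version of the example data and returns the CSV text as a string instead of writing a file:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Lionel Messi", "Manuel Neuer"],
                   "Country": ["Argentina", "Germany"],
                   "Height_m": [1.70, 1.93]})

# Only the listed columns are written; Height_m is omitted.
my_list = ["Name", "Country"]
csv_text = df.to_csv(index=False, columns=my_list)

print(csv_text)
```

The columns are written in the order given in my_list, so the same parameter can also be used to reorder columns on export.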

  4. Exporting column headers

The to_csv() method by default writes column headers (e.g. Country) to CSV. You may drop these column labels by adding header = False.

df.to_csv(..., header = False)
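
Keep in mind that a file without headers needs extra care on reimport. The sketch below, again with shortened example data and an in-memory file, supplies the column names by hand through the header and names parameters of pd.read_csv():

```python
import io

import pandas as pd

df = pd.DataFrame({"Name": ["Lionel Messi", "Manuel Neuer"],
                   "Country": ["Argentina", "Germany"],
                   "Height_m": [1.70, 1.93]})

# No header line: the file starts directly with the first record.
csv_text = df.to_csv(index=False, header=False)
print(csv_text.splitlines()[0])

# header=None prevents read_csv() from treating the first record as labels;
# names provides the column labels instead.
restored = pd.read_csv(io.StringIO(csv_text), header=None,
                       names=["Name", "Country", "Height_m"])
print(restored.columns.tolist())
```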

  5. Be careful with all other options

The to_csv() method has more than a dozen additional parameters to further customize the export (the exact number depends on your Pandas version). It’s usually best to keep their default settings.

In rare cases, alternative settings may be appropriate. Let’s consider two more options:  

Changing the delimiter (not recommended)

In a CSV file, values are separated by a comma. You may change the delimiter and use a semicolon (“;”) instead. Pass the desired delimiter in quotes to sep =.

df.to_csv(..., sep = ";")
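
One reason for the caution: whoever reads the file must know about the changed delimiter. The following sketch, using shortened example data and an in-memory file, shows what happens with and without the matching sep on import:

```python
import io

import pandas as pd

df = pd.DataFrame({"Name": ["Lionel Messi", "Manuel Neuer"],
                   "Country": ["Argentina", "Germany"]})

csv_text = df.to_csv(index=False, sep=";")
print(csv_text.splitlines()[0])  # Name;Country

# Reading with the default comma delimiter yields a single merged column.
wrong = pd.read_csv(io.StringIO(csv_text))
print(wrong.columns.tolist())    # ['Name;Country']

# Passing the same delimiter restores the two columns.
right = pd.read_csv(io.StringIO(csv_text), sep=";")
print(right.columns.tolist())    # ['Name', 'Country']
```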

Defining an alternative representation for missing data (not recommended)

When writing DataFrames to CSV, missing data is represented by an empty string (“”) by default. You may define an alternative representation (e.g. “None”) by passing it to the na_rep parameter:

df.to_csv(..., na_rep = "None")
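
To compare both representations, here is a short sketch with one missing height value. The data is a made-up variation of the example DataFrame, and the CSV text is returned as a string rather than written to a file:

```python
import pandas as pd

# The second height is missing and is stored as NaN.
df = pd.DataFrame({"Name": ["Lionel Messi", "Manuel Neuer"],
                   "Height_m": [1.70, None]})

default_text = df.to_csv(index=False)
custom_text = df.to_csv(index=False, na_rep="None")

print(default_text.splitlines()[2])  # Manuel Neuer,
print(custom_text.splitlines()[2])   # Manuel Neuer,None
```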

Data scientists frequently write Pandas DataFrames to CSV. The to_csv() method provides many options to customize the export. If you want to save your data until the next coding session, do the following:

df.to_csv("file_name.csv", index = False) # if df contains a RangeIndex
df.to_csv("file_name.csv")           # if the index contains important information

This allows you to reimport the data into Pandas with simple code:

pd.read_csv("file_name.csv", ...)

In all other cases, you can customize the export to your needs. 
Now that you have the skills to perform this important Pandas task, you can learn more about Pandas in its documentation or by starting a Pandas bootcamp.

Page Last Updated: August 2020
