The pandas concat() function concatenates pandas objects such as DataFrame and Series. Its parameters control how the concatenation is performed.
The concat() function syntax (as of pandas 1.x, where the older join_axes parameter has been removed and sort defaults to False) is:
concat(objs, axis=0, join='outer', ignore_index=False,
keys=None, levels=None, names=None, verify_integrity=False,
sort=False, copy=True)
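To make the join parameter concrete, here is a minimal sketch. The frames left and right are hypothetical and are not part of the examples in this tutorial. It shows that join='outer' keeps the union of the columns, while join='inner' keeps only the columns shared by every input.

```python
import pandas

# Hypothetical inputs: the two frames share only the "Name" column
left = pandas.DataFrame({"Name": ["Pankaj", "Lisa"], "ID": [1, 2]})
right = pandas.DataFrame({"Name": ["David"], "Role": ["Admin"]})

# join='outer' (the default) keeps every column and fills the gaps with NaN
outer = pandas.concat([left, right], join="outer", sort=False)

# join='inner' keeps only the columns present in all inputs
inner = pandas.concat([left, right], join="inner")

print(outer)
print(inner)
```

With the outer join, the missing ID and Role values are filled with NaN, so the ID column is no longer an integer column.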
Let’s look at a simple example to concatenate two DataFrame objects.
import pandas

d1 = {"Name": ["Pankaj", "Lisa"], "ID": [1, 2]}
d2 = {"Name": "David", "ID": 3}
# Pass the index as a list: a set is unordered, so the label order would not be guaranteed
df1 = pandas.DataFrame(d1, index=[1, 2])
df2 = pandas.DataFrame(d2, index=[3])
print('********\n', df1)
print('********\n', df2)
# Stack df2 below df1 (axis=0 is the default)
df3 = pandas.concat([df1, df2])
print('********\n', df3)
Output:
     Name  ID
1  Pankaj   1
2    Lisa   2

    Name  ID
3  David   3

     Name  ID
1  Pankaj   1
2    Lisa   2
3   David   3
Notice that the concatenation is performed row-wise, i.e. along axis 0, which is the default. The index labels of the source DataFrame objects are preserved in the output.
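Because the source index labels are preserved, the result can contain duplicate labels when the inputs overlap. The following sketch uses a hypothetical df2 whose label repeats one from df1. It shows how verify_integrity=True turns that situation into an error instead of silently keeping the duplicates.

```python
import pandas

df1 = pandas.DataFrame({"Name": ["Pankaj", "Lisa"], "ID": [1, 2]}, index=[1, 2])
# Hypothetical input: index label 2 already exists in df1
df2 = pandas.DataFrame({"Name": ["David"], "ID": [3]}, index=[2])

# By default the duplicate label is kept as-is
combined = pandas.concat([df1, df2])
print(combined.index.tolist())

# With verify_integrity=True, pandas raises ValueError on overlapping labels
try:
    pandas.concat([df1, df2], verify_integrity=True)
    raised = False
except ValueError as exc:
    raised = True
    print("concat refused the overlapping index:", exc)
```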
To concatenate along columns instead, pass axis=1:
d1 = {"Name": ["Pankaj", "Lisa"], "ID": [1, 2]}
d2 = {"Role": ["Admin", "Editor"]}
# Both frames use the same index labels, so their rows line up
df1 = pandas.DataFrame(d1, index=[1, 2])
df2 = pandas.DataFrame(d2, index=[1, 2])
df3 = pandas.concat([df1, df2], axis=1)
print('********\n', df3)
Output:
     Name  ID    Role
1  Pankaj   1   Admin
2    Lisa   2  Editor
Concatenating along columns makes sense when the source objects hold different attributes of the same records and are aligned on the same index labels.
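Column-wise concatenation aligns rows by index label, not by position. The sketch below uses hypothetical frames named people and roles whose indexes only partly overlap. It illustrates how unmatched labels produce NaN with the default outer join and are dropped with join='inner'.

```python
import pandas

people = pandas.DataFrame({"Name": ["Pankaj", "Lisa"], "ID": [1, 2]}, index=[1, 2])
# Hypothetical input: only label 2 is shared with `people`
roles = pandas.DataFrame({"Role": ["Editor", "Admin"]}, index=[2, 3])

# Default outer join: every label is kept, missing cells become NaN
outer = pandas.concat([people, roles], axis=1)

# Inner join: only labels present in both frames survive
inner = pandas.concat([people, roles], axis=1, join="inner")

print(outer)
print(inner)
```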
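The keys parameter labels each source object in the result. The keys become the outer level of a hierarchical index:
d1 = {"Name": ["Pankaj", "Lisa"], "ID": [1, 2]}
d2 = {"Name": "David", "ID": 3}
df1 = pandas.DataFrame(d1, index=[1, 2])
df2 = pandas.DataFrame(d2, index=[3])
# Rows from df1 are tagged "DF1", rows from df2 are tagged "DF2"
df3 = pandas.concat([df1, df2], keys=["DF1", "DF2"])
print('********\n', df3)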
Output:
         Name  ID
DF1 1  Pankaj   1
    2    Lisa   2
DF2 3   David   3
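The blank cell under DF1 in the printed output is only a display convention: pandas does not repeat an outer index label on consecutive rows, but every row still carries its key. A minimal sketch of selecting rows by key follows. The level names "source" and "row" are illustrative choices and do not appear in the original example.

```python
import pandas

df1 = pandas.DataFrame({"Name": ["Pankaj", "Lisa"], "ID": [1, 2]}, index=[1, 2])
df2 = pandas.DataFrame({"Name": ["David"], "ID": [3]}, index=[3])

# names= labels the two index levels (the names here are hypothetical)
df3 = pandas.concat([df1, df2], keys=["DF1", "DF2"], names=["source", "row"])

# Each row has a full (key, original label) pair in the index
print(df3.index.tolist())

# Selecting on the outer level returns every row that came from df1
from_df1 = df3.loc["DF1"]
print(from_df1)
```

Selecting with df3.loc["DF1"] returns both the Pankaj and the Lisa rows, so no row is lost because of the blank cell in the display.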
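Set ignore_index=True to discard the source indexes and number the rows of the result from 0:
d1 = {"Name": ["Pankaj", "Lisa"], "ID": [1, 2]}
d2 = {"Name": "David", "ID": 3}
df1 = pandas.DataFrame(d1, index=[10, 20])
df2 = pandas.DataFrame(d2, index=[30])
# The labels 10, 20 and 30 are dropped in the result
df3 = pandas.concat([df1, df2], ignore_index=True)
print('********\n', df3)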
Output:
     Name  ID
0  Pankaj   1
1    Lisa   2
2   David   3
This is useful when the indexes of the source objects carry no meaningful information: we can ignore them and assign the default integer index to the output DataFrame.
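The same option applies to Series objects. As a small sketch with hypothetical Series s1 and s2, ignore_index=True gives the same result as concatenating first and then calling reset_index(drop=True).

```python
import pandas

# Hypothetical Series with arbitrary index labels
s1 = pandas.Series(["a", "b"], index=[10, 20])
s2 = pandas.Series(["c"], index=[30])

# Drop the source labels during concatenation
renumbered = pandas.concat([s1, s2], ignore_index=True)

# Equivalent two-step form: concatenate, then reset the index
two_step = pandas.concat([s1, s2]).reset_index(drop=True)

print(renumbered)
print(renumbered.equals(two_step))
```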
Hi! I want to use a concatenate function on each row across two or more columns of my dataset in pandas. For example, I have a dataset with 3 columns (Name, Age, Country) and 10 rows (people). I want to create a new column that concatenates each person's name, age and country, like David22USA. Thanks for your help.
- Soufiane CAMARA
Hi Pankaj, thanks for the work. One bit I don’t get. In 4. Assigning Keys to the Concatenated DataFrame Indexes, the index in the far left column shows DF1, then blank, then DF2. ******** Name ID DF1 1 Pankaj 1 2 Lisa 2 DF2 3 David 3 I don’t get the value of leaving the second row blank. How would I search on that? If I ran a query to gather all rows from the d1 data using the far left index column as the column to search in, it would miss the second row. I don’t understand why anyone would do this. Thanks.
- Stu David