Pandas melt() function is used to change the DataFrame format from wide to long. It’s used to create a specific format of the DataFrame object where one or more columns work as identifiers. All the remaining columns are treated as values and unpivoted to the row axis and only two columns - variable and value.
The use of melt() function is more clear when looked through an example.
import pandas as pd
d1 = {"Name": ["Pankaj", "Lisa", "David"], "ID": [1, 2, 3], "Role": ["CEO", "Editor", "Author"]}
df = pd.DataFrame(d1)
print(df)
df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"])
print(df_melted)
Output:
Name ID Role
0 Pankaj 1 CEO
1 Lisa 2 Editor
2 David 3 Author
ID variable value
0 1 Name Pankaj
1 2 Name Lisa
2 3 Name David
3 1 Role CEO
4 2 Role Editor
5 3 Role Author
We can pass the ‘var_name’ and ‘value_name’ parameters to change the column names of ‘variable’ and ‘value’.
df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"], var_name="Attribute", value_name="Value")
Recommended Read: Python Pandas Tutorial
Let’s see what happens when we pass multiple columns as the id_vars parameter.
df_melted = pd.melt(df, id_vars=["ID", "Name"], value_vars=["Role"])
print(df_melted)
Output:
ID Name variable value
0 1 Pankaj Role CEO
1 2 Lisa Role Editor
2 3 David Role Author
It’s not required to use all the rows from the source DataFrame. Let’s skip the “ID” column in the next example.
df_melted = pd.melt(df, id_vars=["Name"], value_vars=["Role"])
print(df_melted)
Output:
Name variable value
0 Pankaj Role CEO
1 Lisa Role Editor
2 David Role Author
We can use pivot() function to unmelt a DataFrame object and get the original dataframe. The pivot() function ‘index’ parameter value should be same as the ‘id_vars’ value. The ‘columns’ value should be passed as the name of the ‘variable’ column.
import pandas as pd
d1 = {"Name": ["Pankaj", "Lisa", "David"], "ID": [1, 2, 3], "Role": ["CEO", "Editor", "Author"]}
df = pd.DataFrame(d1)
# print(df)
df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"], var_name="Attribute", value_name="Value")
print(df_melted)
# unmelting using pivot()
df_unmelted = df_melted.pivot(index='ID', columns='Attribute')
print(df_unmelted)
Output:
ID Attribute Value
0 1 Name Pankaj
1 2 Name Lisa
2 3 Name David
3 1 Role CEO
4 2 Role Editor
5 3 Role Author
Value
Attribute Name Role
ID
1 Pankaj CEO
2 Lisa Editor
3 David Author
The unmelted DataFrame values are the same as the original DataFrame. But, the columns and index need some minor changes to make it exactly like the original data frame.
df_unmelted = df_unmelted['Value'].reset_index()
df_unmelted.columns.name = None
print(df_unmelted)
Output:
ID Name Role
0 1 Pankaj CEO
1 2 Lisa Editor
2 3 David Author
Reference: pandas melt() API Doc
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.
While we believe that this content benefits our community, we have not yet thoroughly reviewed it. If you have any suggestions for improvements, please let us know by clicking the “report an issue“ button at the bottom of the tutorial.