Hey, folks! In this article, we will be focusing on Seaborn Distplot in detail.
A distplot, or distribution plot, depicts how the values of a variable are distributed. The Seaborn distplot represents the overall distribution of a continuous variable.
The Seaborn module is used together with the Matplotlib module to draw the distplot and its variations. The distplot combines a histogram with a line (a density curve) drawn over it.
The Python Seaborn module contains various functions to plot data and depict its variation. The seaborn.distplot() function is used to plot the distplot. The distplot represents the univariate distribution of data, i.e. the distribution of a single variable plotted against its density. Note that distplot() has been deprecated since seaborn 0.11 in favor of histplot() and displot(), so it emits a warning in recent releases.
Syntax:
seaborn.distplot()
The seaborn.distplot() function accepts the data variable as an argument and returns the Matplotlib Axes object that holds the plot with the density distribution.
Example 1:
import numpy as np
import seaborn as sn
import matplotlib.pyplot as plt
data = np.random.randn(200)
res = sn.distplot(data)
plt.show()
We have used the numpy.random.randn() function to generate 200 random values drawn from the standard normal distribution. Further, the pyplot.show() function is used to show the plot.
Output:
Example 2:
import numpy as np
import seaborn as sn
import matplotlib.pyplot as plt
import pandas as pd
data_set = pd.read_csv("C:/mtcars.csv")
data = pd.DataFrame(data_set['mpg'])
res = sn.distplot(data)
plt.show()
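The pandas.read_csv() function loads the dataset into the Python environment. The 'mpg' column of the dataset is then passed to seaborn.distplot(). The file path shown is specific to the author's machine and needs to point to your own copy of the dataset.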
Output:
The axis of a Seaborn distplot can be labeled by converting the data values into a Pandas Series, using the syntax below:
Syntax:
pandas.Series(data,name='name')
seaborn.distplot()
A Pandas Series has a 'name' parameter that sets the label of the data axis.
Example:
import numpy as np
import seaborn as sn
import matplotlib.pyplot as plt
import pandas as pd
data = np.random.randn(200)
res = pd.Series(data,name="Range")
plot = sn.distplot(res)
plt.show()
Output:
The Seaborn distplot can also be combined with a Kernel Density Estimate (KDE) plot, which estimates the probability density of a continuous variable across its range of values.
Syntax:
seaborn.distplot(data,kde=True)
The kde parameter is set to True to draw the kernel density curve along with the histogram. This is also the default for seaborn.distplot(), so the curve appears even when kde is not passed.
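To make the idea of a kernel density estimate more concrete, here is a minimal NumPy sketch of a Gaussian KDE. It is not how seaborn computes the curve internally. The helper name gaussian_kde_1d and the bandwidth choice (Scott's rule, the sample standard deviation times n^(-1/5)) are assumptions made for illustration.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Hypothetical helper: average of Gaussian bumps centered on each sample."""
    # z[i, j] is the scaled distance from grid point i to sample j
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernels.mean(axis=1) / bandwidth

rng = np.random.default_rng(0)
samples = rng.standard_normal(100)

# Scott's rule for one-dimensional data (an assumed, common default)
bandwidth = samples.std(ddof=1) * len(samples) ** (-1 / 5)

# Evaluate on a grid that extends past the data so the tails are covered
grid = np.linspace(samples.min() - 4 * bandwidth,
                   samples.max() + 4 * bandwidth, 400)
density = gaussian_kde_1d(samples, grid, bandwidth)

# A density should integrate to roughly 1 (Riemann-sum approximation)
step = grid[1] - grid[0]
area = float(density.sum() * step)
```

A smaller bandwidth gives a bumpier curve that follows individual points more closely, while a larger one gives a smoother curve.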
Example:
import numpy as np
import seaborn as sn
import matplotlib.pyplot as plt
import pandas as pd
data = np.random.randn(100)
res = pd.Series(data,name="Range")
plot = sn.distplot(res,kde=True)
plt.show()
Output:
We can combine the Seaborn distplot with a rug plot to show where the individual observations of a univariate variable fall. A rug plot draws a small tick along the axis for every data point, rather than grouping the values into bins.
Syntax:
seaborn.distplot(data, rug=True, hist=False)
The 'rug' parameter needs to be set to True to enable the rug plot. Setting hist=False hides the histogram, leaving the density curve and the rug.
Example:
import numpy as np
import seaborn as sn
import matplotlib.pyplot as plt
import pandas as pd
data = np.random.randn(100)
res = pd.Series(data,name="Range")
plot = sn.distplot(res,rug=True,hist=False)
plt.show()
Output:
The entire distplot can be plotted along the y-axis using the syntax below:
Syntax:
seaborn.distplot(data,vertical=True)
The 'vertical' parameter needs to be set to True to plot the distplot on the y-axis.
Example:
import numpy as np
import seaborn as sn
import matplotlib.pyplot as plt
data = np.random.randn(100)
plot = sn.distplot(data,vertical=True)
plt.show()
Output:
Seaborn has a number of built-in functions to change the look of the plot background. The seaborn.set() function is used to set a different background style for the distribution plots.
Syntax:
seaborn.set(style)
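The style argument takes one of seaborn's preset names. As a hedged sketch, the snippet below lists the five presets and uses seaborn.axes_style() to check whether each one turns the grid on. It also uses set_theme(), the newer name for set() in seaborn 0.11 and later. The dictionary name grid_enabled is illustrative.

```python
import seaborn as sn

# The five built-in style presets
style_names = ["darkgrid", "whitegrid", "dark", "white", "ticks"]

# axes_style(name) returns the Matplotlib settings a preset would apply
grid_enabled = {
    name: bool(sn.axes_style(name)["axes.grid"]) for name in style_names
}

# Apply the 'dark' preset to all plots created afterwards
sn.set_theme(style="dark")
```

The 'dark' preset keeps the shaded background of 'darkgrid' but without grid lines, which is why it works well for dense histograms.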
Example:
import numpy as np
import seaborn as sn
import matplotlib.pyplot as plt
sn.set(style='dark')
data = np.random.randn(500)
plot = sn.distplot(data)
plt.show()
Output:
We can give the distplot a different color to improve the visualization of the data, using the 'color' parameter of the seaborn.distplot() function.
Syntax:
seaborn.distplot(data, color='color')
Example:
import numpy as np
import seaborn as sn
import matplotlib.pyplot as plt
sn.set(style='dark')
data = np.random.randn(500)
plot = sn.distplot(data,color='purple')
plt.show()
Output:
Thus, the Seaborn module, along with the Matplotlib module, helps visualize data and depict its distribution.
I strongly recommend that all readers go through the Python Matplotlib Module tutorial to understand the basics of data visualization.