Distribution Plots (histplot, kdeplot)

Visualizing data distributions helps you understand how your data is spread, spot patterns, and detect potential outliers.

This lesson covers two Seaborn functions for this:

  • histplot(): Draws a histogram, with bars showing how many observations fall into each bin.
  • kdeplot(): Draws a smooth curve of the estimated probability density.

Using histplot()

The histplot() function creates a histogram that shows how many data points fall into each range (bin).

Basic Histogram
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")
sns.histplot(data=tips, x="total_bill")
plt.title("Distribution of Total Bills")
plt.show()

Key points:

  • x specifies the column to plot.
  • The x-axis is divided into bins (intervals); Seaborn chooses the bin edges automatically unless you set them.
  • By default, the height of each bar shows how many observations fall into that bin.

Using kdeplot()

The kdeplot() function displays a smooth curve that represents the estimated probability density of the data.

Basic KDE Plot
# Reuses sns, plt, and the tips DataFrame from the basic histogram example
sns.kdeplot(data=tips, x="total_bill")
plt.title("KDE of Total Bills")
plt.show()

Key points:

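  • KDE stands for kernel density estimate: each observation is smoothed with a kernel, and the results are summed into one continuous curve.
  • Useful for seeing the overall shape of continuous data, such as where the peaks are and how skewed the values are.
  • The y-axis shows density rather than counts, so the area under the curve is approximately 1.
  • Can be combined with histplot() for better context.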

Combining Histogram and KDE

You can draw both in a single histplot() call by setting kde=True:

Histogram with KDE Overlay
# Reuses sns, plt, and the tips DataFrame from the basic histogram example
sns.histplot(data=tips, x="total_bill", kde=True)
plt.title("Total Bill Distribution with KDE")
plt.show()
