Distribution Plots (histplot, kdeplot)

Visualizing data distributions helps you understand how your data is spread, detect patterns, and identify potential outliers.

Seaborn provides two main tools for this:

  • histplot() – shows the frequency distribution of a dataset.
  • kdeplot() – draws a smooth curve that estimates the probability density function of a dataset.

Using histplot()

The histplot() function creates a histogram that shows how many data points fall into each range (bin).

Basic Histogram
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")
sns.histplot(data=tips, x="total_bill")
plt.title("Distribution of Total Bills")
plt.show()

Key points:

  • x specifies the variable to plot.
  • The range of x is divided into bins (intervals) along the x-axis. Seaborn picks the bins automatically unless you set bins or binwidth.
  • The height of each bar shows how many observations fall into that bin.
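To make the idea of bins concrete, here is a minimal sketch of the counting that sits behind a histogram. It is plain Python, not Seaborn's implementation: the sample values, the helper name count_per_bin, and the fixed bin width of 10 are all assumptions chosen for illustration.

```python
# Hypothetical sample values standing in for a numeric column such as total_bill.
values = [3.1, 7.4, 8.0, 12.5, 14.9, 15.0, 18.2, 21.7, 24.3, 29.9]


def count_per_bin(values, start, width, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    right_edge = start + width * n_bins
    for v in values:
        if v == right_edge:
            # A value exactly on the right edge is counted in the last bin.
            index = n_bins - 1
        else:
            index = int((v - start) // width)
        # Values outside the covered range are ignored.
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Three bins covering 0-10, 10-20, and 20-30.
counts = count_per_bin(values, start=0, width=10, n_bins=3)
print(counts)
```

Each number in counts corresponds to the height of one bar. Changing width or n_bins changes how the same data is grouped, which is why the choice of bins affects how a histogram looks.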

Using kdeplot()

The kdeplot() function displays a smooth curve representing the estimated probability density of the data.

Basic KDE Plot
# Reuses sns, plt, and tips from the Basic Histogram example
sns.kdeplot(data=tips, x="total_bill")
plt.title("KDE of Total Bills")
plt.show()

Key points:

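  • KDE stands for kernel density estimate: a smooth, continuous curve that plays the same role as a histogram.
  • Good for showing the overall shape of continuous data, such as where values cluster and how far the tails extend.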
  • Can be combined with histplot() for more context.
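The smoothing behind a KDE can be sketched in a few lines: place a small Gaussian bump on every data point and average the bumps. The code below is an illustrative sketch in plain Python, not Seaborn's code. The sample values, the function name gaussian_kde, and the hand-picked bandwidth of 2.0 are assumptions; kdeplot() chooses a bandwidth for you.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates density as an average of Gaussian bumps."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # Each sample contributes a bump centered on itself.
        bumps = (math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        return sum(bumps) / norm

    return density


# Hypothetical values with a cluster near 12 and another near 20.
samples = [10.0, 12.0, 13.0, 20.0, 21.0]
density = gaussian_kde(samples, bandwidth=2.0)

# Approximate the area under the curve on the range -10 to 40 with a Riemann sum.
step = 0.1
area = sum(density(-10 + i * step) * step for i in range(500))
print(round(area, 3))
print(density(12.0) > density(16.0))
```

The curve is higher near the clusters of points than in the gap between them, and the area under it is approximately 1, which is what makes it a density rather than a count. A larger bandwidth gives a smoother curve, and a smaller one follows the individual points more closely.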

Combining Histogram and KDE

You can combine both in a single histplot() by setting kde=True:

Histogram with KDE Overlay
# Reuses sns, plt, and tips from the Basic Histogram example
sns.histplot(data=tips, x="total_bill", kde=True)
plt.title("Total Bill Distribution with KDE")
plt.show()

In the next Jupyter Notebook, you will experiment with:

  • Changing bin sizes in histograms.
  • Adding hue categories to compare groups.
  • Styling KDE plots for clarity.
