Distribution Plots (histplot, kdeplot)
Visualizing data distributions helps you see how values are spread, identify trends, and detect outliers.
Seaborn provides two primary functions for this:
- histplot(): Displays the frequency distribution of a variable.
- kdeplot(): Plots a smooth curve representing the estimated probability density.
Using histplot()
The histplot() function creates a histogram: a plot that shows how many data points fall within each numeric range, or bin.
Basic Histogram
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
sns.histplot(data=tips, x="total_bill")
plt.title("Distribution of Total Bills")
plt.show()
Key points:
- x defines the variable to visualize.
- The X-axis is divided into bins (numeric intervals).
- The height of each bar indicates the count of observations in that bin.
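To make the idea of bins concrete, the sketch below counts values per bin with NumPy's np.histogram. It illustrates the binning step only and is not seaborn's internal code. It also uses synthetic numbers (the hypothetical array bills) in place of the tips dataset so that it runs without a download.

```python
import numpy as np

# Hypothetical stand-in for tips["total_bill"]; the values are synthetic.
rng = np.random.default_rng(seed=0)
bills = rng.gamma(shape=4.0, scale=5.0, size=200)

# Split the observed range into 5 equal-width bins and count the values in each.
counts, edges = np.histogram(bills, bins=5)

# Each pair of neighboring edges is one bin; its count is the bar height.
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:6.2f} to {right:6.2f}: {count} observations")

# Every observation lands in exactly one bin.
print("total:", counts.sum())
```

histplot() also accepts a bins argument, so sns.histplot(data=tips, x="total_bill", bins=5) requests the same number of bins for the real data.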
Using kdeplot()
The kdeplot() function draws a smooth curve that estimates the probability density of the dataset.
Basic KDE Plot
sns.kdeplot(data=tips, x="total_bill")
plt.title("KDE of Total Bills")
plt.show()
Key points:
- KDE stands for Kernel Density Estimate, a technique that generates a smooth curve from the data.
- Best suited to continuous variables, where it shows the overall shape of the distribution.
- Often combined with histplot() to visualize both frequency and density together.
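The idea behind a kernel density estimate can be sketched in a few lines of NumPy: place a small Gaussian bump on every data point and average the bumps. This is a simplified illustration, not seaborn's implementation. The function gaussian_kde_curve, the synthetic values array, and the Scott's-rule bandwidth are all assumptions of the sketch.

```python
import numpy as np

# Hypothetical stand-in for a numeric column such as total_bill.
rng = np.random.default_rng(seed=0)
values = rng.gamma(shape=4.0, scale=5.0, size=200)


def gaussian_kde_curve(samples, grid, bandwidth):
    """Average one Gaussian bump per sample, evaluated at each grid point."""
    # Shape (len(grid), len(samples)): scaled distance from grid point to sample.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernels.mean(axis=1) / bandwidth


# Scott's rule of thumb for the bandwidth (an assumption of this sketch).
bandwidth = values.std(ddof=1) * len(values) ** (-1 / 5)

# Evaluate a little beyond the data so the tails of the curve are included.
grid = np.linspace(values.min() - 3 * bandwidth, values.max() + 3 * bandwidth, 400)
density = gaussian_kde_curve(values, grid, bandwidth)

# Trapezoid rule: the area under a density curve should be close to 1.
area = np.sum((density[1:] + density[:-1]) / 2 * np.diff(grid))
print(f"bandwidth={bandwidth:.2f}, area under curve={area:.3f}")
```

A larger bandwidth gives a smoother, flatter curve, while a smaller one follows the data more closely. In kdeplot(), the bw_adjust parameter scales the bandwidth in the same way.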
Combining Histogram and KDE
You can display both a histogram and a KDE curve in one plot by setting kde=True inside histplot():
Histogram with KDE Overlay
sns.histplot(data=tips, x="total_bill", kde=True)
plt.title("Total Bill Distribution with KDE")
plt.show()