Descriptive Statistics and Value Counts
Once you've cleaned and prepared your DataFrame, the next step is to understand how its values are distributed.
Pandas provides simple and powerful tools to generate statistical overviews, which help you spot patterns, errors, or insights at a glance.
Descriptive Methods
Use .describe() to get a quick statistical summary of all numeric columns:
- Count of non-null values
- Mean and standard deviation
- Minimum and maximum values
- 25th, 50th (median), and 75th percentiles
This method is your go-to for initial data profiling.
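As a minimal sketch of the summary described above, the following uses a small made-up DataFrame (the column names and values are assumptions for illustration, not data from this lesson). It shows that the non-numeric column is skipped by default and that missing values are left out of the count.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0, 40.0],
    "quantity": [1, 2, 3, None],    # one missing value
    "label": ["a", "b", "c", "d"],  # non-numeric, skipped by default
})

# Rows of the result: count, mean, std, min, 25%, 50%, 75%, max
summary = df.describe()
print(summary)

# Individual statistics can be read from the result by row and column label
price_mean = summary.loc["mean", "price"]          # 25.0
quantity_count = summary.loc["count", "quantity"]  # 3.0, the missing value is not counted
```

Because the result is itself a DataFrame, you can select, round, or export it like any other table.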
Categorical Analysis with value_counts()
To summarize non-numeric (categorical) columns, use .value_counts().
It returns the frequency of each unique value in a column, sorted from most to least frequent. Missing values are excluded by default.
value_counts() example
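import pandas as pd

df = pd.DataFrame({
    "Category": ["A", "A", "B", "B", "C", "C"]
})

# Count how often each unique value appears in the column
df["Category"].value_counts()
# Output (all three counts are tied, so the row order may differ between pandas versions):
# A    2
# B    2
# C    2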
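Two optional parameters of value_counts() are often useful alongside the basic call: normalize=True returns proportions instead of raw counts, and dropna=False includes missing values in the tally. The sketch below uses a made-up Series (the values are assumptions for illustration only).

```python
import pandas as pd

# Hypothetical sample data with unequal counts and one missing value
colors = pd.Series(["red", "blue", "red", None, "red", "blue", "green"])

counts = colors.value_counts()                    # missing value excluded by default
shares = colors.value_counts(normalize=True)      # proportions of the non-missing values
with_missing = colors.value_counts(dropna=False)  # also counts the missing value

print(counts)        # red 3, blue 2, green 1
print(shares)        # red 0.5, blue about 0.33, green about 0.17
print(with_missing)  # same counts plus one entry for the missing value
```

Proportions are handy when you want to compare the makeup of columns that have different numbers of rows.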
Common Additional Methods
| Method | Purpose |
|---|---|
| mean() | Average value |
| median() | Middle value |
| std() | Standard deviation |
| min() / max() | Minimum and maximum values |
| sum() | Total sum of column |
| count() | Number of non-null entries |
You can call these methods on a single column, which returns one value, or on the entire DataFrame, which returns one result per column.
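The difference between the two usages can be sketched with a small made-up DataFrame (the column names and scores are assumptions for illustration). The last line also shows axis=1, which aggregates across columns to give one value per row.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "math": [80, 90, 70],
    "science": [60, 75, 90],
})

math_mean = df["math"].mean()  # single column -> one number (80.0)
col_means = df.mean()          # whole DataFrame -> one value per column
row_sums = df.sum(axis=1)      # axis=1 -> one value per row (140, 165, 160)

print(math_mean)
print(col_means)
print(row_sums)
```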