Descriptive Statistics and Value Counts
Once you've cleaned and prepared your DataFrame, the next step is understanding the distribution and summary of your data.
Pandas provides concise tools for generating statistical overviews, which help you spot patterns, errors, and insights at a glance.
Descriptive Methods
Use .describe() to get a quick statistical summary of all numeric columns:
- Count of non-null values
- Mean, standard deviation
- Min and max values
- 25th, 50th (median), and 75th percentiles
This method is your go-to for initial data profiling.
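As a minimal sketch of the summary described above, the example below builds a small DataFrame and calls .describe() on it. The data and the column names (age, salary, department) are made up for illustration and are not part of any real dataset.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative assumptions.
df = pd.DataFrame({
    "age": [25, 32, 47, None, 51],            # one missing value
    "salary": [40000, 52000, 61000, 58000, 75000],
    "department": ["HR", "IT", "IT", "Sales", "HR"],
})

# By default, describe() summarizes only the numeric columns,
# so "department" does not appear in the result.
summary = df.describe()
print(summary)

# Individual statistics can be looked up by row label and column name.
age_count = summary.loc["count", "age"]       # non-null ages only
salary_median = summary.loc["50%", "salary"]  # the 50th percentile
print(age_count, salary_median)
```

Note that the count for age is lower than the number of rows because missing values are excluded from the summary.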
Categorical Analysis with value_counts()
To summarize non-numeric (categorical) columns, use .value_counts().
It returns the frequency of each unique value in a column, sorted from most to least frequent, which is useful for:
- Understanding dominant categories
- Checking for typos or unexpected rare values
- Preparing for visualizations
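The following sketch shows how .value_counts() might be used on a categorical column. The city values are invented for illustration, and they deliberately include a lowercase "paris" to show how a typo surfaces as its own category.

```python
import pandas as pd

# Hypothetical categorical data with a deliberate typo ("paris")
# and one missing value.
cities = pd.Series(
    ["Paris", "London", "Paris", "paris", "Berlin", "Paris", None],
    name="city",
)

# Frequency of each unique value, most frequent first.
# Missing values are excluded by default.
counts = cities.value_counts()
print(counts)

# normalize=True returns proportions instead of raw counts.
shares = cities.value_counts(normalize=True)
print(shares)

# dropna=False also counts the missing entries.
with_missing = cities.value_counts(dropna=False)
print(with_missing)
```

Because "paris" shows up as a separate entry with a count of 1, the output makes this kind of inconsistency easy to spot before plotting or grouping.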
Common Additional Methods
| Method | Purpose |
|---|---|
| mean() | Average value |
| median() | Middle value |
| std() | Standard deviation |
| min() / max() | Smallest / largest value |
| sum() | Total of all values in the column |
| count() | Number of non-null entries |
You can apply them to a single column, which returns one value, or to the entire DataFrame, which returns one value per column.
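Here is a short sketch of both usages, assuming a small made-up DataFrame with price, quantity, and product columns. The numeric_only=True argument is used for the DataFrame-wide mean because recent pandas versions raise an error when asked to average a text column.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative assumptions.
df = pd.DataFrame({
    "price": [10.0, 12.5, None, 20.0],   # one missing value
    "quantity": [3, 1, 4, 2],
    "product": ["a", "b", "c", "d"],
})

# Single column: each method returns one value.
# Missing values are skipped by default.
price_mean = df["price"].mean()
price_median = df["price"].median()
quantity_total = df["quantity"].sum()

# Entire DataFrame: each method returns one value per column.
# numeric_only=True restricts the calculation to numeric columns.
col_means = df.mean(numeric_only=True)
non_null = df.count()

print(price_mean, price_median, quantity_total)
print(col_means)
print(non_null)
```

The count() result is a quick way to see which columns contain missing values, since those columns report fewer entries than the number of rows.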
What’s Next?
Now let’s try these summary methods in practice using a sample dataset.