Introduction to `scipy.stats`

The `scipy.stats` module is SciPy's toolkit for statistical analysis.
It provides probability distributions, statistical tests, and summary statistics.

In this lesson, we’ll:

- Explore summary statistics
- Perform a basic statistical test
- Work with probability distributions

Setting Up

Import the required modules:

# Import NumPy and SciPy Stats
import numpy as np
from scipy import stats

Example 1: Summary Statistics

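# Summary statistics
data = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11]

mean = np.mean(data)
median = np.median(data)
# keepdims=True (SciPy 1.9+) returns arrays, so the results are indexed with [0].
# When several values tie for most frequent (here 2 and 7), stats.mode returns the smallest.
mode = stats.mode(data, keepdims=True)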

print("Mean:", mean)
print("Median:", median)
print("Mode:", mode.mode[0], "Frequency:", mode.count[0])
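As a complementary sketch, `stats.describe` returns several of these summary statistics in one call. The example below reuses the same `data` list; the variable name `summary` is only illustrative. Note that the variance it reports is the sample variance (it divides by n - 1 by default).

```python
from scipy import stats

data = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11]

# describe() bundles count, min/max, mean, variance, skewness, and kurtosis
summary = stats.describe(data)

print("Count:", summary.nobs)
print("Min and max:", summary.minmax)
print("Mean:", summary.mean)
print("Sample variance:", summary.variance)  # uses ddof=1 by default
```

This can be handy for a quick first look at a dataset before choosing a test.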

Example 2: Hypothesis Testing

# One-sample t-test
# Test whether the mean of data differs significantly from 5 (two-sided by default)
t_stat, p_value = stats.ttest_1samp(data, 5)

print("t-statistic:", t_stat)
print("p-value:", p_value)

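If the p-value is less than 0.05, we reject the null hypothesis that the population mean equals 5 at the 5% significance level. Otherwise, we fail to reject it, which does not prove that the mean is 5.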
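The decision rule above can be sketched in code as follows. The threshold name `alpha` and the flag `reject_null` are illustrative choices, and 0.05 is simply the significance level used in this lesson, not a universal constant.

```python
from scipy import stats

data = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11]
alpha = 0.05  # significance level assumed in this lesson

# Two-sided one-sample t-test against a hypothesized mean of 5
result = stats.ttest_1samp(data, popmean=5)

reject_null = result.pvalue < alpha
if reject_null:
    print("Reject the null hypothesis: the mean differs from 5.")
else:
    print("Fail to reject the null hypothesis.")
```

For this sample the p-value is above 0.05, so the test does not give enough evidence to say the mean differs from 5.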

Example 3: Probability Distributions

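# Normal distribution PDF
# Evaluate the standard normal PDF (mean 0, standard deviation 1)
# at 100 evenly spaced points between -3 and 3
x = np.linspace(-3, 3, 100)
pdf = stats.norm.pdf(x, loc=0, scale=1)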

print("First 5 PDF values:", pdf[:5])

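A probability density function (PDF) describes the relative likelihood of the values of a continuous random variable. A PDF value is a density, not a probability: the probability that the variable falls in an interval is the area under the PDF over that interval.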
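To turn the density into an actual probability, one option is the cumulative distribution function, `stats.norm.cdf`. The sketch below assumes the same standard normal distribution as above; the variable names are illustrative.

```python
from scipy import stats

# Density at the mean of the standard normal distribution (not a probability)
peak_density = stats.norm.pdf(0)

# Probability of landing within one standard deviation of the mean,
# computed as the difference of two CDF values
prob_within_1sd = stats.norm.cdf(1) - stats.norm.cdf(-1)

print("Density at 0:", peak_density)
print("P(-1 <= X <= 1):", prob_within_1sd)
```

The second value is about 0.68, the familiar "68%" figure for one standard deviation around the mean of a normal distribution.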

Key Takeaways

`scipy.stats` is your go-to module for statistical work in Python.
It provides:
- Summary statistics
- Hypothesis tests
- Probability distributions

In the next lessons, we’ll dig deeper into descriptive and inferential statistics with SciPy.

