Introduction to scipy.stats
The scipy.stats module is one of the most powerful parts of SciPy.
It provides tools for statistical analysis, including probability distributions, statistical tests, and summary statistics.
In this lesson, we’ll:
- Explore summary statistics
- Perform a basic statistical test
- Work with probability distributions
Setting Up
Importing the required modules:
Import NumPy and SciPy Stats
import numpy as np
from scipy import stats
Example 1: Summary Statistics
Summary Statistics
data = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11]
mean = np.mean(data)
median = np.median(data)
# 2 and 7 both appear twice; stats.mode returns only the smallest of tied values
mode = stats.mode(data, keepdims=True)
print("Mean:", mean)
print("Median:", median)
print("Mode:", mode.mode[0], "Frequency:", mode.count[0])
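As a complementary sketch, `stats.describe` returns several summary values in one call, reusing the same `data` list. It reports the sample variance (`ddof=1`) and does not include the median or mode, so the last lines use `np.unique` to list every tied mode. The variable names `summary` and `all_modes` are illustrative.

```python
import numpy as np
from scipy import stats

data = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11]

# describe() bundles count, min/max, mean, variance, skewness and kurtosis
summary = stats.describe(data)

print("Count:", summary.nobs)
print("Min/Max:", summary.minmax)
print("Mean:", summary.mean)
print("Sample variance:", summary.variance)  # uses ddof=1 by default

# stats.mode reports only the smallest value when several values tie,
# so list every most-frequent value explicitly with np.unique
values, counts = np.unique(data, return_counts=True)
all_modes = values[counts == counts.max()]
print("All modes:", all_modes)
```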
Example 2: Hypothesis Testing
One-Sample t-Test
# Test if the mean of data is significantly different from 5
t_stat, p_value = stats.ttest_1samp(data, 5)
print("t-statistic:", t_stat)
print("p-value:", p_value)
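If the p-value is less than 0.05 (the conventional 5% significance level), we reject the null hypothesis that the population mean equals 5. Otherwise, we fail to reject it, which is not the same as proving the mean is 5.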
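The decision rule can be sketched in code. The name `alpha` and its value of 0.05 are illustrative choices here, not something SciPy sets for you.

```python
from scipy import stats

data = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11]
alpha = 0.05  # assumed significance level; choose it before looking at the data

# Two-sided test of H0: the population mean equals 5
t_stat, p_value = stats.ttest_1samp(data, 5)

reject_null = p_value < alpha
if reject_null:
    print("Reject H0: the mean differs significantly from 5")
else:
    print("Fail to reject H0: no significant difference from 5 detected")
```

For this small sample, the test does not reject the null hypothesis at the 5% level, even though the sample mean (7.2) is above 5.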
Example 3: Probability Distributions
Normal Distribution PDF
x = np.linspace(-3, 3, 100)
pdf = stats.norm.pdf(x, loc=0, scale=1)
print("First 5 PDF values:", pdf[:5])
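A probability density function (PDF) describes the relative likelihood of each value of a continuous random variable. A density value is not itself a probability; the probability of landing in an interval is the area under the curve over that interval.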
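To turn a density into an actual probability, one option is the cumulative distribution function (`cdf`). The sketch below is a minimal example for the standard normal distribution; the interval [-1, 1] is an arbitrary illustrative choice.

```python
from scipy import stats

# Density at the mean of a standard normal distribution (a height, not a probability)
density_at_zero = stats.norm.pdf(0, loc=0, scale=1)

# Probability of falling within one standard deviation of the mean:
# P(-1 <= X <= 1) = CDF(1) - CDF(-1)
prob_within_one_sd = stats.norm.cdf(1) - stats.norm.cdf(-1)

print("Density at 0:", density_at_zero)        # about 0.399
print("P(-1 <= X <= 1):", prob_within_one_sd)  # about 0.683
```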
Key Takeaways
- scipy.stats is your go-to module for statistical work in Python.
- It provides:
  - Summary statistics
  - Hypothesis tests
  - Probability distributions
In the next lessons, we’ll dig deeper into descriptive and inferential statistics with SciPy.