Filtering with Boolean Conditions
Often in data analysis, you want to narrow your dataset to rows that meet specific criteria — like selecting only the rows where sales exceeded $100 or users are located in the US.
Pandas makes this easy using boolean conditions.
How It Works
You write a condition that checks whether each row meets your requirement. The result is a Series of True or False values, which Pandas can use to filter the DataFrame.
For example, to filter for rows where the value in a "Score" column is greater than 80:
df[df["Score"] > 80]
This returns a new DataFrame containing only the rows where that condition is True.
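To see the two steps separately, here is a minimal sketch. The sample DataFrame, its names and scores, and the variable names mask and high_scores are made up for illustration and are not part of the lesson's dataset.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "Name": ["Ana", "Ben", "Cara", "Dev"],
    "Score": [92, 75, 80, 88],
})

# Step 1: the comparison produces a boolean Series, one value per row.
mask = df["Score"] > 80
print(mask.tolist())  # [True, False, False, True]

# Step 2: indexing with that Series keeps only the rows marked True.
high_scores = df[mask]
print(high_scores["Name"].tolist())  # ['Ana', 'Dev']
```

Note that Cara's score of exactly 80 is excluded, because > is a strict comparison; >= would include it.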
Why It's Useful
Filtering helps you:
- Focus on relevant data
- Explore subsets of your dataset
- Prepare data for visualization or modeling
You can also combine conditions using the logical operators & (AND) and | (OR). Each condition must be wrapped in parentheses, because & and | bind more tightly than comparison operators such as > and ==:
df[(df["Age"] > 30) & (df["Country"] == "Canada")]
This selects rows where both conditions are true.
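The difference between & and | is easiest to see side by side. The sketch below uses a small made-up DataFrame (the name people and all of its values are assumptions for illustration only).

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
people = pd.DataFrame({
    "Name": ["Ana", "Ben", "Cara", "Dev"],
    "Age": [34, 28, 41, 25],
    "Country": ["Canada", "Canada", "US", "US"],
})

# AND: a row must satisfy both conditions.
both = people[(people["Age"] > 30) & (people["Country"] == "Canada")]
print(both["Name"].tolist())  # ['Ana']

# OR: a row needs to satisfy at least one condition.
either = people[(people["Age"] > 30) | (people["Country"] == "Canada")]
print(either["Name"].tolist())  # ['Ana', 'Ben', 'Cara']
```

Python's and / or keywords are not a substitute here: they expect a single True or False value, so applying them to two Series raises a ValueError.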
Summary
- Boolean filtering is a powerful tool to isolate rows of interest.
- You can filter using conditions like >, <, ==, !=, etc.
- Combine multiple conditions with & and |, and wrap each condition in parentheses.
What’s Next?
Practice filtering in the Jupyter notebook!