Evaluating a Model's Predictive Performance with F1-Score
In machine learning, precision and recall often pull in opposite directions: tuning a model to increase precision tends to decrease recall, and tuning it to increase recall tends to decrease precision.
To address this trade-off, the F1-Score, which takes the balance between precision and recall into account, is used.
The F1-Score is calculated as follows:
F1-Score = 2 * (Precision * Recall) / (Precision + Recall)
Here, Precision refers to precision and Recall refers to recall.
The F1-Score is the harmonic mean of precision and recall, designed to strike a balance between the two. Thus, if either value is significantly low, the F1-Score decreases as well.
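As a minimal sketch, the formula can be written directly in Python; the function name f1_from_precision_recall is our own, chosen for this example:

def f1_from_precision_recall(precision: float, recall: float) -> float:
    # Harmonic mean of precision and recall, both given in [0, 1].
    if precision + recall == 0:
        return 0.0  # common convention when both metrics are zero
    return 2 * precision * recall / (precision + recall)

print(f1_from_precision_recall(0.8, 0.4))  # 0.533..., pulled toward the lower value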
The Significance of F1-Score
The F1-Score assesses the balance between precision and recall.
It is particularly useful as a performance metric when the class distribution is imbalanced, where it offers a more informative evaluation than accuracy (see the short sketch at the end of this section).
For instance, if precision is 80% and recall is 40%, the simple average of the two metrics would be:
(80% + 40%) / 2 = 60%
However, since the F1-Score is calculated as the harmonic mean of precision and recall, it would be:
2 * (80% * 40%) / (80% + 40%) ≈ 53.3%
Conversely, if both precision and recall are 60%, the F1-Score would also be 60%.
Yet, if one value is drastically lower, the F1-Score would sharply decrease.
For example, if precision is 80% and recall is 10%, the simple average and the F1-Score would be as follows:
Simple Average = (80% + 10%) / 2 = 45%
F1-Score = 2 * (80% * 10%) / (80% + 10%) ≈ 18%
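The arithmetic above can be checked with a short Python snippet that applies the same formula to the three (precision, recall) pairs discussed in this section:

cases = [(0.80, 0.40), (0.60, 0.60), (0.80, 0.10)]
for p, r in cases:
    simple_avg = (p + r) / 2          # arithmetic mean
    f1 = 2 * p * r / (p + r)          # harmonic mean (F1-Score)
    print(f"precision={p:.0%}, recall={r:.0%}: "
          f"simple average={simple_avg:.1%}, F1-Score={f1:.1%}")

Running this prints 60.0% vs 53.3%, 60.0% vs 60.0%, and 45.0% vs 17.8%, matching the examples above.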
As shown, when one value is drastically low, the F1-Score also decreases significantly; it reaches its highest value when the two are balanced.
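To see the earlier claim about imbalanced data in action, here is a small sketch, assuming scikit-learn is installed; the toy labels are our own, invented for illustration:

from sklearn.metrics import accuracy_score, f1_score

# 95 negative samples and 5 positive samples: a heavily imbalanced dataset.
y_true = [0] * 95 + [1] * 5
# A naive model that always predicts the majority (negative) class.
y_pred = [0] * 100

print(accuracy_score(y_true, y_pred))             # 0.95, looks deceptively good
print(f1_score(y_true, y_pred, zero_division=0))  # 0.0, exposes the useless model

Accuracy rewards the majority-class guess, while the F1-Score immediately reveals that the model never finds a single positive sample.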
In the next lesson, we'll complete a quiz on the machine learning model evaluation metrics we've covered so far.