Evaluating Model Predictive Performance with Accuracy

In machine learning, the most fundamental metric used to evaluate classification models is Accuracy.

Accuracy is the proportion of predictions the model gets right out of all the predictions it makes on a dataset.

In other words, it measures how often the model's prediction matches the actual label.

How to Calculate Accuracy

Accuracy is calculated using the following formula:

\text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Data Points}}
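
As a minimal sketch of this formula, the function below counts matching predictions and divides by the total. The name `accuracy` and the sample labels are hypothetical, chosen only for illustration (1 stands for positive, 0 for negative).

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the actual labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    # Count the positions where the prediction equals the actual label.
    correct = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
    return correct / len(y_true)


# Hypothetical data: 1 = positive, 0 = negative
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(accuracy(y_true, y_pred))  # 6 of 8 predictions match -> 0.75
```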

Using the standard terms for the four possible prediction outcomes, the same formula is written as:

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

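Here, T (True) and F (False) indicate whether the model's prediction was correct or incorrect, while P (Positive) and N (Negative) indicate whether the model predicted the presence or absence of a specific attribute, feature, or target state.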

The meaning of each term is as follows:

TP (True Positive)

The case where it is actually positive and the model predicts it as positive.

Example: A cancer diagnosis model correctly identifying an actual cancer patient as having cancer.

TN (True Negative)

The case where it is actually negative and the model predicts it as negative.

Example: A cancer diagnosis model correctly identifying a person without cancer as not having cancer.

FP (False Positive)

The case where it is actually negative but the model mistakenly predicts it as positive.

Example: A cancer diagnosis model mistakenly diagnosing a person without cancer as having cancer.

FN (False Negative)

The case where it is actually positive but the model mistakenly predicts it as negative.

Example: A cancer diagnosis model mistakenly diagnosing an actual cancer patient as not having cancer.

This can be summarized in the following table:

Actual Value	Model Prediction	Term	Meaning
Positive (Cancer)	Positive (Cancer)	True Positive (TP)	Correctly identifying an actual cancer patient as having cancer
Negative (No Cancer)	Negative (No Cancer)	True Negative (TN)	Correctly identifying a person without cancer as not having cancer
Negative (No Cancer)	Positive (Cancer)	False Positive (FP)	Misdiagnosing a person without cancer as having cancer
Positive (Cancer)	Negative (No Cancer)	False Negative (FN)	Misdiagnosing an actual cancer patient as not having cancer
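
The four outcomes in the table can be counted directly from a list of actual labels and a list of predictions. The following is an illustrative sketch, not a library API: `confusion_counts`, its `positive` parameter, and the sample labels are all assumptions made for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN, treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, tn, fp, fn


# Hypothetical diagnoses for six people
y_true = ["cancer", "cancer", "healthy", "healthy", "healthy", "cancer"]
y_pred = ["cancer", "healthy", "healthy", "cancer", "healthy", "cancer"]

tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive="cancer")
print(tp, tn, fp, fn)  # 2 2 1 1
```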

For example, suppose a machine learning model predicts 100 data points with the following results:

TP = 40
TN = 50
FP = 5
FN = 5

In this case, the machine learning model can be said to have a predictive accuracy of 90%.

\text{Accuracy} = \frac{40 + 50}{40 + 50 + 5 + 5} = \frac{90}{100} = 90\%
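
The same arithmetic can be checked in a few lines of Python. This is only a sketch; the helper name `accuracy_from_counts` is hypothetical.

```python
def accuracy_from_counts(tp, tn, fp, fn):
    """Compute accuracy as (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


# Counts from the example above
result = accuracy_from_counts(tp=40, tn=50, fp=5, fn=5)
print(result)  # 0.9
```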

Limitations of Accuracy

Although accuracy is an intuitive and straightforward evaluation metric, it can be misleading when the data is imbalanced, that is, when one class is far more common than the other.

Suppose we have 100 people: 98 are healthy, and 2 have a disease.

What if the model predicts everyone is healthy?

In this case, accuracy is calculated as follows:

TP = 0
TN = 98
FP = 0
FN = 2

\text{Accuracy} = \frac{0 + 98}{0 + 98 + 0 + 2} = \frac{98}{100} = 98\%

The accuracy appears very high at 98%, but the model failed to identify a single patient.

This shows that evaluating a model based solely on accuracy can be misleading.
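
The scenario above can be reproduced with a short sketch. The labels are an assumption made for illustration (1 = has the disease, 0 = healthy), and the "model" is simply a list that predicts healthy for everyone.

```python
# 100 people: 2 have the disease (1), 98 are healthy (0)
y_true = [1] * 2 + [0] * 98

# A trivial "model" that predicts healthy for everyone
y_pred = [0] * 100

# Accuracy: share of predictions that match the actual label
correct = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
accuracy = correct / len(y_true)

# Number of actual patients the model identified (true positives)
patients_found = sum(
    1 for actual, predicted in zip(y_true, y_pred) if actual == 1 and predicted == 1
)

print(accuracy)        # 0.98
print(patients_found)  # 0
```

The accuracy is high only because the healthy class dominates the data; the count of identified patients shows what the accuracy figure hides.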

In the next lesson, we will address the limitations of accuracy with the concept of Precision.
