Mean Squared Error (MSE)

A regression model predicts continuous values, so its performance is evaluated by measuring the error between predictions and actual values rather than by accuracy, which counts exact matches and is suited to classification.

One of the most common metrics for measuring this error is MSE (Mean Squared Error).

In this session, we will take a closer look at MSE, which was briefly mentioned previously. MSE is the average of the squared differences between the predicted and actual values.

How to Calculate Mean Squared Error

MSE is calculated using the following formula:

$$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Here's what each term means:

$n$ : Number of data samples
$y_i$ : Actual value (ground truth)
$\hat{y}_i$ : Predicted value by the model

Thus, the Mean Squared Error is computed by squaring the difference between the predicted and actual value for each data sample, then averaging these squared differences over all samples.

MSE Example
Actual values: [10, 20, 30]
Predicted values: [15, 25, 35]

MSE = ((10-15)² + (20-25)² + (30-35)²) / 3
    = (25 + 25 + 25) / 3
    = 75 / 3
    = 25
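
The calculation above can be reproduced in a few lines of Python. This is a minimal sketch, not a library API: `compute_mse` is a hypothetical helper name introduced here for illustration, and it uses plain lists rather than any particular numerical library.

```python
def compute_mse(actual, predicted):
    """Average of the squared differences between actual and predicted values.

    Hypothetical helper written for this example, not a library function.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one sample is required")
    # Square each error, then average over all samples.
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


actual = [10, 20, 30]
predicted = [15, 25, 35]

mse = compute_mse(actual, predicted)
print(mse)  # 25.0
```

The result matches the hand calculation: three squared errors of 25 each, averaged over three samples.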

How Should MSE Be Interpreted?

MSE measures how close the predictions are to the actual values.

A smaller MSE value indicates that the predictions are closer to the actual values, suggesting better model performance.

Conversely, a larger MSE implies that the model is not accurately predicting the actual values.

Squaring makes every error non-negative, so positive and negative errors cannot cancel each other out in the average. Only the magnitude of each error counts.

Squaring also penalizes larger errors disproportionately: an error twice as large contributes four times as much to the MSE, so extremely inaccurate predictions weigh heavily on the score.
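
Both effects of squaring can be checked with a small Python sketch. The error values below are made up purely for illustration.

```python
# Illustrative error values (actual minus predicted), chosen for this example.
negative_error = -5
positive_error = 5

# Squaring removes the sign: both errors contribute the same amount.
print(negative_error ** 2, positive_error ** 2)  # 25 25

# Squaring magnifies large errors: an error 5 times larger
# contributes 25 times as much to the sum.
small_error = 2
large_error = 10
penalty_ratio = large_error ** 2 / small_error ** 2
print(penalty_ratio)  # 25.0
```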

What Are the Limitations of MSE?

While MSE is very useful for evaluating the performance of regression models, it has some drawbacks.

Sensitivity to Large Errors Due to Squaring

Because each error is squared, even a few large errors can dominate the average and dramatically increase the overall MSE value.

Thus, when a dataset contains outliers, MSE might not be the best metric for model evaluation.
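
To see how much a single outlier can move the score, here is a minimal sketch with made-up data. `compute_mse` is a hypothetical helper defined for this example, and the two prediction lists differ only in their last value.

```python
def compute_mse(actual, predicted):
    """Hypothetical helper: mean of squared differences."""
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


actual = [10, 20, 30, 40, 50]

# Every prediction is off by exactly 1.
predicted_close = [11, 19, 31, 39, 51]

# Same predictions, except the last one is off by 30.
predicted_with_outlier = [11, 19, 31, 39, 80]

mse_close = compute_mse(actual, predicted_close)
mse_with_outlier = compute_mse(actual, predicted_with_outlier)

print(mse_close)         # 1.0
print(mse_with_outlier)  # 180.8
```

Four of the five predictions are identical in both cases, yet the single large error accounts for 900 of the 904 in the squared-error sum.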

Difference in Measurement Units

Because MSE squares the errors, its unit is the square of the original data's unit, which makes the value harder to interpret directly.

For example, if the actual values are measured in cm, the unit for MSE would be cm².

To address this, the Root Mean Squared Error (RMSE), which involves taking the square root of the MSE, is often used.

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

By taking the square root of the Mean Squared Error, the error's unit aligns with the actual value's unit.
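
As a rough illustration, RMSE can be computed by applying `math.sqrt` from the Python standard library to the MSE. The helper names `compute_mse` and `compute_rmse` are assumptions made for this sketch, and the data reuses the values from the MSE example earlier in this lesson.

```python
import math


def compute_mse(actual, predicted):
    """Hypothetical helper: mean of squared differences."""
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


def compute_rmse(actual, predicted):
    """Hypothetical helper: square root of the MSE."""
    return math.sqrt(compute_mse(actual, predicted))


actual = [10, 20, 30]
predicted = [15, 25, 35]

rmse = compute_rmse(actual, predicted)
print(rmse)  # 5.0
```

If the values were measured in cm, the MSE of 25 would be in cm², while the RMSE of 5 is back in cm and matches the size of each individual error in this example.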

In the next session, we'll explore the Mean Absolute Error (MAE), which addresses some limitations of MSE.
