Evaluating Model Performance (accuracy, R²)
Model evaluation measures how closely a trained model's predictions match the true values.
The metric you choose depends on the type of problem:
- Classification → accuracy, precision, recall, F1-score
- Regression → R² (coefficient of determination), MSE, MAE
Scikit-learn provides built-in functions for these metrics in its sklearn.metrics module.
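As a rough illustration of the metrics listed above, the sketch below calls the corresponding sklearn.metrics functions on small hand-made arrays. The labels and values are hypothetical, chosen only so the arithmetic is easy to check by hand; they do not come from a real model.

```python
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    r2_score, mean_squared_error, mean_absolute_error,
)

# Hypothetical binary labels (1 = positive class), made up for illustration
y_true_cls = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred_cls = [1, 0, 0, 1, 0, 0, 1, 0]

cls_scores = {
    "accuracy": accuracy_score(y_true_cls, y_pred_cls),    # 5 of 8 correct
    "precision": precision_score(y_true_cls, y_pred_cls),  # 2 TP / (2 TP + 1 FP)
    "recall": recall_score(y_true_cls, y_pred_cls),        # 2 TP / (2 TP + 2 FN)
    "f1": f1_score(y_true_cls, y_pred_cls),                # harmonic mean of the two
}

# Hypothetical continuous targets and predictions, also made up
y_true_reg = [3.0, 5.0, 7.0, 9.0]
y_pred_reg = [2.5, 5.0, 8.0, 9.5]

reg_scores = {
    "r2": r2_score(y_true_reg, y_pred_reg),
    "mse": mean_squared_error(y_true_reg, y_pred_reg),
    "mae": mean_absolute_error(y_true_reg, y_pred_reg),
}

for name, value in {**cls_scores, **reg_scores}.items():
    print(f"{name}: {value:.3f}")
```

Every function takes the true values first and the predictions second, which is the same calling convention used in the examples in this lesson.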
Classification Example – Accuracy Score
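# JupyterLite (browser) only: install scikit-learn into the in-browser kernel.
# In a regular Python environment, skip these two lines and run `pip install scikit-learn` instead.
import piplite
await piplite.install('scikit-learn')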
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
# Load data
X, y = load_iris(return_X_y=True)
# Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Train classifier
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
# Predict
y_pred = knn.predict(X_test)
# Evaluate
acc = accuracy_score(y_test, y_pred)
print(f"Accuracy: {acc:.2f}")
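Accuracy is the fraction of predictions that are correct. It works well when the classes are roughly balanced, but it can be misleading on imbalanced data, where a model that always predicts the majority class still scores high.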
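To make the imbalance caveat concrete, here is a minimal sketch with made-up labels: 95 negatives and 5 positives, scored against a hypothetical "model" that always predicts the majority class. It is not a trained classifier, only an illustration of how accuracy and recall can disagree.

```python
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced labels: 95 negatives (0) and 5 positives (1)
y_true_imb = [0] * 95 + [1] * 5

# A stand-in "model" that ignores its input and always predicts class 0
y_pred_imb = [0] * 100

acc_imb = accuracy_score(y_true_imb, y_pred_imb)
rec_imb = recall_score(y_true_imb, y_pred_imb)  # recall for the positive class

print(f"Accuracy: {acc_imb:.2f}")  # 0.95
print(f"Recall:   {rec_imb:.2f}")  # 0.00
```

The accuracy looks strong, yet the model finds none of the positive cases, which is why recall, precision, or F1 is worth checking alongside accuracy on imbalanced data.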
Regression Example – R² Score
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
# Synthetic regression data: y = 4 + 3x + Gaussian noise
rng = np.random.RandomState(0)
X_reg = 2 * rng.rand(50, 1)
y_reg = 4 + 3 * X_reg.ravel() + rng.randn(50)
# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X_reg, y_reg, test_size=0.2, random_state=42)
# Train
reg = LinearRegression()
reg.fit(X_train, y_train)
# Predict
y_pred = reg.predict(X_test)
# Evaluate
r2 = r2_score(y_test, y_pred)
print(f"R² Score: {r2:.3f}")
R² measures the proportion of variance in the target that the model explains.
- 1.0 = perfect prediction
- 0 = no better than always predicting the mean of the target
- Negative = worse than always predicting the mean
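The three cases above follow from the usual definition R² = 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the target from its mean. The sketch below computes this by hand and compares it with r2_score. The helper name r2_manual and the toy arrays are assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import r2_score

def r2_manual(y_true, y_pred):
    """Illustrative helper: R² = 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # error left after the model
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # error of predicting the mean
    return 1.0 - ss_res / ss_tot

# Toy targets, made up for illustration
y_toy = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

r2_cases = {
    "perfect": r2_manual(y_toy, y_toy),                             # 1.0
    "mean only": r2_manual(y_toy, np.full_like(y_toy, y_toy.mean())),  # 0.0
    "reversed": r2_manual(y_toy, y_toy[::-1]),                      # -3.0
}

for name, value in r2_cases.items():
    print(f"{name}: {value:.1f}")

# The manual formula agrees with scikit-learn's implementation
print(np.isclose(r2_cases["reversed"], r2_score(y_toy, y_toy[::-1])))
```

Predicting the targets in reverse order is worse than predicting their mean, so SS_res exceeds SS_tot and R² goes negative.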
Key Takeaways
- Use classification metrics for categorical outputs, and regression metrics for continuous outputs.
- Always evaluate on a test set that the model hasn’t seen during training.
- Consider multiple metrics for a more complete evaluation, especially with imbalanced data.
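One way to see why the held-out test set matters is to score the same model on both the data it was trained on and the data it has not seen. The sketch below uses the Iris data and a 1-nearest-neighbor classifier, chosen here only as an assumption to exaggerate the effect. With one neighbor, each training point is typically its own nearest neighbor, so training accuracy is usually at or near 1.0 and says little about how the model generalizes.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# n_neighbors=1 memorizes the training set (an illustrative choice, not a recommendation)
knn_1 = KNeighborsClassifier(n_neighbors=1)
knn_1.fit(X_train, y_train)

# Score on data the model has already seen vs. data it has not
train_acc = accuracy_score(y_train, knn_1.predict(X_train))
test_acc = accuracy_score(y_test, knn_1.predict(X_test))

print(f"Training accuracy: {train_acc:.2f}")
print(f"Test accuracy:     {test_acc:.2f}")
```

Only the test accuracy estimates performance on new data. The training figure mostly reflects how well the model memorized its inputs.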
What’s Next?
In the next lesson, we’ll explore the confusion matrix and the classification report for a deeper look at classification performance.