Introduction to Scikit-learn

Scikit-learn (imported in code as sklearn) is one of the most popular open-source Python libraries for machine learning.
It provides efficient tools for:

Classification
Regression
Clustering
Dimensionality reduction
Model selection
Data preprocessing
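
As a rough orientation, three of these task areas can be tried in a few lines each. The sketch below is illustrative only: the choice of StandardScaler, PCA, and KMeans, the bundled Iris data, and the parameter values (2 components, 3 clusters) are assumptions made for the example, not recommendations from this lesson.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Iris: 150 samples, 4 numeric features
X, y = load_iris(return_X_y=True)

# Data preprocessing: standardize each feature to zero mean and unit variance
X_scaled = StandardScaler().fit_transform(X)

# Dimensionality reduction: project the 4 features onto 2 principal components
X_2d = PCA(n_components=2).fit_transform(X_scaled)

# Clustering: group the samples into 3 clusters without looking at the labels
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_2d)

print(X_2d.shape)
print(sorted(set(labels.tolist())))
```

Classification appears in the K-Nearest Neighbors example later in this lesson.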

Built on top of NumPy, SciPy, and Matplotlib, Scikit-learn is designed to be simple, efficient, and accessible for both beginners and professionals.

Why Use Scikit-learn?

Here are some key reasons why Scikit-learn is a go-to library for ML:

Comprehensive Algorithms – Includes a wide variety of supervised and unsupervised learning methods.
Easy-to-Use API – Models share a consistent interface built around fit, predict, and score.
Preprocessing Tools – Built-in utilities for scaling, encoding, and transforming data.
Model Evaluation – Ready-to-use metrics and validation tools such as cross-validation.
Integration – Accepts NumPy arrays and pandas DataFrames as input.
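
To make the points about the consistent API, preprocessing, and evaluation concrete, here is a minimal sketch. The three classifiers, the 5-fold setting, and the dictionary names (candidates, mean_scores) are assumptions chosen for illustration; any other scikit-learn classifier could be swapped in the same way.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Three different algorithms, all used through the same interface
candidates = {
    "knn": KNeighborsClassifier(n_neighbors=3),
    "logreg": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(random_state=0),
}

mean_scores = {}
for name, estimator in candidates.items():
    # The pipeline bundles scaling with the model, so the scaler is
    # fit only on the training portion of each cross-validation fold
    pipeline = make_pipeline(StandardScaler(), estimator)
    scores = cross_val_score(pipeline, X, y, cv=5)  # accuracy per fold
    mean_scores[name] = scores.mean()
    print(f"{name}: {mean_scores[name]:.3f}")
```

Because every estimator exposes the same methods, comparing models mostly comes down to changing one entry in the dictionary.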

Example: Training a Simple Model

Example: K-Nearest Neighbors Classification
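# JupyterLite only: install scikit-learn into the in-browser kernel.
# Skip these two lines in a regular Python environment (top-level
# await is not valid in a plain script; use `pip install scikit-learn`).
import piplite
await piplite.install('scikit-learn')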

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load dataset
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Create and train model
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# Evaluate: for classifiers, score() returns mean accuracy on the given data
accuracy = model.score(X_test, y_test)
print(f"Accuracy: {accuracy:.2f}")

This example shows how little code is needed to:

Load a dataset
Split it into training and testing sets
Train a machine learning model
Evaluate its performance
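
The same four steps can be extended slightly to look at individual predictions rather than a single accuracy number. The following is a sketch under stated assumptions: the confusion matrix and the new_flower measurements are additions for illustration and are not part of the original example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Same data split and model as in the example above
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# predict() returns one class label per test sample
y_pred = model.predict(X_test)

# accuracy_score on these predictions is the same value score() reports
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
print(cm)

# Classify one new flower (illustrative measurements in cm:
# sepal length, sepal width, petal length, petal width)
new_flower = [[5.1, 3.5, 1.4, 0.2]]
predicted = iris.target_names[model.predict(new_flower)[0]]
print(predicted)
```

A confusion matrix shows which classes get mixed up with each other, which a single accuracy figure hides.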

What’s Next?

In the next lesson, we’ll explore the Machine Learning Workflow and understand the main steps from data preparation to model deployment.
