
The Machine Learning Workflow


A machine learning workflow is a structured process that guides how we move from a raw dataset to a deployed, working model.
Following a clear workflow makes the process more efficient and reproducible, and it typically leads to better results because problems are caught at the stage where they occur.

Rather than listing each stage here, take a look at the whiteboard diagram for a visual breakdown of the workflow steps and their relationships.


Example: Simple Workflow in Scikit-learn

ML Workflow Example: Classification
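# Install scikit-learn (JupyterLite only; in a regular Python environment,
# run `pip install scikit-learn` instead and skip the next two lines)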
import piplite
await piplite.install('scikit-learn')

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# 1. Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# 2. Split into train/test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 3. Choose model
model = KNeighborsClassifier(n_neighbors=3)

# 4. Train model
model.fit(X_train, y_train)

# 5. Evaluate
predictions = model.predict(X_test)
acc = accuracy_score(y_test, predictions)

print(f"Accuracy: {acc:.2f}")

This example demonstrates the core loop of the ML workflow:

  • Data preparation
  • Model selection
  • Training
  • Evaluation
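
To make the model selection stage more concrete, here is a minimal sketch that runs the same four stages with two candidate models instead of one. The logistic regression candidate, the `candidates` and `results` names, and the `max_iter=1000` setting are illustrative assumptions, not part of the lesson's example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Data preparation: same dataset and split as the example above
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Model selection: two candidates (the second one is an illustrative addition)
candidates = {
    "knn_3": KNeighborsClassifier(n_neighbors=3),
    "log_reg": LogisticRegression(max_iter=1000),
}

# Training and evaluation: fit each candidate, then score it on the test set
results = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    results[name] = accuracy_score(y_test, predictions)

for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
```

Comparing candidates directly on the test set keeps this sketch short. In practice, a separate validation set or cross-validation is the usual way to choose between models, so that the test set stays untouched until the final check.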

Key Takeaways

  • A well-structured ML workflow reduces errors and improves reproducibility.
  • The steps are iterative: you might return to earlier stages if performance isn’t satisfactory.
  • Scikit-learn provides tools for almost every stage, from preprocessing to evaluation.
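
The last two takeaways can be sketched together. The example below is one possible way to iterate, assuming a `StandardScaler` preprocessing step, a hand-picked list of `k_values`, and 5-fold cross-validation; none of these choices come from the lesson itself. It revisits the model selection stage several times, then evaluates the chosen setting once on the held-out test set.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Data preparation: hold out a test set that is only used at the very end
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Iterate over candidate settings (these values are an arbitrary illustration)
k_values = [1, 3, 5, 7, 9]
cv_scores = {}
for k in k_values:
    # Preprocessing and model are combined so scaling is fit per training fold
    pipeline = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    scores = cross_val_score(pipeline, X_train, y_train, cv=5)
    cv_scores[k] = scores.mean()

# Pick the setting with the highest mean cross-validation accuracy
best_k = max(cv_scores, key=cv_scores.get)

# Retrain on the full training set and evaluate once on the test set
final_model = make_pipeline(
    StandardScaler(), KNeighborsClassifier(n_neighbors=best_k)
)
final_model.fit(X_train, y_train)
test_acc = final_model.score(X_test, y_test)

print(f"Best k: {best_k}")
print(f"Test accuracy: {test_acc:.2f}")
```

Putting the scaler inside the pipeline means it is refit on each training fold, so no information from the validation fold leaks into preprocessing.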

What’s Next?

In the next lesson, we’ll dive into Supervised vs. Unsupervised Learning to understand the two main types of machine learning.
