
Stochastic Gradient Descent, a Fast and Efficient Learning Method

Stochastic Gradient Descent (SGD) updates the weights of a neural network by randomly selecting one data sample at each step.

Because each update uses only a single sample, each step requires far less computation than processing the whole dataset at once, and the model can converge quickly, making SGD widely used for training on large datasets.


The Process of Stochastic Gradient Descent

Stochastic Gradient Descent is performed by repeating the following steps:

  1. Select one data sample

  2. Compute the loss function

  3. Calculate the gradient

  4. Update the weight

  5. Repeat with the next sample

By repeating this process, the model gradually finds the optimal weights.
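As a rough sketch of these five steps, the Python loop below trains a simple linear model prediction = w * x with a squared-error loss; the toy dataset, starting weight, and learning rate are made-up values for illustration only.

SGD Loop Sketch (Python)
import random

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]  # toy (x, y) pairs
w = 0.5              # hypothetical starting weight
learning_rate = 0.01

for step in range(100):
    x, y = random.choice(data)             # 1. select one data sample at random
    prediction = w * x                     # model output with the current weight
    loss = (y - prediction) ** 2           # 2. compute the loss (squared error)
    gradient = -2 * x * (y - prediction)   # 3. calculate the gradient dLoss/dw
    w = w - learning_rate * gradient       # 4. update the weight
    # 5. the loop repeats with the next randomly selected sample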


Mechanism of Stochastic Gradient Descent

SGD proceeds with learning through the following steps:


1. Selection of Sample and Loss Calculation

Randomly select one sample (x, y) from the dataset and calculate the loss using the current weights.

Loss Function Example
Input data: x = 2.0, Actual value: y = 5.0
Model prediction: 4.2
Loss (MSE) = (5.0 - 4.2)^2 = 0.64
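In Python, this single-sample loss could be computed as follows; the weight value 2.1 is a made-up assumption chosen so that a simple linear model w * x predicts 4.2 for x = 2.0.

Loss Calculation Sketch (Python)
x, y = 2.0, 5.0                 # input data and actual value from the example
w = 2.1                         # hypothetical current weight, so w * x = 4.2
prediction = w * x              # 4.2
loss = (y - prediction) ** 2    # (5.0 - 4.2)^2 = 0.64
print(loss)                     # about 0.64 (small floating-point rounding)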

2. Gradient Calculation

Calculate the gradient of the loss function with respect to the weight to determine how much, and in which direction, the current weight should be adjusted.

Gradient Calculation Example
Current weight: 0.5
Gradient: -0.3
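The weight and gradient above are standalone illustration numbers. For a simple linear model prediction = w * x with squared-error loss (an assumption used only for this sketch), the gradient with respect to the weight can be computed analytically as shown below.

Gradient Calculation Sketch (Python)
x, y = 2.0, 5.0
w = 0.5                               # current weight
prediction = w * x                    # 1.0
# dLoss/dw for loss = (y - w*x)^2 is -2 * x * (y - w*x)
gradient = -2 * x * (y - prediction)
print(gradient)                       # -16.0: a negative gradient means increasing w lowers the loss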

3. Update Weights

Use the gradient to update the weights. The learning rate (α) scales the size of each update.

New weight = Current weight - (Learning rate × Gradient)
Weights Update Example
Current weight: 0.8
Gradient: -0.2
Learning rate: 0.1
New weight: 0.8 - (0.1 * -0.2) = 0.82
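Written directly in Python with the numbers from this example, the update step looks like this:

Weight Update Sketch (Python)
w = 0.8                              # current weight
gradient = -0.2
learning_rate = 0.1
w = w - learning_rate * gradient     # 0.8 - (0.1 * -0.2)
print(w)                             # 0.82 (up to floating-point rounding)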

By repeating this process over the data samples, the weights are gradually optimized.

Stochastic Gradient Descent is an essential optimization technique for rapidly learning from large datasets.

In the next lesson, we will explore Batch Gradient Descent.
