Stochastic Gradient Descent, a Fast and Efficient Learning Method
Stochastic Gradient Descent (SGD) updates the weights of a neural network by randomly selecting one data sample at each step.
Because each update uses only a single sample, this approach requires little computation per step and can converge quickly, making it widely used for training on large datasets.
The Process of Stochastic Gradient Descent
Stochastic Gradient Descent is performed by repeating the following steps:
- Select one data sample
- Compute the loss
- Calculate the gradient
- Update the weight
- Repeat with the next sample
By repeating this process, the model gradually finds the optimal weights.
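To see how these steps fit together, here is a minimal sketch of an SGD loop in Python. It assumes a one-weight linear model (prediction = w * x) and squared-error loss; the dataset, initial weight, and learning rate are illustrative values, not taken from the lesson.

```python
import random

# Illustrative dataset of (input, target) pairs; here target = 2.5 * input,
# so the loop should learn a weight close to 2.5.
data = [(1.0, 2.5), (2.0, 5.0), (3.0, 7.5), (4.0, 10.0)]

w = 0.5               # initial weight (assumed)
learning_rate = 0.01  # small step size keeps the single-sample updates stable

for epoch in range(50):
    random.shuffle(data)                      # visit samples in random order
    for x, y in data:                         # 1. select one data sample
        prediction = w * x
        loss = (y - prediction) ** 2          # 2. compute the loss for this sample
        gradient = -2 * (y - prediction) * x  # 3. calculate the gradient d(loss)/dw
        w = w - learning_rate * gradient      # 4. update the weight
                                              # 5. repeat with the next sample

print(f"learned weight: {w:.3f}")  # approaches 2.5
```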
Mechanism of Stochastic Gradient Descent
SGD proceeds with learning through the following steps:
1. Selection of Sample and Loss Calculation
Randomly select one sample (x, y) from the dataset and calculate the loss using the current weights.
Input data: x = 2.0, Actual value: y = 5.0
Model prediction: 4.2
Loss (MSE) = (5.0 - 4.2)^2 = 0.64
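Written as code, the same calculation looks like this. The weight value 2.1 is an assumption chosen so that the prediction for x = 2.0 matches the 4.2 above; the lesson does not state the weight for this step.

```python
x, y = 2.0, 5.0   # input data and actual value from the example
w = 2.1           # assumed weight so that w * x reproduces the prediction 4.2

prediction = w * x              # 4.2
loss = (y - prediction) ** 2    # (5.0 - 4.2)^2 = 0.64

print(prediction, loss)  # 4.2 0.64 (up to floating-point rounding)
```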
2. Gradient Calculation
Calculate the gradient of the loss function to determine how much the current weight needs to be adjusted.
Current weight: 0.5
Gradient: -0.3
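For a one-weight linear model with squared-error loss, the gradient with respect to the weight is -2 * (y - w * x) * x. The sample values below are assumptions picked so that the result matches the -0.3 shown above, since the lesson gives only the current weight and the resulting gradient.

```python
w = 0.5            # current weight from the example
x, y = 1.0, 0.65   # assumed sample chosen so the gradient comes out to -0.3

prediction = w * x                    # 0.5
gradient = -2 * (y - prediction) * x  # derivative of (y - w*x)^2 w.r.t. w

print(gradient)  # about -0.3
```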
3. Update Weights
Use the gradient to update the weights, multiplying it by the learning rate (α) to control how large each update step is.
Current weight: 0.8
Gradient: -0.2
Learning rate: 0.1
New weight: 0.8 - (0.1 * -0.2) = 0.82
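The update itself is a single line of code. The snippet below simply restates the arithmetic above using the lesson's illustrative numbers.

```python
w = 0.8             # current weight
gradient = -0.2
learning_rate = 0.1

w = w - learning_rate * gradient   # 0.8 - (0.1 * -0.2)
print(w)                           # 0.82
```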
By repeating this process over all data samples, the weights are gradually optimized.
Stochastic Gradient Descent is an essential optimization technique for rapidly learning from large datasets.
In the next lesson, we will explore Batch Gradient Descent.