Determining the Amount of Data to Learn at Once - Batch Size
When studying, tackling too many problems at once can reduce focus and efficiency.
Conversely, working through only one problem at a time can be so slow that it becomes hard to finish the entire syllabus.
Similarly, when training a machine learning model, the hyperparameter that determines the amount of data used in a single training step is called the batch size.
Batch size determines how much data the model will learn from in a single update.
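As a minimal sketch of where this hyperparameter appears in practice (assuming PyTorch and a small made-up dataset, since the lesson does not name a framework), the batch size is typically passed when constructing the data loader:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical dataset: 1,000 samples with 10 features each
features = torch.randn(1000, 10)
labels = torch.randint(0, 2, (1000,))
dataset = TensorDataset(features, labels)

# batch_size controls how many samples are fed to the model per training step
loader = DataLoader(dataset, batch_size=32, shuffle=True)

batch_features, batch_labels = next(iter(loader))
print(batch_features.shape)  # torch.Size([32, 10])
```

Here each iteration of the loader hands the model 32 samples, and one weight update is made per batch.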
Why Batch Size is Important
The model learns by receiving data and adjusting its weights.
The way the model processes data changes depending on the batch size.
When the Batch Size is Too Small
With a small batch size, each training step uses only a little data, so memory usage stays low.
However, because each update is based on only a few samples, the gradient estimates are noisy, which can make training unstable.
This can also slow progress toward finding the optimal weights.
When the Batch Size is Too Large
Using a large amount of data in each training step makes learning more stable.
However, it increases memory usage, and the model updates its weights less often, since one pass over the dataset now produces fewer batches.
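To make this trade-off concrete, here is a small back-of-the-envelope sketch (assuming a dataset of 10,000 samples, a number chosen purely for illustration) showing how the number of weight updates per epoch shrinks as the batch size grows:

```python
dataset_size = 10_000  # assumed size, for illustration only

for batch_size in (16, 128, 1024):
    # The model updates its weights once per batch, so updates per epoch
    # equals the number of batches needed to cover the whole dataset.
    updates_per_epoch = -(-dataset_size // batch_size)  # ceiling division
    print(f"batch_size={batch_size:>5}: {updates_per_epoch} updates per epoch")
```

With a batch size of 16 the model adjusts its weights 625 times per epoch, while with 1,024 it does so only 10 times, each time processing far more data at once.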
How to Set the Batch Size
Batch sizes are typically powers of two, like 32, 64, or 128, which are favorable for optimizing calculations on hardware accelerators such as GPUs.
When setting the batch size, weigh the trade-offs above: available memory, training stability, and how often the weights are updated.
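For example, if you happen to be training with Keras, the batch size is passed directly to model.fit; the sketch below uses a hypothetical two-layer model and random data only to show where the value goes:

```python
import numpy as np
from tensorflow import keras

# Hypothetical training data: 1,000 samples with 10 features each
x_train = np.random.rand(1000, 10)
y_train = np.random.randint(0, 2, size=(1000,))

model = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# batch_size=64: the weights are updated once for every 64 samples
model.fit(x_train, y_train, batch_size=64, epochs=1)
```

Changing the batch_size argument to another power of two, such as 32 or 128, is all it takes to experiment with the trade-offs described above.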
In the next lesson, we will learn about epochs, which determine how many times the model learns from the entire dataset.