Determining the Amount of Data to Learn at Once - Batch Size
When studying, tackling too many problems at once can reduce focus and efficiency.
Conversely, working through only one problem at a time can make it hard to get through the whole syllabus.
Similarly, when training a machine learning model, the hyperparameter that determines the amount of data used in a single training step is called the batch size.
Batch size determines how much data the model will learn from in a single update.
Why Batch Size is Important
The model learns by receiving data and adjusting its weights.
How the model processes that data depends on the batch size.
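To make this concrete, here is a minimal sketch of one epoch of mini-batch training for a simple linear model. The data, learning rate, and batch size of 32 are illustrative values chosen only for the example.

```python
import numpy as np

# Illustrative setup: 1,000 samples, one linear weight, batch size of 32.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=1000)

batch_size = 32        # hyperparameter: samples used for each weight update
learning_rate = 0.01
w = 0.0                # the single weight the model adjusts

# One epoch of mini-batch gradient descent: the weight is updated once per
# batch, so a smaller batch size means more frequent (but noisier) updates.
for start in range(0, len(X), batch_size):
    xb = X[start:start + batch_size, 0]
    yb = y[start:start + batch_size]
    pred = w * xb
    grad = 2 * np.mean((pred - yb) * xb)   # gradient of the mean squared error
    w -= learning_rate * grad

print(f"Weight after one epoch: {w:.3f}")
```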
When the Batch Size is Too Small
Each training step uses only a small amount of data, so memory usage stays low.
However, the weights are updated very frequently and each update is based on only a few samples, so training can be noisy and unstable.
It can also take a long time to reach good weights.
When the Batch Size is Too Large
Using a large amount of data in each training step averages out noise, so learning is more stable.
However, it increases memory usage, and each weight update takes longer because more samples must be processed per step.
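To make the trade-off concrete, the short sketch below counts how many weight updates one pass over the data produces for a small and a large batch size; the dataset size of 60,000 is just an illustrative figure.

```python
import math

dataset_size = 60_000   # illustrative: 60,000 training samples

for batch_size in (16, 512):
    updates_per_epoch = math.ceil(dataset_size / batch_size)
    print(f"batch_size={batch_size:>3}: {updates_per_epoch} weight updates per epoch")

# batch_size= 16: 3750 weight updates per epoch -> frequent, noisy updates, low memory per step
# batch_size=512: 118 weight updates per epoch  -> fewer, smoother updates, more memory per step
```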
How to Set the Batch Size
Batch sizes are typically powers of two, such as 32, 64, or 128, which are favorable for optimizing calculations on hardware accelerators such as GPUs.
When setting the batch size, weigh memory usage, training stability, and update speed against one another, and adjust for the hardware and dataset you are working with.
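As one example of where this setting appears in practice, the sketch below passes a power-of-two batch size to a PyTorch DataLoader; the data shapes are illustrative, and other frameworks expose the same idea (for instance, the batch_size argument of Keras model.fit).

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Illustrative data: 1,000 samples with 20 features each.
features = torch.randn(1000, 20)
labels = torch.randint(0, 2, (1000,))
dataset = TensorDataset(features, labels)

# The batch size is set here; 64 is a common power-of-two starting point.
loader = DataLoader(dataset, batch_size=64, shuffle=True)

for batch_features, batch_labels in loader:
    # Each iteration yields one batch; a model would perform one weight update here.
    print(batch_features.shape)   # torch.Size([64, 20]) for full batches
    break
```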
In the next lesson, we will learn about epochs, which determine how many times the model trains on the entire dataset.