Key Components of CNN
CNNs are composed of several types of layers, each with its own role in turning an image into a prediction.
The key components of a CNN are the Convolutional Layer, Pooling Layer, Activation Function, and Fully Connected Layer.
1. Convolutional Layer
Convolution is the process of analyzing small sections of an image one by one.
When you zoom in on a photograph, you'll see that it's made up of small squares (pixels), and CNNs analyze these pixels using several small windows (filters).
Filters identify important parts of the image, like the shape of eyes or edge lines in a face image.
The convolution operation uses filters like magnifying glasses to closely examine certain parts of the image, identifying where specific features are located.
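The Convolutional Layer repeats this process at every position of the image, usually with several different filters, to extract the image's key features. Each filter produces its own map of where its feature was found.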
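The sliding-window idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not how a deep learning library implements it: the `convolve2d` helper, the 4x4 image, and the `vertical_edge` filter are all made up for this example, and the sketch assumes no padding and a step of one pixel. As in most CNN libraries, the filter is applied without flipping it.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the filter with the patch under it and sum the results.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The large values in the middle column mark the position where the image changes from dark to bright, which is the feature this filter looks for.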
2. Pooling Layer
Pooling is the process of reducing the image size to decrease the computational load.
It's similar to reducing a photo's size while maintaining as much quality as possible.
In the pooling process, the Max Pooling method is often used, where the largest value (e.g., the brightest spot) in a small region is selected as that region's representative value.
For example, if a region contains the four numbers 2, 7, 3, and 6, Max Pooling selects the largest value, 7.
Input: [2, 7, 3, 6]
Output: 7 (Select the largest value)
The pooling layer reduces the size of the image while preserving the features extracted from the Convolutional Layer, thus reducing the computational load.
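Applied to a whole grid of values, Max Pooling might look like the sketch below. The `max_pool2d` helper and the 4x4 input are hypothetical, and the sketch assumes non-overlapping 2x2 regions, which is a common but not the only choice.

```python
def max_pool2d(feature_map, size=2):
    """Split the map into non-overlapping size x size regions and keep each region's maximum."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            region = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(region))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 input; the top-left 2x2 region holds 2, 7, 3, 6.
feature_map = [
    [2, 7, 1, 0],
    [3, 6, 4, 5],
    [9, 1, 2, 2],
    [0, 3, 8, 1],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[7, 5], [9, 8]]
```

The 4x4 input shrinks to 2x2, so later layers have a quarter as many values to process, while the strongest response in each region is kept.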
3. Activation Function
CNNs use Activation Functions to enable the neural network to learn complex patterns. The most widely used activation function is ReLU.
ReLU Function: An activation function that converts negative values to zero while keeping positive values unchanged, helping the model learn faster.
Activation functions add non-linearity to each layer's output. Without them, stacked layers would behave like a single linear operation and could not capture complex patterns.
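ReLU itself is simple enough to write by hand. The sketch below applies it to a small grid of made-up filter responses; the `relu` and `relu_map` names and the input values are assumptions for illustration only.

```python
def relu(x):
    """Return x if it is positive, otherwise 0."""
    return max(0.0, x)


def relu_map(feature_map):
    """Apply ReLU to every value in a 2D grid."""
    return [[relu(value) for value in row] for row in feature_map]


# Hypothetical filter responses, some negative and some positive.
responses = [
    [-3.0, 2.0],
    [0.5, -1.5],
]

activated = relu_map(responses)
print(activated)  # [[0.0, 2.0], [0.5, 0.0]]
```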
4. Fully Connected Layer
At the end of a CNN is the Fully Connected Layer.
In a Fully Connected Layer, every neuron is connected to every neuron in the previous layer, and it performs the final prediction based on the features extracted through convolution and pooling.
Input Image: Cat Picture
Output Probability:
Cat: 0.85 (85%)
Dog: 0.15 (15%)
The Fully Connected Layer produces one score per class. These scores are converted into probabilities (typically with a softmax function), and the class with the highest probability, here Cat, is taken as the final prediction.
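The final step can be sketched as follows. Everything here is illustrative: the feature values, weights, and biases are invented rather than learned, the helper names are hypothetical, and softmax is assumed as the way scores become probabilities. A trained network would learn its weights from data, so the probabilities will not match the 0.85 and 0.15 shown above.

```python
import math


def fully_connected(features, weights, biases):
    """Compute one score per class as a weighted sum of all input features plus a bias."""
    return [
        sum(w * x for w, x in zip(class_weights, features)) + b
        for class_weights, b in zip(weights, biases)
    ]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


classes = ["Cat", "Dog"]

# Hypothetical features left after convolution and pooling, flattened into a list.
features = [1.0, 2.0, 0.5]

# Made-up weights: one row per class, one weight per feature.
weights = [
    [0.8, 0.6, 0.2],  # Cat
    [0.1, 0.3, 0.4],  # Dog
]
biases = [0.0, 0.0]

scores = fully_connected(features, weights, biases)
probabilities = softmax(scores)
prediction = classes[probabilities.index(max(probabilities))]

for name, p in zip(classes, probabilities):
    print(f"{name}: {p:.2f}")
print("Prediction:", prediction)
```

Each class score depends on every feature, which is what "fully connected" means, and the prediction is simply the class whose probability is largest.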
Thus, CNNs extract image features through convolution, activation functions, and pooling, and then classify the image with the fully connected layer.
In the next lesson, we'll dive deeper into the details of the convolution operation.