The Role of Activation Functions
When working with neural networks, you will often come across the term activation function.
In a neural network, activation functions play a crucial role by transforming the input values to determine the output.
A perceptron or artificial neural network cannot solve complex problems by merely adding up input values.
Activation functions help neural networks learn not just simple rules but also more complex patterns.
1. Why Activation Functions Are Needed
In a neural network, a neuron's task is to receive input values, multiply them by weights, add them up, and then output the result.
Without activation functions, all neurons would only perform linear operations, which involve simply multiplying and adding input values.
These linear operations essentially have the effect of separating data with a single line (or plane).
Therefore, even if multiple layers are added, the same kind of linear operations repeat, and the neural network cannot learn more complex patterns.
However, in reality, many non-linear problems cannot be separated by a single line.
For example, consider the problem of separating points inside and outside a circle into different groups.
In this case, it is impossible to divide the data with a single line.
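To make the "repeating linear operations" point concrete, here is a minimal NumPy sketch (the weights, biases, and input are made up for illustration) showing that two stacked linear layers with no activation in between collapse into a single equivalent linear layer:

```python
import numpy as np

# Made-up weights and biases for two "layers" with no activation between them.
W1 = np.array([[1.0, -2.0], [0.5, 3.0]])
b1 = np.array([0.1, -0.2])
W2 = np.array([[2.0, 1.0], [-1.0, 0.5]])
b2 = np.array([0.3, 0.0])

x = np.array([1.5, -0.7])  # an arbitrary input vector

# Passing x through both linear layers...
out_two_layers = W2 @ (W1 @ x + b1) + b2

# ...gives exactly the same result as one combined linear layer.
W = W2 @ W1
b = W2 @ b1 + b2
out_one_layer = W @ x + b

print(np.allclose(out_two_layers, out_one_layer))  # True
```

No matter how many such layers are stacked, the result is still a single line (or plane), which is why an activation function is needed in between.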
2. How Activation Functions Transform Data
Activation functions go beyond simple multiplication and addition by transforming the output, enabling neural networks to learn curves or complex boundaries.
In other words, the core role of activation functions is to transform the input according to certain rules rather than just passing it through directly.
What Does It Mean for Output Values to Have Turning Points?
For instance, a popular activation function like ReLU (Rectified Linear Unit) outputs zero when the input is less than zero and passes the input through unchanged when it is greater than zero.
This function has the property that its behavior changes abruptly depending on whether the input is below or above zero.
Thus, the graph takes a form that 'bends' at zero.
This sudden change in slope is where non-linearity is introduced.
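As a concrete reference, here is a minimal ReLU written in plain Python; the sample inputs are chosen simply to show the 'bend' at zero:

```python
def relu(x):
    """Return 0 for negative inputs, and the input itself otherwise."""
    return max(0.0, x)

# Negative inputs are cut off to zero; positive inputs pass through unchanged.
for x in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    print(x, "->", relu(x))
# -2.0 -> 0.0, -0.5 -> 0.0, 0.0 -> 0.0, 0.5 -> 0.5, 2.0 -> 2.0
```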
3. How Activation Functions Enable Learning of Complex Patterns
Without activation functions, the same type of linear operations would be repeated in every layer, greatly limiting the patterns a neural network can learn.
However, by using activation functions, data get non-linearly transformed as they pass through layers, allowing neural networks to learn boundaries that curve rather than only straight lines.
Take a neural network with two layers as an example (a small code sketch follows this list):
- The first layer tries to divide the data with a single line, but it cannot do so perfectly.
- When an activation function is applied to the second layer, it introduces turning points, creating more diverse boundaries to classify data.
- Repeating this allows the network to separate data using curves or multiple boundaries instead of just straight lines.
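As a small illustration of this idea, the sketch below uses hand-picked (not learned) weights: a hidden ReLU layer followed by a linear output layer computes |x|, a V-shaped function that no single linear layer could produce on its own:

```python
import numpy as np

def relu(v):
    return np.maximum(0.0, v)

def two_layer_net(x):
    # Hidden layer: two neurons, one looking at x and one at -x.
    # The weights are hand-picked for illustration, not learned.
    hidden = relu(np.array([x, -x]))
    # Output layer: a plain linear combination of the hidden activations.
    return hidden[0] + hidden[1]  # equals relu(x) + relu(-x) = |x|

for x in [-2.0, -1.0, 0.0, 1.0, 2.0]:
    print(x, "->", two_layer_net(x))
# The outputs trace out |x|: 2.0, 1.0, 0.0, 1.0, 2.0 -- a bend no single line can make.
```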
Thanks to activation functions, neural networks can learn complex patterns.
In the next lesson, we will explore the sigmoid function, one of the most commonly used activation functions.