
Neural Network: What is Activation Function? How does Activation Function work?

Photo by Google DeepMind.


A Neural Network gives us a way to teach a model to respond to tasks the way a human would. The goal is not just to perform a task but to mimic the structure and function of the human brain. Let’s jump into Neural Networks and explore how they can change the world by enabling models to perform human-like tasks.

What’s Next!

In this article, we will cover the basics of Neural Networks and then highlight related terms such as the activation function and its main types.

Before jumping right to the Activation function, let’s take a look at Neural Network!

Table of Contents

What is a Neural Network?

The term Neural Network belongs to Deep Learning and is based on the function of the human brain. A Neural Network mimics the brain’s structure and function, learning patterns in much the way the brain does. It is an algorithm inspired by the human brain, used to train a model to act like a human.

A Neural Network is divided into three kinds of layers: input, hidden, and output. Each layer is made up of nodes; the nodes of the hidden layers are commonly called neurons.

The input layer transfers the input, or raw data, to the hidden layer. The hidden layer then performs two steps: first, it computes the weighted summation of its inputs; second, it passes that weighted summation through an activation function to predict an output. Finally, the prediction is sent to the output layer to determine the result.
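The two hidden-layer steps can be sketched for a single neuron. This is a minimal illustration; the input values, weights, and bias below are made up for the example, and sigmoid is just one possible activation function.

```python
import math

# Illustrative values for one hidden-layer neuron (not from a real model).
inputs = [0.5, -1.2, 3.0]    # raw data arriving from the input layer
weights = [0.4, 0.7, -0.2]   # one weight per input
bias = 0.1

# Step 1: the weighted summation of the inputs.
z = sum(w * x for w, x in zip(weights, inputs)) + bias

# Step 2: pass the weighted summation through an activation function.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

output = sigmoid(z)  # a value between 0 and 1, sent toward the output layer
```

The same two steps repeat for every neuron in the hidden layer; only the weights and bias differ.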

What is an Activation Function?

Image credit: v7labs

An Activation Function computes whether a neuron should be activated or not, based on the raw data transferred from the input layer to the hidden layer. The predicted output is then passed to the output layer, where it is compared against the actual output to produce the final result.

In other words, the role of the activation function is to take the weighted summation computed from the inputs, transform it into a prediction, and send that prediction on to the output layer.

How does an Activation Function work?

The Activation Function operates across the three types of layers: input, hidden, and output.

The first step is to receive the raw data from the input layer. Next, the raw data is transferred to the hidden layer, where the weighted summation is computed and passed through the activation function to predict the output. The predicted output is then sent to the output layer, which represents the desired output.

Note: Which activation function to apply is decided by the nature of the raw data.

What are feedforward propagation and backpropagation?

Image credit: v7labs

The procedure of transferring information through the network from the input layer to the output layer is known as feedforward propagation.

The procedure of adjusting the weights and biases is known as backpropagation. It reduces the difference between the predicted and actual output so the network can deliver the correct result.
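One backpropagation step can be sketched for a single sigmoid neuron with a squared-error loss. All numbers here are illustrative, and a real network repeats this update across many weights and many examples.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, target = 1.5, 1.0   # one input and its desired (actual) output
w, b = 0.2, 0.0        # initial weight and bias (made-up starting values)
lr = 0.5               # learning rate

# Feedforward pass: weighted summation, then activation.
z = w * x + b
pred = sigmoid(z)

# Backward pass: chain rule for d(error)/dw, with error = (pred - target)**2.
d_pred = 2 * (pred - target)   # derivative of the squared error
d_z = pred * (1 - pred)        # derivative of the sigmoid
grad_w = d_pred * d_z * x
grad_b = d_pred * d_z

# Update: nudge the weight and bias to shrink the error.
w -= lr * grad_w
b -= lr * grad_b
```

After the update, running the feedforward pass again gives a prediction closer to the target, which is exactly the "reduce the difference" behaviour described above.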

What are the types of Activation Functions?

Let’s consider the two most common activation functions:

1. Sigmoid

2. ReLU 

Sigmoid function: The sigmoid function maps the output to a value between 0 and 1. The weighted summation is passed to the sigmoid function: if the result is less than 0.5, the output is closer to 0 and the neuron is not activated; if the result is greater than 0.5, the output is closer to 1 and the neuron is activated.

Image credit: v7labs

The more positive the weighted summation, the more likely the neuron is to be activated; the more negative it is, the more likely the neuron stays deactivated.

Let’s see the reasons to use the Sigmoid function:

  1. The Sigmoid function is used because of its 0-to-1 range. It is mainly used when the model predicts the output as a probability: thanks to this range, the probability of an output is always kept between 0 and 1.


  2. The function is represented by an S-shaped curve on the graph, which makes it easy to compare outputs between 0 and 1 because the sigmoid function stops values from jumping too high.

ReLU function: The Rectified Linear Unit (ReLU) function takes the weighted summation and outputs the larger of that value and 0, using the max() operation.

Image credit: v7labs

ReLU function formula: max(0, x), where x is the weighted summation. If x is negative, the output is clipped to 0; if x is positive, the output is that positive value itself.

The key point is that the ReLU function does not activate all neurons at the same time: it deactivates any neuron whose weighted summation is negative, i.e., less than 0.

Let’s see the reasons to use the ReLU function:

  1. The ReLU function activates only some neurons instead of all of them, so its output is much cheaper to compute than that of other activation functions such as sigmoid and tanh.
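The max(0, x) formula above translates directly into code. This is a minimal sketch; the sample inputs are arbitrary.

```python
def relu(z):
    """Rectified Linear Unit: max(0, z)."""
    return max(0.0, z)

# Negative weighted summations are clipped to 0 (neuron stays inactive);
# positive ones pass through unchanged (neuron activates).
print(relu(-2.5))  # 0.0
print(relu(3.7))   # 3.7
```

Because the function is just a comparison against 0, it avoids the exponential in sigmoid and tanh, which is why it is so cheap to compute.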
