Deep Learning — Activation Functions (Part-6)

Anurag
3 min read · Jul 31, 2024


Activation functions are a fundamental component of artificial neural networks in machine learning. They introduce non-linearity into the network, which allows it to learn a wide range of complex relationships between input and output. An activation function is applied to the weighted sum computed by each neuron, and its value determines whether the neuron activates and contributes to the output of the network.
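
To make this concrete, here is a minimal NumPy sketch of a fully connected layer (the names, shapes, and values are illustrative choices, not from a particular library): each layer computes a weighted sum and then passes it through an activation function, and stacking layers is only more expressive than a single linear map because of that non-linearity.

```python
import numpy as np

def dense_layer(x, W, b, activation):
    """One fully connected layer: weighted sum followed by a non-linearity."""
    z = W @ x + b          # pre-activation (weighted sum plus bias)
    return activation(z)   # element-wise activation function

relu = lambda z: np.maximum(0.0, z)

# Two stacked layers; without the activation, the composition would
# collapse into a single linear transformation.
x = np.array([0.5, -1.2, 3.0])
W1, b1 = np.random.randn(4, 3), np.zeros(4)
W2, b2 = np.random.randn(2, 4), np.zeros(2)
out = dense_layer(dense_layer(x, W1, b1, relu), W2, b2, relu)
print(out)
```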

Several different types of activation functions are commonly used in machine learning, each with its own properties and characteristics. Some of the most popular are listed below; a short code sketch of each follows the list:

  1. Sigmoid: The sigmoid function is a widely used activation function that maps any input value to a value between 0 and 1. It is defined as f(x) = 1 / (1 + e^(-x)), where e is the base of the natural logarithm. The sigmoid function is useful for problems where the output should be a probability, such as binary classification. However, it suffers from vanishing gradients for large positive or negative inputs, which makes it difficult to adjust the weights of earlier layers during backpropagation.
  2. ReLU (Rectified Linear Unit): The ReLU activation function is defined as f(x) = max(0, x). It maps any negative input value to 0, and any positive input value to itself. ReLU has proven to be an efficient and practical activation function in deep learning, as it increases the convergence speed and allows the neural network to learn sparse representations.
  3. Tanh (hyperbolic tangent): The tanh activation function is similar to the sigmoid function, but it maps input values to the range between -1 and 1. It is defined as f(x) = 2 / (1 + e^(-2x)) - 1. The tanh function is useful when the output should be centered around zero, for example when working with normalized, zero-centered input data.
  4. Leaky ReLU: An extension of ReLU that addresses the "dying ReLU" problem, in which some neurons become dead, that is, they stop producing non-zero outputs and stop learning. Leaky ReLU avoids this by giving the negative part a small slope instead of zero, in the form f(x) = max(αx, x), where α is a small positive constant.
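
The four functions above are short enough to write out directly. The following is a minimal NumPy sketch; the α default of 0.01 is a common choice rather than anything prescribed here.

```python
import numpy as np

def sigmoid(x):
    # Squashes any real input into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Zero for negative inputs, identity for positive inputs.
    return np.maximum(0.0, x)

def tanh(x):
    # Same as 2 / (1 + exp(-2x)) - 1; output is zero-centered in (-1, 1).
    return np.tanh(x)

def leaky_relu(x, alpha=0.01):
    # max(alpha * x, x): keeps a small slope on the negative side.
    return np.maximum(alpha * x, x)

x = np.linspace(-2.0, 2.0, 5)
for name, fn in [("sigmoid", sigmoid), ("relu", relu),
                 ("tanh", tanh), ("leaky_relu", leaky_relu)]:
    print(name, np.round(fn(x), 3))
```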

In choosing an activation function, it is important to consider the specific characteristics and requirements of the problem at hand and the properties of the activation function itself. In many cases, the choice of activation function can significantly impact the performance of the neural network, and it may take some experimentation to find the best activation function for a particular problem.

In conclusion, activation functions are a key component of artificial neural networks in machine learning. They introduce non-linearity into the network, which allows it to learn a wide range of complex relationships between input and output. There are several different types of activation functions available, each with its properties and characteristics. The choice of activation function can significantly impact the network’s performance.
