Deep Learning — AlexNet
AlexNet is a convolutional neural network (CNN) architecture developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton in 2012. The network was trained on the large ImageNet dataset, which contains over 1 million images across 1000 classes, and achieved a top-5 error rate of 15.3%, more than 10 percentage points better than the runner-up's 26.2% and far ahead of the state of the art at the time.
One of the main innovations of AlexNet is the use of ReLU (Rectified Linear Unit) activations in the hidden layers. Unlike the saturating sigmoid and tanh functions, ReLU is cheap to compute and does not squash large inputs, so gradients flow and training converges much faster. AlexNet also uses local response normalization (LRN), which normalizes each neuron's activation by the summed squared activations of neighboring channels at the same spatial position, a form of lateral inhibition that the authors found improved generalization.
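As an illustration, here is a minimal sketch of these two layers in PyTorch (the framework choice is an assumption; the original was implemented in custom GPU code). The LRN hyperparameters are the ones reported in the paper (k=2, n=5, alpha=1e-4, beta=0.75):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 96, 55, 55)  # a dummy CONV1 output (N, C, H, W)

relu = nn.ReLU()  # max(0, x): non-saturating and cheap to compute
# LRN with the hyperparameters from the AlexNet paper: each activation is
# divided by a term built from the summed squares of activations in the
# 5 neighboring channels at the same (x, y) position.
lrn = nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0)

y = lrn(relu(x))
print(y.shape)  # torch.Size([1, 96, 55, 55]) - the shape is unchanged
```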
Another key innovation of AlexNet is dropout: during training, each neuron in the first two fully connected layers is randomly set to zero with probability 0.5. Because a neuron can no longer rely on the presence of any particular other neuron, the network is pushed to learn more robust, redundant representations of the input, which reduces overfitting.
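The behavior is easy to see in isolation. In the sketch below (again assuming PyTorch), the same layer zeroes units during training and becomes a no-op at inference; note that the paper scaled activations at test time, whereas modern frameworks apply the equivalent "inverted" scaling during training:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)  # AlexNet used p = 0.5 in FC6 and FC7
x = torch.ones(8)

drop.train()   # training mode: each unit is zeroed with probability 0.5
print(drop(x)) # survivors are scaled by 1/(1 - p) = 2 ("inverted" dropout)

drop.eval()    # evaluation mode: dropout becomes the identity
print(drop(x)) # tensor of ones
```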
The architecture of AlexNet has 8 learned layers: 5 convolutional layers followed by 3 fully connected layers. The first convolutional layer has 96 filters of size 11x11 (applied with stride 4), the second has 256 filters of size 5x5, and the remaining three use 3x3 filters (384, 384, and 256 filters, respectively). The first two fully connected layers have 4096 units each, and the last one feeds a 1000-way softmax used for classification. The overall architecture is illustrated below:
CONV1 -> NORM1 -> POOL1 -> CONV2 -> NORM2 -> POOL2 -> CONV3 -> CONV4 -> CONV5 -> POOL5 -> FC6 -> FC7 -> FC8 -> Softmax
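The whole pipeline can be written down compactly. Below is a minimal single-stream sketch in PyTorch (an assumed framework): the original split CONV2, CONV4, and CONV5 across two GPUs using grouped convolutions, which is omitted here, and the input is taken as 227x227 so that the stated filter sizes and strides line up:

```python
import torch
import torch.nn as nn

# A simplified, single-GPU sketch of AlexNet following the diagram above.
alexnet = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4),     # CONV1: 227x227x3 -> 55x55x96
    nn.ReLU(inplace=True),
    nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0),  # NORM1
    nn.MaxPool2d(kernel_size=3, stride=2),          # POOL1: -> 27x27x96
    nn.Conv2d(96, 256, kernel_size=5, padding=2),   # CONV2: -> 27x27x256
    nn.ReLU(inplace=True),
    nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0),  # NORM2
    nn.MaxPool2d(kernel_size=3, stride=2),          # POOL2: -> 13x13x256
    nn.Conv2d(256, 384, kernel_size=3, padding=1),  # CONV3
    nn.ReLU(inplace=True),
    nn.Conv2d(384, 384, kernel_size=3, padding=1),  # CONV4
    nn.ReLU(inplace=True),
    nn.Conv2d(384, 256, kernel_size=3, padding=1),  # CONV5
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),          # POOL5: -> 6x6x256
    nn.Flatten(),                                   # -> 9216 features
    nn.Dropout(p=0.5),
    nn.Linear(256 * 6 * 6, 4096),                   # FC6
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.5),
    nn.Linear(4096, 4096),                          # FC7
    nn.ReLU(inplace=True),
    nn.Linear(4096, 1000),                          # FC8: logits for 1000 classes
)

# The softmax is left to the loss function (e.g. nn.CrossEntropyLoss).
logits = alexnet(torch.randn(1, 3, 227, 227))
print(logits.shape)  # torch.Size([1, 1000])
```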
AlexNet was a breakthrough in deep learning and computer vision. It marked a turning point in the use of deep neural networks by showing that they could outperform the state of the art on a challenging large-scale dataset. Its innovations in architecture and training, including ReLU activations, local response normalization, dropout, and training on large-scale data, paved the way for later advances and for more sophisticated architectures such as VGG, GoogLeNet, and ResNet.
In conclusion, AlexNet is a convolutional neural network (CNN) architecture developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton in 2012. Trained on the ImageNet dataset, it achieved a top-5 error rate of 15.3%. The architecture was innovative for its time because of its ReLU activations, local response normalization, and dropout, which allowed the network to reach high accuracy while keeping overfitting in check. It has served as an inspiration for many architectures that came after it.
AlexNet Features:
- Image augmentation, such as random cropping, horizontal flipping, and PCA-based color alterations (see the sketch after this list).
- Dropout in the fully connected layers to reduce overfitting.
- ReLU nonlinearities, which let deep CNNs train much faster than saturating activations such as tanh or sigmoid.
- Convolutional layers for feature extraction, from low-level edges and textures to high-level object parts.
- Fully connected layers for classification (or regression, depending on the task).
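For the augmentation item above, here is a rough modern analogue using torchvision (an assumed library). The paper's exact color scheme was a PCA-based shift of the RGB channels; ColorJitter is used here only as a simpler stand-in, not as the paper's method:

```python
import torchvision.transforms as T

# Training-time augmentation roughly in the spirit of AlexNet:
# random crops from a 256-pixel resize, horizontal flips, and a
# color perturbation (the paper used PCA-based lighting noise).
train_transform = T.Compose([
    T.Resize(256),
    T.RandomCrop(227),
    T.RandomHorizontalFlip(),
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    T.ToTensor(),
])
```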