ImageNet Classification with Deep Convolutional Neural Networks
Krizhevsky, Sutskever & Hinton (NeurIPS 2012)
What it says
The authors train an 8-layer CNN (five convolutional layers, three fully connected) with 60M parameters on ImageNet-1k, split across two GTX 580 GPUs. They introduce (or popularize) several tricks: ReLU activations instead of saturating tanh, dropout in the fully-connected layers, data augmentation with random crops and horizontal flips, local response normalization, and overlapping pooling. The network cuts top-5 error from the previous best of ~26% to 15.3%. A landslide.
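Three of those tricks can be sketched in a few lines of plain Python. This is a toy illustration, not the paper's implementation: the function names are mine, and the dropout shown is the modern "inverted" variant (scale survivors at train time), whereas the paper instead halves activations at test time.

```python
import random

def relu(x):
    # ReLU: max(0, x) -- replaces saturating tanh, which sped up training
    return max(0.0, x)

def dropout(activations, p=0.5, training=True, rng=random):
    # Inverted dropout on a fully-connected layer's outputs: each unit is
    # zeroed with probability p during training, and survivors are scaled
    # by 1/(1-p) so no rescaling is needed at inference time.
    if not training:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def random_crop_and_flip(img, size, rng=random):
    # Data augmentation: take a random size x size crop from a 2-D image
    # (a list of rows) and flip it horizontally with probability 0.5.
    h, w = len(img), len(img[0])
    top = rng.randrange(h - size + 1)
    left = rng.randrange(w - size + 1)
    crop = [row[left:left + size] for row in img[top:top + size]]
    if rng.random() < 0.5:
        crop = [row[::-1] for row in crop]
    return crop
```

At training time the augmentation multiplies the effective dataset size cheaply, which is a big part of how a 60M-parameter model avoided catastrophic overfitting on 1.2M images.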
Why it matters
AlexNet is the paper where deep learning went from “academic curiosity” to “obvious answer for computer vision”. Every lab pivoted within months. ReLU, dropout, and GPU training became universal, and ImageNet became the benchmark that launched a decade of progress. Without AlexNet, the rest of the story doesn’t happen.
Read next
- VGG (Simonyan & Zisserman, 2014) — simpler, deeper, trained-from-scratch backbone.
- GoogLeNet / Inception (Szegedy et al., 2014) — efficient multi-branch blocks.
- ResNet (He et al., 2015) — skip connections that finally let networks go truly deep.