We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, considerably better than the previous state of the art. The network contains 60 million parameters and 650,000 neurons, and consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers, we employed a newly developed regularization method called “dropout” that proved to be very effective, and we enlarged the training set with a new data augmentation scheme. The network’s depth is essential to its high performance; although that depth makes training computationally expensive, fast GPUs make it feasible to train networks of this size and to reach state-of-the-art results on challenging image classification tasks.
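The abstract fixes the overall shape of the network: five convolutional layers, some followed by max-pooling, then three fully-connected layers feeding a 1000-way softmax, with non-saturating (ReLU) units throughout and dropout in the fully-connected layers. The sketch below, in PyTorch, is a single-stream approximation of that description; the original model splits most layers across two GPUs via grouped connectivity, and the kernel counts, padding choices, and input size here follow the paper body rather than the abstract, so treat them as assumptions rather than a faithful reimplementation.

import torch
import torch.nn as nn

class AlexNetSketch(nn.Module):
    """Single-stream sketch of the paper's architecture (assumed details).

    Expects a 3x227x227 input; the paper reports 224x224, but 227
    makes the stride-4 arithmetic of the first layer come out exact.
    """

    def __init__(self, num_classes: int = 1000) -> None:
        super().__init__()
        self.features = nn.Sequential(
            # Five convolutional layers, non-saturating ReLU after each.
            nn.Conv2d(3, 96, kernel_size=11, stride=4),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),   # overlapping max-pooling
            nn.Conv2d(96, 256, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            # Three fully-connected layers; dropout regularizes the first two.
            nn.Dropout(p=0.5),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),  # logits for the 1000-way softmax
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

model = AlexNetSketch()
logits = model(torch.randn(1, 3, 227, 227))  # logits.shape == (1, 1000)

Counting the parameters of this sketch lands near the 60 million quoted above, with most of them in the first fully-connected layer, which is why dropout is applied there.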
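The data augmentation scheme is only named here; in the paper body, the main label-preserving transform extracts random 224x224 patches, and their horizontal reflections, from the 256x256 training images. Below is a minimal NumPy sketch of that idea; the function name and RNG handling are illustrative, not from the paper.

import numpy as np

rng = np.random.default_rng(0)

def random_crop_and_flip(image: np.ndarray, crop: int = 224) -> np.ndarray:
    """Sample one training view: a random crop plus an optional
    horizontal reflection, in the spirit of the paper's
    label-preserving augmentation."""
    h, w, _ = image.shape
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    patch = image[top:top + crop, left:left + crop]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]  # reflect along the width axis
    return patch

# Example: a stand-in 256x256 RGB image yields a 224x224 training view.
view = random_crop_and_flip(rng.random((256, 256, 3), dtype=np.float32))
assert view.shape == (224, 224, 3)

Because each 256x256 image yields many distinct crop-and-flip views, this multiplies the effective training set size at negligible storage cost, which is the point of the scheme.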