Transfer Learning with Deep Convolutional Networks


Abstract

Image classification has been widely studied because people want to automate repetitive tasks and perform them more accurately. Thanks to that body of work, we can now use pre-trained models to classify different data sets, leveraging previous achievements to build new models that train faster and reach better results than models created from scratch. This work builds a new model from a well-known Convolutional Neural Network, VGG-16 [2], to classify images from the CIFAR-10 data set [1].

Introduction

The main objective when taking a pre-trained model is to extract and reuse the features learned by the base model, and to add new layers that can be specific to the new problem. To that end, it is necessary to understand the base model well enough to know which layers can contribute to classifying the new images, and how many new layers are needed and how large they must be to learn the new features. What follows is a series of trials with the VGG-16 [4] network aimed at improving transfer learning for classifying CIFAR-10 images.
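One way to get familiar with the base model is to list its layers and their sizes. The sketch below assumes TensorFlow's Keras API; the helper names build_base and describe_layers are hypothetical, and the 32x32x3 input shape is an assumption chosen to match CIFAR-10 images. Passing weights="imagenet" downloads the pre-trained weights on first use.

```python
from tensorflow.keras.applications import VGG16


def build_base(weights="imagenet", input_shape=(32, 32, 3)):
    """Load VGG-16 without its ImageNet-specific dense classifier."""
    # include_top=False keeps only the five convolutional blocks.
    return VGG16(weights=weights, include_top=False, input_shape=input_shape)


def describe_layers(model):
    """Return (layer name, parameter count) pairs, from input to output."""
    return [(layer.name, layer.count_params()) for layer in model.layers]
```

Calling describe_layers(build_base()) returns names such as block1_conv1 through block5_pool, which makes it easier to decide where to cut or freeze the network.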

Materials and Methods

The following four techniques for transfer learning are commonly used to achieve good results:

  • The second technique is to fine-tune the top layers of the source CNN and freeze the bottom layers, assuming the bottom layers are very generic and can be used for any kind of image dataset [3].
  • The third technique is to remove the last fully connected layers, add a new layer suited to the target dataset, and fine-tune the entire network’s weights using a very small learning rate to avoid losing the source weights.
  • The last technique is to use the CNN’s original architecture without importing weights, that is, to initialize the weights from scratch. The point of this technique is to reuse a well-known architecture that has performed well on large datasets.
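The techniques listed above differ mainly in which layers are trainable and in the learning rate. The following is a minimal sketch in Keras, not the exact setup used in the trials: the function name build_transfer_model, the choice of block5 as the "top" layers, the 256-unit dense head, and the learning rates are all assumptions made for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 10  # CIFAR-10 has ten classes


def build_transfer_model(technique, weights="imagenet"):
    """Build a VGG-16 based classifier for 32x32 RGB images.

    technique: "freeze_bottom", "fine_tune_all" or "from_scratch".
    """
    if technique == "from_scratch":
        weights = None  # same architecture, randomly initialized weights

    base = VGG16(weights=weights, include_top=False, input_shape=(32, 32, 3))

    if technique == "freeze_bottom":
        # Assumption: only block5 counts as "top"; everything below is frozen.
        for layer in base.layers:
            layer.trainable = layer.name.startswith("block5")
        learning_rate = 1e-4
    elif technique == "fine_tune_all":
        base.trainable = True
        learning_rate = 1e-5  # very small, to avoid wiping out source weights
    elif technique == "from_scratch":
        learning_rate = 1e-3
    else:
        raise ValueError(f"unknown technique: {technique!r}")

    # New head replacing the original fully connected layers (sizes assumed).
    x = layers.Flatten()(base.output)
    x = layers.Dense(256, activation="relu")(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = keras.Model(base.input, outputs)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

For example, build_transfer_model("freeze_bottom") returns a compiled model in which only block5 and the new head are updated during training.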

Results

Trials were run in Kaggle Notebooks to take advantage of free GPU access, which accelerates training, gives quicker results, and allows faster iteration on the new network.

Loss and Accuracy from VGG-16 with all layers
loss: 0.2242 - accuracy: 0.9223 - val_loss: 0.5086 - val_accuracy: 0.8456
Loss and Accuracy from first three blocks of VGG-16
loss: 0.0025 - accuracy: 0.9993 - val_loss: 0.6435 - val_accuracy: 0.8923
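To compare the two runs side by side, the logged metrics can be parsed and the gap between training and validation accuracy computed. This is a small illustrative sketch; the helper name parse_keras_log and the run labels are hypothetical, and the numbers are copied from the log lines above.

```python
import re


def parse_keras_log(line):
    """Turn 'loss: 0.2242 - accuracy: 0.9223 ...' into a dict of floats."""
    return {key: float(value) for key, value in re.findall(r"(\w+): ([0-9.]+)", line)}


runs = {
    "all layers": "loss: 0.2242 - accuracy: 0.9223 - val_loss: 0.5086 - val_accuracy: 0.8456",
    "first three blocks": "loss: 0.0025 - accuracy: 0.9993 - val_loss: 0.6435 - val_accuracy: 0.8923",
}

for name, line in runs.items():
    metrics = parse_keras_log(line)
    gap = metrics["accuracy"] - metrics["val_accuracy"]
    print(f"{name}: val_accuracy={metrics['val_accuracy']:.4f}, train/val gap={gap:.4f}")
# all layers: val_accuracy=0.8456, train/val gap=0.0767
# first three blocks: val_accuracy=0.8923, train/val gap=0.1070
```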

Discussion

The trials show that transfer learning requires exploring different configurations of the base model, because its features can be highly specific, especially in the last layers. Those layers have been shown to encode features tied to the original task, which can lead to poor performance on new sets of images.
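One way to drop the most task-specific layers is to cut the base model after an earlier block and attach a new head there. The sketch below assumes Keras and cuts VGG-16 at its block3_pool layer; the function name build_truncated_vgg16, the global average pooling, and the 256-unit dense layer are assumptions for illustration, not necessarily the head used in the trials.

```python
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG16


def build_truncated_vgg16(last_layer="block3_pool", weights="imagenet"):
    """Keep VGG-16 up to `last_layer` and add a new 10-class classifier."""
    base = VGG16(weights=weights, include_top=False, input_shape=(32, 32, 3))

    # Output of the chosen block; for block3_pool this is (None, 4, 4, 256).
    features = base.get_layer(last_layer).output

    # New head (sizes are assumptions).
    x = layers.GlobalAveragePooling2D()(features)
    x = layers.Dense(256, activation="relu")(x)
    outputs = layers.Dense(10, activation="softmax")(x)

    # Only layers on the path from input to outputs end up in the model,
    # so block4 and block5 are left out.
    return keras.Model(base.input, outputs)
```

Changing last_layer to "block4_pool" or "block5_pool" makes it possible to compare how much of the base model is worth keeping for the new images.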

Literature cited:

1.

Software and Chemical Engineer