Transfer Learning with Deep Convolutional Networks


Abstract

Introduction

Materials and Methods

  • The first technique is to freeze the source CNN’s weights, remove its original fully connected layers, and add a new fully connected layer, so that the frozen convolutional base is used purely for feature extraction.
  • The second technique is to fine-tune the top layers of the source CNN while freezing the bottom layers, on the assumption that the bottom layers learn very generic features that transfer to almost any image dataset [3].
  • The third technique is to fine-tune the entire network’s weights with a very small learning rate, to avoid destroying the source weights, again removing the last fully connected layers and adding a new layer suited to the target dataset.
  • The last technique is to use the CNN’s original architecture without importing any weights, i.e., to initialize the weights from scratch. The point of this technique is to reuse a well-known architecture that has already performed well on large datasets. A minimal sketch of these four options follows this list.
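As a rough sketch (not the exact code used in these experiments), the four options translate into different freezing configurations in Keras; the choice of VGG-16 as the base, the layer names, and the 10-class head are illustrative assumptions:

```python
# Minimal sketch of the four transfer-learning options, assuming VGG-16 as the
# source CNN; the layer names and the 10-class head are illustrative.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

def build_model(technique, num_classes=10):
    # Techniques 1-3 import the ImageNet weights; technique 4 starts from scratch.
    weights = None if technique == 4 else "imagenet"
    base = VGG16(weights=weights, include_top=False, input_shape=(32, 32, 3))

    if technique == 1:
        base.trainable = False                      # freeze everything: pure feature extraction
    elif technique == 2:
        for layer in base.layers:
            layer.trainable = layer.name.startswith("block5")  # fine-tune only the top block
    else:
        base.trainable = True                       # techniques 3 and 4: all weights trainable

    # New head replacing the original fully connected layers
    return models.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(num_classes, activation="softmax"),
    ])
```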

Another useful technique to speed up training is batch normalization, which normalizes the features so that they have a mean close to zero; in Keras this can be done by adding a BatchNormalization layer after the base model [6].
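For example, a minimal Keras sketch of this idea, assuming a frozen VGG-16 base and an illustrative dense head, could look like this:

```python
# Minimal sketch of adding a BatchNormalization layer right after the base model,
# as described above; the frozen VGG-16 base and the dense head are assumptions.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(32, 32, 3))
base.trainable = False

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.BatchNormalization(),              # normalizes the extracted features around zero mean
    layers.Dense(10, activation="softmax"),
])
```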

Results

The first experiments followed the first technique, using all 5 blocks of the original VGG-16 model and only replacing the last dense layer, which produced the following results:

Figure: Loss and accuracy from VGG-16 with all layers (blue: training, red: validation)

The final numbers for this training were:

loss: 0.2242 - accuracy: 0.9223 - val_loss: 0.5086 - val_accuracy: 0.8456

As we can see, the behavior of the model on the validation set needs to be improved: the validation loss grows over the epochs, and the validation accuracy ends well below the training accuracy.
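For reference, a hedged reconstruction of this first setup (all five VGG-16 blocks frozen, only a new dense layer trained) could look as follows; the optimizer, batch size, and epoch count are assumptions, not the values used in the original run:

```python
# Hedged reconstruction of the first experiment: all five VGG-16 blocks frozen,
# only a new dense layer trained on CIFAR-10. Optimizer, batch size and epoch
# count are assumptions, not values taken from the original run.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixels to [0, 1]

base = VGG16(weights="imagenet", include_top=False, input_shape=(32, 32, 3))
base.trainable = False                               # keep all five convolutional blocks fixed

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),          # new final dense layer for the 10 classes
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

history = model.fit(x_train, y_train, epochs=10, batch_size=64,
                    validation_data=(x_test, y_test))
```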

After that, following the second technique, the last two blocks of the base model were cut off and the same final layers from the previous trial were added, giving the following results:

Figure: Loss and accuracy from the first three blocks of VGG-16 (blue: training, red: validation)

with the following final results:

loss: 0.0025 - accuracy: 0.9993 - val_loss: 0.6435 - val_accuracy: 0.8923

Details of this training can be seen in the following notebook:

https://www.kaggle.com/rodrigosv/best-cifar10
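One possible reading of this second setup, sketched below, truncates VGG-16 after its third block and fine-tunes the remaining layers with a small learning rate; the exact cut point and learning rate are assumptions, and the linked notebook is the authoritative source:

```python
# Hedged sketch of the second experiment: the last two convolutional blocks of
# VGG-16 removed and the remaining layers fine-tuned with the same dense head.
# The cut point (block3_pool) and the small learning rate are assumptions.
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG16

full_vgg = VGG16(weights="imagenet", include_top=False, input_shape=(32, 32, 3))

# Keep only the first three blocks (everything up to block3_pool)
truncated = models.Model(inputs=full_vgg.input,
                         outputs=full_vgg.get_layer("block3_pool").output)
truncated.trainable = True                           # fine-tune the remaining layers

model = models.Sequential([
    truncated,
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])

model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),  # small LR to preserve source weights
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```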

Discussion

Although the second trial ended with better numbers, the behavior of the validation loss is healthier in the first: it stays closer to the training loss, whereas in the second the validation loss increases steadily after the 10th epoch, which is not desirable.

Although the last model improves on the first, there is still work to do: 89% validation accuracy is not the best result for transfer learning with VGG-16 on the CIFAR-10 dataset, and many notebooks on Kaggle and other platforms report validation accuracy above 95%. However, this can serve as a first iteration for other transfer learning tasks.

Literature cited:

2.

3.

4.

5.

6.
