Introduction to Transfer Learning

Classifying Photographs vs. Drawings

Here we show how to develop a deep learning algorithm to classify images given any small, labeled dataset. To demonstrate the procedure, we train a classifier to identify whether an input image is a photograph of a real object or a painting/drawing. We use a technique called transfer learning to do this.

Daniel George

Overview

Transfer Learning

Humans can learn to identify something after seeing only a few examples, because we draw on our existing knowledge of the world to solve new problems. This ability to transfer knowledge learned for one task in one domain to another task in a different domain is called transfer learning.

Typically, machine learning algorithms do not generalize well to situations very different from the data used for training. Furthermore, they usually require large amounts of labeled data to achieve good performance when trained from scratch.

However, convolutional neural nets trained on images learn to detect simple features such as edges, corners, color blobs, etc., in their early layers. These features are useful for all kinds of image processing tasks. Therefore, we can reuse nets already trained on a large dataset of images such as ImageNet, and fine-tune them to build a classifier for a new problem using only a few labeled examples.
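As a rough illustration of this idea, the sketch below reuses a pretrained ImageNet net as a feature extractor and attaches a new two-class output. It is not necessarily the architecture used later in this notebook: the choice of ResNet-50, the number of layers removed, and the layer names "features", "classifier" and "probabilities" are all assumptions made for the example.

```wolfram
(* Assumption: ResNet-50 from the Wolfram Neural Net Repository stands in for
   "a net already trained on ImageNet"; it is downloaded on first use. *)
pretrained = NetModel["ResNet-50 Trained on ImageNet Competition Data"];

(* Keep everything except the last two layers, assumed here to be the
   1000-class linear layer and its softmax. *)
featureExtractor = NetTake[pretrained, {1, -3}];

(* Attach a fresh linear layer and softmax for the two new classes.
   The layer names are illustrative. *)
newNet = NetChain[
   <|"features" -> featureExtractor,
     "classifier" -> LinearLayer[],
     "probabilities" -> SoftmaxLayer[]|>,
   "Output" -> NetDecoder[{"Class", {"Drawing", "Photo"}}]];
```

One way to train only the new layer is to pass LearningRateMultipliers -> {"classifier" -> 1, _ -> 0} to NetTrain, which keeps the pretrained weights fixed; letting the earlier layers update as well is the other common design choice.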

Data

A dataset of photographs and drawings can be easily obtained by searching the web (e.g. Google image search) or using the function WebImageSearch. Here we have already scraped together some photographs and drawings; these are shown below.
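For readers who want to collect their own images, a minimal sketch using WebImageSearch might look like the following. The helper names labelImages and searchLabeled are hypothetical, and WebImageSearch needs a configured search service and may consume service credits, so treat this as an outline rather than the procedure used to build the dataset here.

```wolfram
(* Hypothetical helper: keep only valid images, resize each to 224x224,
   and pair every image with the given label. *)
labelImages[images_List, label_String] :=
  Thread[Map[ImageResize[#, {224, 224}] &, Select[images, ImageQ]] -> label];

(* Hypothetical helper: run a web image search for up to n results and
   label whatever comes back. *)
searchLabeled[query_String, label_String, n_Integer] :=
  labelImages[WebImageSearch[query, "Images", MaxItems -> n], label];
```

For example, searchLabeled["pencil drawing", "Drawing", 150] would return a list of rules of the form image -> "Drawing"; the query string is only a placeholder. Resizing to {224, 224} does not preserve the aspect ratio, so cropping first may be preferable for strongly non-square images.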

Photos

Get 150 images of photographs by partitioning the image here, and label each of them “Photo”.

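(* photoCollage is a placeholder name for the image of photographs embedded at this point in the original notebook *)
photos=Flatten@ImagePartition[photoCollage,{224,224}]->"Photo"//Thread;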

Drawings

Get 150 images of drawings from the image here, and label them “Drawing”.

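(* drawingCollage is a placeholder name for the image of drawings embedded at this point in the original notebook *)
drawings=Flatten@ImagePartition[drawingCollage,{224,224}]->"Drawing"//Thread;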

Splitting into Training and Testing Data

Randomly shuffle and then split the dataset into 200 images for training and 100 images for testing.

{trainingData,testData}=TakeDrop[RandomSample@Join[drawings,photos],200];

Having a separate test set is essential, since it lets us check that the algorithm is not simply memorizing the images shown to it during training, but actually generalizes to unseen images.
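The split can be sanity-checked with a few one-liners. The sketch below uses stand-in data (integers in place of images, and hypothetical standIn names) so that it runs without the image collages; the same checks apply to the real training and test sets.

```wolfram
(* Stand-in data: integers play the role of images. *)
standInDrawings = Thread[Range[150] -> "Drawing"];
standInPhotos = Thread[Range[151, 300] -> "Photo"];

(* SeedRandom makes the shuffle repeatable; the seed value is arbitrary. *)
SeedRandom[1234];
{standInTraining, standInTest} =
  TakeDrop[RandomSample@Join[standInDrawings, standInPhotos], 200];

(* Sizes of the two sets: {200, 100}. *)
Length /@ {standInTraining, standInTest}

(* Class counts in each set; a random split is only roughly balanced. *)
Counts[Values[standInTraining]]
Counts[Values[standInTest]]

(* True when no example appears in both sets. *)
Intersection[Keys[standInTraining], Keys[standInTest]] === {}
```

If the class counts come out noticeably uneven, one option is to shuffle and split each class separately before joining, at the cost of a slightly longer expression.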

Designing Nets

Training

Testing

Comparisons

Conclusion

FURTHER EXPLORATIONS

Transfer Learning

AUTHORSHIP

Daniel George