Handwritten digit recognition via neural networks
1. Initialization
When you run the first command in the next section, you will be asked whether you would like to initialize this notebook. Answer YES. This will take a few seconds, but when finished, you will have access to two sets of digits data extracted from the MNIST data set: tData and vData. We will use the first for training and the second for validation.
2. Getting to know the data
Let’s see what the MNIST data set looks like: run the following command as many times as you’d like. Each input is a hand-written digit, or really a 28x28 array of numbers between 0 and 1, and each output is the actual digit it represents.
In[]:=
RandomSample[tData,50]
3. Modifying your data
In our experiment, we will pick one digit and train a neural network to distinguish it from the others. The inputs will still be images of hand-written digits, but the outputs should simply indicate whether or not an image represents our digit.
​​
To do this, we assign the value 1 to images of our digit and the value 0 to images of all the other digits. Pick the digit you would like to study and run the following commands to modify our training and validation data.
In[]:=
ourDigit=7;
tData2=pickOurDigit[ourDigit,tData];
vData2=pickOurDigit[ourDigit,vData];
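For intuition about what this relabeling step does, here is a minimal Python sketch of a helper like pickOurDigit. The function name, and the assumption that the data is a list of (image, label) pairs mirroring tData/vData, are illustrative; the notebook's actual helper may differ.

```python
def pick_our_digit(our_digit, data):
    """Relabel (image, digit) pairs: 1 for our digit, 0 for every other digit.

    `data` is assumed to be a list of (image, label) pairs, mirroring the
    notebook's tData/vData; the real pickOurDigit helper may work differently.
    """
    return [(image, 1 if label == our_digit else 0) for image, label in data]

# Toy string labels standing in for 28x28 images:
toy_data = [("img_a", 7), ("img_b", 3), ("img_c", 7), ("img_d", 0)]
print(pick_our_digit(7, toy_data))
# [('img_a', 1), ('img_b', 0), ('img_c', 1), ('img_d', 0)]
```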
​
This creates the modified data sets tData2 and vData2. Let’s check whether the input-output pairs for our new data make sense for what we are trying to do.
In[]:=
RandomSample[tData2,50]
​
We are not done yet. Right now, we have many more instances of ‘other digits’ than ‘our digit’. This class imbalance may skew our model. Let’s equalize their numbers and use 5000 examples of each for our training data:
In[]:=
numOurDigit=5000;
numOtherDigits=5000;
tData3=adjustBalance[tData2,ourDigit,numOurDigit,numOtherDigits];
​
Next, do the same for the validation data but use 500 examples for each class:
In[]:=
numOurDigit=500;
numOtherDigits=500;
vData3=adjustBalance[vData2,ourDigit,numOurDigit,numOtherDigits];
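The balancing step can also be sketched in Python. This mirrors the spirit of a helper like adjustBalance — downsampling each class to a requested count — though the notebook's actual implementation is not shown and may differ.

```python
import random

def adjust_balance(data, num_ours, num_others, seed=0):
    """Downsample relabeled (image, 0/1) pairs to the requested class counts.

    Assumed behavior, mirroring the notebook's adjustBalance helper;
    the real sampling details may differ.
    """
    rng = random.Random(seed)
    ours = [pair for pair in data if pair[1] == 1]
    others = [pair for pair in data if pair[1] == 0]
    return rng.sample(ours, num_ours) + rng.sample(others, num_others)

# Toy imbalanced data: 5 examples of our digit, 10 of the others.
toy = [(f"ours{i}", 1) for i in range(5)] + [(f"other{i}", 0) for i in range(10)]
balanced = adjust_balance(toy, 3, 3)
print(len(balanced))  # 6: three examples of each class
```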
​
This creates the modified data sets tData3 and vData3, which we will use to train our digit-recognition model.
4. Build and train a neural net
We can build a neural network in hopes that it can recognize our digit among the other digits. Given an image of a handwritten digit, the neural network will attempt to fit our training data: predicting a 1 each time it recognizes our digit and a 0 for the other digits. The fit will not be perfect; in practice, the neural network will return a number between 0 and 1, and we will round the results to make predictions.

Now for the fun part: designing the neural network. We have to choose how many hidden layers to use, how wide each one should be, and what activation function to use in each layer. You may also want to change the implementation of gradient descent, currently set to ADAM. Change the orange text in the command below to the values you would like to try.

Possible choices of activation function: Ramp, Tanh, LogisticSigmoid
Possible choices of optimizer method: ADAM, SGD

To begin training, run the following cell. It builds a neural network with two hidden layers of 10 neurons each using the ReLU activation function (here called Ramp), and one output neuron.
In[]:=
net=NetChain[{10,Ramp,10,Ramp,1},"Input"->NetEncoder[{"Image",{28,28},"Grayscale"}],"Output"->"Scalar"];
net=NetTrain[net,tData3,All,Method->"ADAM",ValidationSet->vData3]
trainedNet=net["TrainedNet"]
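To see what this {10, Ramp, 10, Ramp, 1} architecture computes, here is the forward pass sketched in plain NumPy. The weights below are random stand-ins, not the trained network; this is only an illustration of the shape of the computation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-in weights for a 784 -> 10 -> 10 -> 1 network (untrained).
W1, b1 = rng.normal(size=(10, 784)) * 0.05, np.zeros(10)
W2, b2 = rng.normal(size=(10, 10)) * 0.05, np.zeros(10)
W3, b3 = rng.normal(size=(1, 10)) * 0.05, np.zeros(1)

def ramp(z):
    """ReLU, called Ramp in the Wolfram Language."""
    return np.maximum(z, 0.0)

def forward(image):
    """image: 28x28 array of grayscale values in [0, 1]."""
    x = image.reshape(784)            # flatten, like the "Image" NetEncoder
    h1 = ramp(W1 @ x + b1)            # first hidden layer, 10 neurons
    h2 = ramp(W2 @ h1 + b2)           # second hidden layer, 10 neurons
    return float((W3 @ h2 + b3)[0])   # single scalar output

score = forward(rng.random((28, 28)))
prediction = round(score)  # after training, round to get a 0/1 prediction
```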
​
The following command reports your final accuracy, shows some examples of the output of your trained neural network, and shows some examples where the network makes an incorrect prediction:
​
In[]:=
findAccuracy[vData3,trainedNet]
5. Caveats
Here are a few things to keep in mind:​
​​
- Neural networks rarely output nice integers such as 0 or 1; the usual output is a real number. So how should we interpret an output of 0.9843 or 0.1439 when trying to gauge the accuracy of our model? We round!
​​
- For a problem like this one, where the outputs in the input-output pairs are either zeros or ones, the choice of activation function on the output neuron itself is interesting. If there is none, any real number is a possible output. If it is a ReLU, the output can never be negative, but can be any positive number. If it is a sigmoid, then the network can only output numbers between 0 and 1.
​​
- When training the neural network, we aim to minimize the mean square error: the average of the squared differences between f(x_i), the value predicted by the network, and y_i, the correct output value from the data set. But when we evaluate how well a neural network performs the task at hand, we use the error rate: the percentage of inputs for which the network predicts the incorrect answer. Note that these are different, but related, quantities.
​
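The caveats above can be made concrete with a small Python sketch. The predictions here are made up for illustration; they are not outputs of the lab's network.

```python
# Hypothetical network outputs and the correct 0/1 labels.
preds  = [0.9843, 0.1439, 0.62, 0.05]
labels = [1,      0,      0,    0]

# Rounding turns real-valued outputs into 0/1 predictions.
rounded = [round(p) for p in preds]  # [1, 0, 1, 0]

# Mean square error: the quantity training minimizes.
mse = sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(preds)

# Error rate: the quantity we report, i.e. the fraction of wrong predictions.
error_rate = sum(r != y for r, y in zip(rounded, labels)) / len(labels)

print(rounded, mse, error_rate)  # error rate is 0.25: one wrong out of four
```

Note that the third prediction (0.62) rounds to 1 even though its label is 0, so a model can have a small mean square error overall while still making classification mistakes near the 0.5 threshold.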
6. Questions
Now use the above commands to address the questions from the lab. Feel free to explore.​
​​
​