Artificial Neuron

The artificial neuron is the basic building block of feedforward, convolutional and other popular neural networks, and forms the basis of most of modern deep learning.
June 23, 2017—Rohan Saxena

Modeling the Artificial Neuron

Historically, the artificial neuron was inspired by the biological neuron. Let’s try to visualize that.
Pull up a couple of images of a biological neuron.
bioNeuron = WebImageSearch["Neuron", "Thumbnails", MaxItems -> 2]


We can see the main parts of this structure seem to be the dendrite, axon and axon terminal.
Now let’s pull up a couple of images of an artificial neuron.
artificialNeuron = WebImageSearch["Artificial neuron", "Thumbnails", MaxItems -> 2]


The first image of a biological neuron and the second image of an artificial neuron look similar. So let’s compare them.
{bioNeuron[[1]], artificialNeuron[[2]]}
So the association seems to be:
• Dendrite → Input
• Axon → Net Input
• Axon terminal → Activation (Output)
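Before diving into each part, the whole mapping can be sketched end to end as a tiny function. This is an illustrative Python sketch, not part of the original notebook; the sigmoid activation used here is one common choice, introduced later in this piece.

```python
import math

def neuron(x, w):
    # Dendrite: collect the inputs x.
    # Axon: combine them with the weights w into a net input.
    net = sum(xi * wi for xi, wi in zip(x, w))
    # Axon terminal: squash the net input with a sigmoid activation.
    return 1 / (1 + math.exp(-net))

print(neuron([1.0, 0.5], [0.2, -0.4]))  # prints 0.5 (net input is 0)
```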

Nuts and Bolts of the Neuron

Now let’s try to visualize what happens at each of the three parts identified above.

Dendrite

The dendrites collect the inputs, which the neuron combines with its weights via a vector (dot) product. Let’s visualize vectors and their products.
x=RandomInteger[10,{5}]
{3,6,9,10,3}
w=RandomInteger[10,{5}]
{8,7,9,7,0}
x.w
217
Let’s visualize how the values of the individual vectors have changed after multiplication.
GraphicsRow[{MatrixPlot[{x}, PlotLabel -> "x"], MatrixPlot[{w}, PlotLabel -> "w"], MatrixPlot[{{x.w}}, PlotLabel -> "x.w"]}]
In the analogy with the biological neuron, the value of x.w tells us how intense the inputs collected from the dendrites are, after being weighted by the vector w, once they have entered the neuron. The higher the value, the more intense the input.
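For readers working outside the Wolfram Language, the same weighted-input computation can be sketched in plain Python (an illustrative equivalent, not part of the original notebook; the vectors are the ones shown above):

```python
# A neuron's weighted input is the dot product of inputs x and weights w.
x = [3, 6, 9, 10, 3]
w = [8, 7, 9, 7, 0]

def dot(a, b):
    """Sum of elementwise products: the signal intensity entering the neuron."""
    return sum(ai * bi for ai, bi in zip(a, b))

print(dot(x, w))  # prints 217
```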

Axon

Looking back at the picture of the artificial neuron, we see that this part sums up a lot of vector products.
x1 = RandomInteger[10, {5}]
w1 = RandomInteger[10, {5}]
x2 = RandomInteger[10, {5}]
w2 = RandomInteger[10, {5}]
{10,10,6,1,8}
{2,1,10,1,7}
{0,9,3,5,3}
{2,8,8,0,3}
net=x1.w1+x2.w2
252
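The same summation of vector products can be sketched in Python (an illustrative equivalent of the net computation above, using the vectors shown):

```python
# Net input: the sum of several input-weight dot products,
# matching net = x1.w1 + x2.w2 in the Wolfram Language code above.
x1, w1 = [10, 10, 6, 1, 8], [2, 1, 10, 1, 7]
x2, w2 = [0, 9, 3, 5, 3], [2, 8, 8, 0, 3]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

net = dot(x1, w1) + dot(x2, w2)
print(net)  # prints 252
```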
ImageCollage[{MatrixPlot[{x1}, PlotLabel -> "x1"], MatrixPlot[{x2}, PlotLabel -> "x2"], MatrixPlot[{w1}, PlotLabel -> "w1"], MatrixPlot[{w2}, PlotLabel -> "w2"], MatrixPlot[{{net}}, PlotLabel -> "net"]}]
The value of the net input tells us how intense the combined signal is that the neuron will pass on, through its activation, to the subsequent neurons in the network.

Axon Terminal

This part applies something called an activation function. An activation function introduces nonlinearity into the neuron’s output. Let’s look at some activation functions.
Plot the sigmoid function.
Plot[LogisticSigmoid[x],{x,-10,10}]
Plot the tanh function.
Plot[Tanh[x],{x,-10,10}]
The tanh function is steeper than the sigmoid near the origin, and hence is closer to the discontinuous function plotted below.
Plot the unit step function, for which both of the above are differentiable approximations.
Plot[UnitStep[x],{x,-10,10}]
The step function is useful for binary classification (0 being one class, 1 being the other), but it is discontinuous; we use sigmoid and tanh instead because they are differentiable and allow us to use our good friend calculus.
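As a quick numeric check (an illustrative Python sketch, not from the original notebook), the sigmoid does converge to the 0/1 step values away from the origin while staying smooth everywhere:

```python
import math

def sigmoid(z):
    # Smooth, differentiable squashing function with outputs in (0, 1)
    return 1 / (1 + math.exp(-z))

def step(z):
    # Discontinuous 0/1 threshold that sigmoid approximates
    return 1 if z >= 0 else 0

for z in (-10, -1, 1, 10):
    print(z, round(sigmoid(z), 4), step(z))
```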
We’ll look at some other popular activation functions.
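One such function worth knowing is the rectified linear unit (ReLU), now a default choice in many deep networks. A minimal Python sketch (illustrative, not from the original notebook):

```python
def relu(z):
    # ReLU: passes positive values through unchanged, clips negatives to zero
    return max(0.0, z)

print([relu(z) for z in (-2.0, -0.5, 0.0, 1.5)])  # prints [0.0, 0.0, 0.0, 1.5]
```

In the Wolfram Language the same function is available as Ramp, e.g. Plot[Ramp[x], {x, -10, 10}].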

Applications

FURTHER EXPLORATIONS
Backpropagation
Training a neural network
Recurrent neural network, long short-term memory (LSTM)

Authorship Information

Rohan Saxena
6/23/17