This blog post takes a close look at the first fundamental concept of a neural network that I introduced briefly in my previous post: layers. Layers are the building blocks of a neural network and contain the network’s knowledge. Before we can dig into layers, we first need to understand what a neural network looks like.
What is a Neural Network?
Let’s have a look at a practical example using the Boston Housing data set, sourced from the UCI Machine Learning Repository. This is a regression problem in which we want to predict the median house price in Boston. The code below loads the data set, fits a neural network to the data and visualizes the network.
# load required packages
library(mlbench)    # provides the BostonHousing data set
library(neuralnet)

# load data
data(BostonHousing)

# fit the neural network
set.seed(2123)
NN <- neuralnet(medv ~ crim + age + tax,
                BostonHousing,
                hidden = c(3, 2),
                linear.output = TRUE)

# plot neural network
plot(NN)
In the neural network above, the variables crim, age and tax are the input variables, or features, selected to explain the output variable medv (median value of owner-occupied homes in $1000’s). Each circle in the figure is called a neuron. The neurons in the first column are called input neurons and make up the input layer. The neurons in the last column are output neurons and make up the output layer. The neurons between the input and output columns are hidden neurons and make up the hidden layers.
In summary, this neural network has one input layer, two hidden layers (with three neurons in the first hidden layer and two neurons in the second) and one output layer.
Input and Output Values
From the neural network figure above one can see that there is a value associated with each connection between neurons. These values are called weights, or trainable parameters. To explain how these parameters are determined I will use a spreadsheet. Jeremy Howard’s fastai neural network course does a great job of explaining neural networks in a spreadsheet. I have shamelessly borrowed some ideas from his course to help me understand and explain the core concepts.
If we translate the above neural network into MS Excel, it looks something like this:
The above spreadsheet with the calculations can be downloaded here:
Note: the neurons highlighted in blue in the neural network plot are the bias weights and are not included in the spreadsheet explanation. These will be covered in a future blog post.
The values highlighted in yellow on the spreadsheet are the weights. These are randomly initialized in the network, ideally to small values between -1 and 1. To get the values of neurons h1, h2 and h3 we perform the following operations:
h1 = (i1*i1h1)+(i2*i2h1) + (i3*i3h1)
h2 = (i1*i1h2)+(i2*i2h2) + (i3*i3h2)
h3 = (i1*i1h3)+(i2*i2h3) + (i3*i3h3)
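The three formulas above can be traced with a small R sketch. The input values are made up, and the weights are drawn at random between -1 and 1 as described earlier; none of these numbers come from the fitted network, they only illustrate the arithmetic.

```r
# made-up input values, purely to trace the formulas above
i1 <- 0.5; i2 <- 0.2; i3 <- 0.9

# randomly initialized weights between -1 and 1 (illustrative names)
set.seed(2123)
w <- runif(9, min = -1, max = 1)
i1h1 <- w[1]; i2h1 <- w[2]; i3h1 <- w[3]
i1h2 <- w[4]; i2h2 <- w[5]; i3h2 <- w[6]
i1h3 <- w[7]; i2h3 <- w[8]; i3h3 <- w[9]

# weighted sums feeding the first hidden layer
h1 <- (i1 * i1h1) + (i2 * i2h1) + (i3 * i3h1)
h2 <- (i1 * i1h2) + (i2 * i2h2) + (i3 * i3h2)
h3 <- (i1 * i1h3) + (i2 * i2h3) + (i3 * i3h3)
```

Each hidden neuron is simply a weighted sum of all the input neurons feeding into it.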
To obtain the values of neurons h4 and h5 we perform the following operations:
h4 = (h1*h1h4)+(h2*h2h4) + (h3*h3h4)
h5 = (h1*h1h5)+(h2*h2h5) + (h3*h3h5)
Finally, we compute the predicted value o1:
o1 = (h4*h4o1)+(h5*h5o1)
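The whole chain of computations, from inputs to o1, can be sketched in R using matrix products. All the numbers below are made up for illustration; in the real network the weights would come from training.

```r
# made-up inputs and weights, purely to trace the computations above
i  <- c(0.5, 0.2, 0.9)                # input neurons i1, i2, i3
W1 <- matrix(c( 0.1,  0.4, -0.3,
                0.2, -0.5,  0.6,
               -0.1,  0.3,  0.2),
             nrow = 3, byrow = TRUE)  # W1[j, k]: weight from input j to hk
W2 <- matrix(c( 0.7, -0.2,
               -0.4,  0.5,
                0.3,  0.1),
             nrow = 3, byrow = TRUE)  # weights h1..h3 -> h4, h5
w3 <- c(0.6, -0.8)                    # weights h4, h5 -> o1

h123 <- as.vector(i %*% W1)           # h1, h2, h3
h45  <- as.vector(h123 %*% W2)        # h4, h5
o1   <- sum(h45 * w3)                 # the predicted value
```

Each `%*%` computes exactly the weighted sums written out neuron by neuron above, just for a whole layer at once.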
The above computations can be summarized as a chain of matrix multiplications.
This is the most basic definition of a neural network: a sequence of matrix multiplications. However, the power of neural networks lies in their ability to approximate any function. This is achieved with the help of activation functions. The next part of the series will continue with the layers foundation: I will discuss how non-linear transformations are applied to the neuron values h1, h2, h3, h4, h5 and o1.
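To see the "sequence of matrix multiplications" idea in one line, here is a sketch in R with randomly initialized weight matrices (the dimensions match our 3-3-2-1 network; the values are illustrative only):

```r
# one input row and three randomly initialized weight matrices
set.seed(2123)
x  <- matrix(c(0.5, 0.2, 0.9), nrow = 1)  # inputs i1, i2, i3
W1 <- matrix(runif(9, -1, 1), nrow = 3)   # input layer  -> hidden layer 1
W2 <- matrix(runif(6, -1, 1), nrow = 3)   # hidden 1     -> hidden 2
W3 <- matrix(runif(2, -1, 1), nrow = 2)   # hidden 2     -> output

# the entire forward pass as one chained matrix multiplication
o1 <- x %*% W1 %*% W2 %*% W3
```

Without activation functions this chain collapses to a single linear map, which is exactly why the non-linear transformations discussed next are needed.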