Neural network backprop using simple gradient descent, with matrix formulation + bias + minibatch

Intro

In the previous sample, only the input layer used a bias. The bias was handled by conveniently appending a 1 to the input vector.
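As a quick reminder, here is a minimal sketch of that trick (my own NumPy illustration, not the original sample's code): folding the bias into the weight matrix means the last column of W plays the role of the bias vector.

```python
import numpy as np

x = np.array([0.0, 1.0])       # raw 2-d input
x_aug = np.append(x, 1.0)      # append a 1 so the bias rides along
W = np.random.randn(4, 3)      # 4 hidden units, 2 inputs + 1 bias column
a = W @ x_aug                  # W's last column acts as the bias vector
```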

Now we'll see what we need to do to add a bias term to every layer.
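The key observation: for a layer z = W a_prev + b, we have dz/db = 1, so the bias gradient is just the layer's delta, summed over the minibatch. A sketch of one such layer, assuming NumPy, samples stored as columns, and a sigmoid activation (the names are illustrative, not the repo's actual API):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(W, b, a_prev):
    # b has shape (out, 1) and broadcasts across the minibatch columns
    return sigmoid(W @ a_prev + b)

def bias_grad(delta):
    # dz/db = 1, so dL/db is delta summed over the minibatch
    return delta.sum(axis=1, keepdims=True)
```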

For the impatient: this sample uses a 2x4x1 NN to learn the XOR operation.
Minibatching has also been added.
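Below is a self-contained sketch of such a setup: a 2x4x1 sigmoid network trained on XOR with minibatches of 2. This is my own NumPy reconstruction of the idea, not the repo's actual code; the learning rate, epoch count, and initialization are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR dataset, samples as columns: X is 2x4, Y is 1x4.
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]], dtype=float)
Y = np.array([[0, 1, 1, 0]], dtype=float)

# 2x4x1 network with an explicit weight matrix and bias vector per layer.
W1, b1 = rng.normal(size=(4, 2)), np.zeros((4, 1))
W2, b2 = rng.normal(size=(1, 4)), np.zeros((1, 1))

lr, epochs, batch_size = 0.5, 10000, 2  # assumed hyperparameters

for epoch in range(epochs):
    idx = rng.permutation(4)
    for start in range(0, 4, batch_size):
        cols = idx[start:start + batch_size]
        x, y = X[:, cols], Y[:, cols]
        m = x.shape[1]

        # Forward pass.
        a1 = sigmoid(W1 @ x + b1)
        a2 = sigmoid(W2 @ a1 + b2)

        # Backward pass (squared-error loss; sigmoid derivative a*(1-a)).
        d2 = (a2 - y) * a2 * (1 - a2)
        d1 = (W2.T @ d2) * a1 * (1 - a1)

        # Gradient descent step; bias gradients sum over the minibatch.
        W2 -= lr * (d2 @ a1.T) / m
        b2 -= lr * d2.sum(axis=1, keepdims=True) / m
        W1 -= lr * (d1 @ x.T) / m
        b1 -= lr * d1.sum(axis=1, keepdims=True) / m

print(sigmoid(W2 @ sigmoid(W1 @ X + b1) + b2).round(2))  # expect roughly [[0, 1, 1, 0]]
```

Note how the bias update mirrors the weight update: each is the corresponding gradient averaged over the minibatch columns.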

Contact/Questions:

<my_github_account_username>@gmail.com