Training Neural Networks
Given a network like this, think about the matrices.
There is one weight matrix for each pair of adjacent layers: one between the input layer and the first hidden layer, one between consecutive hidden layers, and one between the last hidden layer and the output layer.
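To make the matrix view concrete, here is a minimal sketch in plain Python. The layer sizes (4 inputs, two hidden layers of 5 units, 3 outputs), the ReLU activation, and the helper names are assumptions chosen for illustration and are not taken from the network in these notes. Biases are left out to keep the focus on the matrix shapes.

```python
import random

def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def relu(vector):
    """Element-wise ReLU (an assumed activation, not specified in the notes)."""
    return [max(0.0, v) for v in vector]

# Hypothetical layer sizes: input, hidden 1, hidden 2, output.
layer_sizes = [4, 5, 5, 3]
rng = random.Random(0)

# One weight matrix per pair of adjacent layers, shaped (units_out x units_in).
weights = [
    make_matrix(n_out, n_in, rng)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]

def forward(x, weights):
    """Apply each weight matrix in turn, with ReLU between layers."""
    for i, w in enumerate(weights):
        x = matvec(w, x)
        if i < len(weights) - 1:  # no activation after the output layer
            x = relu(x)
    return x

shapes = [(len(w), len(w[0])) for w in weights]
output = forward([1.0, 0.5, -0.5, 2.0], weights)
print(shapes)       # [(5, 4), (5, 5), (3, 5)]
print(len(output))  # 3
```

Each matrix has as many rows as the layer it feeds into and as many columns as the layer it reads from, so the three matrices chain together into one forward pass.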
How to Find the Best Hyperparameters
The goal of machine learning is to perform well on unseen data.
How do we estimate that? Use a holdout validation set: data kept out of training and used only to compare settings.
For instance, MNIST has 60,000 training examples and 10,000 test examples.
Try different settings, e.g.:
Setting 1: 6 layers, 6 hidden dimensions
Setting 2: 8 layers, 16 hidden dimensions
Setting N: ...
Choose the one with the highest accuracy on the validation set.
Note: You should not tune your model on the test set. Doing so overfits your choices to the test set, which should represent real-world use and therefore stay unseen until the final evaluation.
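The selection procedure can be sketched as follows. This is a minimal illustration, assuming the validation set is carved out of the 60,000 training examples. The split size of 10,000, the function names, and the accuracy values in `placeholder_accuracy` are all made up for the example and are not real results; in practice the scoring function would train a model with the given setting and measure its accuracy on the validation examples.

```python
import random

def split_train_validation(n_examples, n_val, seed=0):
    """Shuffle example indices and hold out n_val of them for validation."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    return indices[n_val:], indices[:n_val]

def select_best_setting(settings, score_on_validation):
    """Return the setting with the highest validation score, and that score."""
    scored = [(score_on_validation(s), s) for s in settings]
    best_score, best_setting = max(scored, key=lambda pair: pair[0])
    return best_setting, best_score

# Hold out 10,000 of the 60,000 training examples (an assumed split size).
train_idx, val_idx = split_train_validation(60_000, n_val=10_000)

settings = [
    {"layers": 6, "hidden_dim": 6},
    {"layers": 8, "hidden_dim": 16},
]

def placeholder_accuracy(setting):
    """Stand-in for 'train on train_idx, evaluate on val_idx'.

    The numbers are placeholders, not measured accuracies.
    """
    made_up_scores = {(6, 6): 0.90, (8, 16): 0.95}
    return made_up_scores[(setting["layers"], setting["hidden_dim"])]

best_setting, best_score = select_best_setting(settings, placeholder_accuracy)
print(len(train_idx), len(val_idx))  # 50000 10000
print(best_setting)                  # {'layers': 8, 'hidden_dim': 16}
```

The test set never appears in this loop. It is evaluated once, at the end, with the setting that was chosen on the validation set.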
Loss
During training, there are typically two losses you should monitor:
- Loss on the training data - how well the model learns
- Loss on the validation data - how well the model generalizes

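At some tipping point the decision boundary fitted to the training data becomes so complex that it no longer generalizes: the training loss keeps falling while the validation loss starts to rise. To find this tipping point, smooth the validation loss (for example with a running average over a few epochs) and look for the epoch where it stops decreasing and starts increasing.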
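One way to locate the tipping point is sketched below, assuming one validation loss value is recorded per epoch. The trailing window of 3 epochs, the function names, and the loss values are illustrative assumptions, not measurements from a real training run.

```python
def moving_average(values, window):
    """Trailing moving average; early entries average over what is available."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def find_tipping_point(val_losses, window=3):
    """Return the epoch index where the smoothed validation loss is lowest."""
    smoothed = moving_average(val_losses, window)
    return min(range(len(smoothed)), key=lambda i: smoothed[i])

# Made-up validation losses: falling at first, then rising as the model overfits.
val_losses = [1.0, 0.8, 0.65, 0.55, 0.5, 0.52, 0.56, 0.62, 0.7]

tipping_epoch = find_tipping_point(val_losses, window=3)
print(tipping_epoch)  # 5
```

Because the average is trailing, the smoothed minimum (epoch 5 here) lags the raw minimum (epoch 4) slightly. A larger window is more robust to noisy losses but increases that lag.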