Machine learning: Lab: MNIST redux

MNIST via dense neural networks

This lab has two goals:

  • To give you a very brief introduction to building and training neural networks in Keras, and
  • To let you gain some intuition about the hyperparameter selection process when training neural networks.

As always, we will use a Colab notebook in our work. We will replicate the work we have already done on the MNIST data set; last time we did this using logistic regression. We will now use dense neural networks.

Central question

We will again focus on a simple version of the digit classification question:

Train a dense neural network classifier that differentiates between two digits of your choice.

The difficulty of this problem varies depending on the choice of two digits. Differentiating between a 4 and a 9 is the most difficult.

Procedure

Follow the outline below and the hints in the Colab notebook as you proceed.

  1. Start by downloading the MNIST data set and spend a minute or two reacquainting yourself with the images and the original labels.
  2. Choose a pair of digits for this project; rather than classifying all ten, your work will simply distinguish between these two. Also choose the number of images in your training set. Values of 100, 500, and 1,000 are all interesting. Smaller training sets will lead to less accurate classifiers, but the training itself may be more interesting. For validation, we will use all instances of your digits in the pre-defined MNIST validation set.
  3. Once you have defined training and validation sets, you are ready to define a neural network graph. Choose the number of layers, the number of neurons, and the activation function for each layer. Before beginning the optimization process, you will have to make some hyperparameter choices including learning rate and batch size.
  4. Explore the learning hyperparameters and create a model with the smallest loss you can.
  5. Finally, inspect examples of when your classifier was correct and when it was not.
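
The first four steps can be sketched in Keras roughly as follows. This is only a minimal sketch, not the notebook's code: the helper names `select_digit_pair`, `build_model`, and `train_pair_classifier` are made up for illustration, and the default layer sizes, SGD optimizer, learning rate, batch size, and epoch count are placeholder assumptions meant to be changed. It also assumes that the "pre-defined validation set" corresponds to the test split returned by `keras.datasets.mnist.load_data()`.

```python
import numpy as np


def select_digit_pair(images, labels, digit_a, digit_b, n_samples=None, seed=0):
    """Keep two digits only, flatten and scale the images, relabel as 0/1."""
    mask = (labels == digit_a) | (labels == digit_b)
    x = images[mask].reshape(-1, 28 * 28).astype("float32") / 255.0
    y = (labels[mask] == digit_b).astype("float32")  # digit_a -> 0, digit_b -> 1
    if n_samples is not None:
        # Draw a random subset without replacement (not class-balanced).
        rng = np.random.default_rng(seed)
        keep = rng.choice(len(x), size=n_samples, replace=False)
        x, y = x[keep], y[keep]
    return x, y


def build_model(hidden_units=(4, 2), activation="relu", learning_rate=0.01):
    """Dense binary classifier with a single sigmoid output unit."""
    # Imported here so the data helper above can be used without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential([keras.Input(shape=(784,))])
    for units in hidden_units:
        model.add(keras.layers.Dense(units, activation=activation))
    model.add(keras.layers.Dense(1, activation="sigmoid"))
    model.compile(
        optimizer=keras.optimizers.SGD(learning_rate=learning_rate),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


def train_pair_classifier(digit_a=4, digit_b=9, n_train=500,
                          batch_size=32, epochs=20):
    """Download MNIST, build the two-digit data sets, and fit a model."""
    from tensorflow import keras

    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    x_tr, y_tr = select_digit_pair(x_train, y_train, digit_a, digit_b, n_train)
    x_val, y_val = select_digit_pair(x_test, y_test, digit_a, digit_b)
    model = build_model()
    history = model.fit(x_tr, y_tr, batch_size=batch_size, epochs=epochs,
                        validation_data=(x_val, y_val))
    return model, history
```

Calling `train_pair_classifier()` in a notebook cell downloads the data and trains with the placeholder settings; the returned `history.history` dictionary holds the per-epoch training and validation loss, which is a convenient thing to watch while varying the learning rate and batch size.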
Exercise: You are given a budget of 6 neurons to be used among hidden layers. Using all the hyperparameters available to you, train the most accurate model that you can that distinguishes the digits 4 and 9.
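
One way to get a feel for the search space in this exercise is to list the possible architectures before training anything. The sketch below enumerates every ordered way of splitting the budget into hidden layer widths. The function name `layer_splits` is hypothetical, and the sketch assumes the whole budget is spent; the wording of the exercise would also permit using fewer neurons.

```python
def layer_splits(budget):
    """Yield every tuple of positive layer widths that sums to `budget`."""
    if budget == 0:
        yield ()
        return
    for first in range(1, budget + 1):
        # Fix the width of the first hidden layer, then split the remainder.
        for rest in layer_splits(budget - first):
            yield (first,) + rest


candidates = list(layer_splits(6))
print(len(candidates))  # 32
```

Each tuple, for example `(6,)`, `(3, 3)`, or `(4, 2)`, gives the widths of the hidden dense layers from input side to output side. Thirty-two architectures is few enough to try several of them with different activation functions and learning rates.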
Exercise: Suppose that you built a neural network with no hidden layers to solve this problem. What are you, in fact, doing?