# MNIST digits and the perceptron
By now you are familiar with the MNIST data set of handwritten digits. We used logistic regression and gradient descent to learn a model that distinguishes pairs of digits. Can we do the same, this time using the perceptron?
Homework exercise:
Follow the outline of our original MNIST lab to train a digit-recognition perceptron. The Colab notebook contains some useful code, although you will have to fill in some of the details; a minimal sketch of the basic training loop also appears after the questions below. In your write-up, address the following questions:
- How does the size of the training set affect validation accuracy? Report your results in a table and repeat your experiment for at least two pairs of digits.
- What happens when the perceptron finds a separating hyperplane for the training data early in training, while your validation accuracy is not yet good enough? Describe a possible modification to the perceptron algorithm that would allow you to keep learning longer.
- Explain why classification accuracy, rather than MSE or logistic loss, is the appropriate measure of the quality of the perceptron.
- Does the data set \(\mathcal{D}\) actually need to be linearly separable for the perceptron algorithm to be useful?
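To fix ideas, here is a minimal sketch of the perceptron training loop on a pair of MNIST digits. It is not the notebook's scaffolding: it assumes the data is fetched with scikit-learn's `fetch_openml`, and the digit pair (3 vs. 5) and epoch count are illustrative choices, not part of the assignment.

```python
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

# Load MNIST and keep two digit classes (3 vs. 5 here, chosen arbitrarily).
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
mask = (y == "3") | (y == "5")
X = X[mask] / 255.0                      # scale pixels to [0, 1]
y = np.where(y[mask] == "3", 1, -1)      # perceptron labels in {+1, -1}

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Append a constant feature so the bias is absorbed into the weight vector.
X_train = np.hstack([X_train, np.ones((len(X_train), 1))])
X_val = np.hstack([X_val, np.ones((len(X_val), 1))])

w = np.zeros(X_train.shape[1])
for epoch in range(50):                  # epoch budget is an arbitrary choice
    mistakes = 0
    for x_i, y_i in zip(X_train, y_train):
        if y_i * (w @ x_i) <= 0:         # misclassified (or on the boundary)
            w += y_i * x_i               # the perceptron update
            mistakes += 1
    if mistakes == 0:                    # training data separated: no more updates
        break

val_acc = np.mean(np.sign(X_val @ w) == y_val)
print(f"stopped after epoch {epoch}, validation accuracy {val_acc:.3f}")
```

Note how the loop stops as soon as an epoch produces no mistakes, which is exactly the situation the second question asks about; for the first question, one approach is to rerun this sketch on subsamples of `X_train` of increasing size.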