Machine learning is a vast discipline that draws from and informs many fields of study. In its efforts to work with data and make accurate predictions about the future, machine learning has developed a large number of tools and models. It would be easy to structure a class so that each day describes a new and different approach and only scratches the surface of the subject. For instance, here are some examples of machine learning models:
In Math 2805, I will emphasize the contributions of three subfields of mathematics to the machine learning conversation: analysis, linear algebra, and optimization.
Linear algebra forms the computational backbone of machine learning. Our modus operandi will be to rephrase diverse machine learning models in the language of linear algebra and then exploit the underlying theory to further our data-centered goals. It is amazing how often we will be able to do this.
Our approach will often be phrased in terms of minimizing the predictive error of our models. We will take advantage of a rich set of tools developed in optimization and multivariable calculus.
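As a small illustration of both ideas, here is a minimal sketch of fitting a line to data by minimizing squared predictive error. The model is written in linear-algebra form (a matrix X whose rows are [1, x_i]) and the minimizer is found by solving the normal equations. The toy data and the names fit_line and mean_squared_error are assumptions made for this example only; they are not taken from the course materials.

```python
# Hypothetical toy data: (x, y) pairs lying near the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]


def fit_line(xs, ys):
    """Least-squares fit of y ~ w0 + w1*x via the normal equations (X^T X) w = X^T y."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # X has rows [1, x_i], so X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy].
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical; the fit is not unique")
    # Solve the 2x2 system by applying the explicit inverse of X^T X.
    w0 = (sxx * sy - sx * sxy) / det
    w1 = (n * sxy - sx * sy) / det
    return w0, w1


def mean_squared_error(w0, w1, xs, ys):
    """Average squared difference between predictions w0 + w1*x and observed y."""
    return sum((w0 + w1 * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


w0, w1 = fit_line(xs, ys)
print("intercept:", w0, "slope:", w1)
print("mean squared error:", mean_squared_error(w0, w1, xs, ys))
```

Nudging either weight away from the fitted values increases the mean squared error, which is what it means for the fit to minimize predictive error on this data. With more features, the same normal equations apply with a larger matrix, and one would typically hand the solve to a numerical linear algebra library rather than invert by hand.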
Because machine learning is such a large field, there are many ways one can structure a course in the subject, and many different goals one can have in mind. As best I can articulate it, this is my goal for the class:
Goal: Machine learning techniques evolve rapidly. My goal for this course is not simply to describe the many existing models and machine learning tools in turn, but to provide the perspective and tools with which new ones can be analyzed and developed. To be fair, we will see and use a number of machine learning techniques and apply them to analyze data, but this will be done in service of this, perhaps more ambitious, goal.
Machine learning falls into three broad categories: supervised learning, unsupervised learning, and reinforcement learning. The course focuses on the first two. Below are some examples of problems from the first two categories; we will study them all in much greater detail at some point in the semester.
Face recognition: Given an image of a face, identify its owner. Do this in an accurate and unbiased way.
Character recognition: Identify a handwritten digit. This is perhaps the most classic image problem in machine learning, dating back to the US Postal Service’s effort to automate ZIP code sorting.
Species identification: Given a collection of related flowers and a set of physical measurements for each individual, identify the number of subspecies present in your sample.
Movie taste clusters: Identify taste clusters among Netflix users and use them to make suggestions for future viewing.
Basketball positions: Identify the “hidden” positions on a sports team.