Approximating \(f: \mathbb{R}^d \rightarrow \mathbb{R}^k\) #
Background #
The original version of the Weierstrass Theorem shows that any continuous function \(f:\mathbb{R}\rightarrow\mathbb{R}\) can be approximated by a sequence of polynomials. But single-variable functions are boring from the machine learning perspective; it is rare that we want to predict a single value based on a single feature present in data! In class we derived a multivariate version of the Weierstrass Theorem:
We can push this a little further. We really would like to be able to approximate continuous functions \(f: \mathbb{R}^d \rightarrow \mathbb{R}^k\). The key will be to break such \(f\) into simpler pieces called coordinate functions.
Coordinate functions #
I will ask you to do the heavy-lifting on this part. Consider the following questions in turn:
Suppose that \([\vec{a},\vec{b}] \) is a multi-interval in \(\mathbb{R}^d\) and that our task is to study a continuous function \(f: [\vec{a},\vec{b}] \rightarrow \mathbb{R}^k\).
- Describe \(f\) in terms of \(k\) individual real-valued functions.
- Are each of these individual real-valued functions continous as well? An argument via frantic hand-waving wil be fine.
- How could you measure distance between two continuous functions \(f, g: [\vec{a},\vec{b}] \rightarrow \mathbb{R}^k\)? There are multiple choices; pick one that will make our work simpler.
- As formally as you can, state a version of the Weierstrass Theorem for a continuous function \(f: [\vec{a},\vec{b}] \rightarrow \mathbb{R}^k\).
Solution
-
Each such \(f\) is really \(k\) individual functions of the form \(f_i : [\vec{a},\vec{b}] \rightarrow \mathbb{R}\) whose value is just the value of \(f\) on the \(i\)th coordinate. The \(f_i\) are called the sought-after coordinate functions for \(f\).
-
Coordinate functions are continuous. For an almost formal proof, consider the function \(p_i : \mathbb{R}^n \rightarrow \mathbb{R}\) that takes a vector and returns its \(i\)th coordinate. This function is called a projection. There are a couple of ways to prove that projections are continuous; for instance, what \(\delta\) could you use in a \(\delta-\epsilon\) proof? Finally, note that \(f_i = p_i \circ f\) and both functions on the right are continuous.
-
Here is one way: compute a distance between \(f\) and \(g\) on each coordinate (choose your favorite one). This will give you a vector of non-negative numbers. And now compute a distance of this vector from the origin. There is a choice to make; two simple ways are \(\rho_1\) and \(\rho_\infty\).
-
This I will do in class! There is a bit of typesetting…