Machine learning: 1. Approximating functions

Approximating continuous functions #

Suppose that we have identified a hypothesis class \(\mathcal{H}\) we would like to use in our supervised learning task. The goal of this part of the course is to develop a technique for verifying that the functions in \(\mathcal{H}\) can approximate any continuous function.

Metrics #

So how does one go about finding a sequence of functions in \(\mathcal{H}\) that converges to a given continuous function \(f\)? Before even asking this question, we have to settle on what we mean by converge, and that means we will need to identify a metric. The standard one seen in a first course in analysis is the supremum metric:

\[\rho(f,g) = \sup_x |f(x) - g(x)|.\]

When the functions involved are continuous and the supremum is taken over a compact set, the supremum is attained and we have a metric space in which to work. But there are other ways of measuring distance that we will need this semester.
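
To make the supremum metric concrete, here is a minimal sketch (not part of the course materials) that estimates \(\rho(f,g)\) by sampling both functions on a fine grid over a compact interval; the name `sup_metric`, the interval, and the grid size are illustrative choices.

```python
# Rough numerical stand-in for rho(f, g) = sup_x |f(x) - g(x)|:
# sample both functions on a fine grid over [a, b] and take the
# largest pointwise gap.
import numpy as np

def sup_metric(f, g, a=0.0, b=1.0, num_points=10_001):
    """Estimate sup_{x in [a, b]} |f(x) - g(x)| on a uniform grid."""
    x = np.linspace(a, b, num_points)
    return np.max(np.abs(f(x) - g(x)))

# Example: distance between sin(x) and its cubic Taylor polynomial on [0, 1].
f = lambda x: np.sin(x)
g = lambda x: x - x**3 / 6
print(sup_metric(f, g))   # small: the Taylor remainder is at most 1/120 on [0, 1]
```

Since the functions here are continuous and the interval is compact, a fine enough grid gives a reliable estimate of the supremum.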

The convolution product #

The convolution product is an exotic way of multiplying two functions. It is ubiquitous in mathematics, appearing in fields as diverse as abstract algebra and signal processing. We will see it in more than one context in this class, but in the current setting, we will use it to answer the question posed above:

Theorem: For a suitably chosen sequence of functions \(\{h_n\}_{n \in \mathbb{N}}\), we have \[\lim_{n \to \infty} f*h_n = f.\]

If we play our cards right (and we will), the sequence \(\{f*h_n\}_{n \in \mathbb{N}}\) will consist of functions in \(\mathcal{H}\) and we will know that our hypothesis class is dense in the set of continuous functions.
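
As a preview of how such a sequence can behave, here is a small sketch (my own illustration, with Gaussian bumps standing in for whichever kernels \(h_n\) the course actually uses) that convolves a continuous function with narrower and narrower kernels and reports the supremum distance to \(f\) away from the boundary of the grid.

```python
# Illustrative only: Gaussian bumps h_n in place of the course's kernels.
import numpy as np

x = np.linspace(-1.0, 2.0, 3001)           # grid padding the interval of interest
dx = x[1] - x[0]
f = np.clip(x, 0.0, 1.0) ** 2              # a continuous target function

for n in (1, 4, 16, 64):
    sigma = 1.0 / n
    h = np.exp(-((x - 0.5) ** 2) / (2 * sigma**2))   # bump centered on the grid
    h /= h.sum() * dx                                # total mass 1, like \int h_n = 1
    conv = np.convolve(f, h, mode="same") * dx       # Riemann-sum approximation of f * h_n
    interior = (x > 0.2) & (x < 0.8)                 # ignore boundary effects
    print(n, np.max(np.abs(conv - f)[interior]))     # sup distance shrinks as n grows
```

The printed numbers decrease as the bumps narrow, which is exactly the convergence \(f*h_n \to f\) that the theorem promises for a well-chosen sequence.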

Labs and exercises #

1. Discontinuous functions
2. Kernel smoothing
3. Metrics
4. Convolution product
5. Feature detection
6. Integral transforms
7. Trig polynomials
8. The Landau kernel
9. Indefinite integral
10. Higher dimensions
11. Stone-Weierstrass Theorem