Machine learning: Equivariance

Equivariant and invariant neural networks

Equivariant and invariant machine learning models are designed to take advantage of symmetries present in the data. As we have seen in our experiments with the MNIST handwritten digits data set, even small rotations of the input images degrade the accuracy of the classifier.

Figure: Accuracy of a CNN MNIST classifier on rotated images.

A classical way to resolve this problem is to use a particular data augmentation technique.

Data augmentation: To train a classifier that is invariant under symmetries from a set \(G\), augment the training data \(\mathcal{D}\) by including points of the form \((g \vec{x}_i, \vec{y}_i)\), where \(g\) is a symmetry sampled at random from \(G\). For instance, an image classifier that is meant to be rotation-invariant can be trained not only on the original images, but also on rotated versions of them.
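The augmentation step can be sketched in a few lines of NumPy. This is a minimal illustration and not the code used in the earlier experiments: it assumes that \(G\) is the group of rotations by multiples of 90 degrees, so that `np.rot90` rotates exactly without interpolation, and it uses random arrays as a stand-in for real images. The function name `augment_with_rotations` and its parameters are hypothetical.

```python
import numpy as np

def augment_with_rotations(images, labels, rng, copies=1):
    """Return the original data plus `copies` randomly rotated versions of each image.

    images: array of shape (n, h, w) with h == w; labels: array of shape (n,).
    Assumption: G consists of rotations by 0, 90, 180 and 270 degrees.
    """
    aug_images = [images]
    aug_labels = [labels]
    for _ in range(copies):
        # Sample one symmetry g (a number of quarter turns) per image.
        ks = rng.integers(0, 4, size=len(images))
        rotated = np.stack([np.rot90(img, k) for img, k in zip(images, ks)])
        aug_images.append(rotated)
        aug_labels.append(labels)  # the label is unchanged: (g x_i, y_i)
    return np.concatenate(aug_images), np.concatenate(aug_labels)

rng = np.random.default_rng(0)
images = rng.random((5, 28, 28))  # stand-in for five 28x28 images
labels = np.arange(5)
aug_x, aug_y = augment_with_rotations(images, labels, rng, copies=2)
print(aug_x.shape, aug_y.shape)  # (15, 28, 28) (15,)
```

For rotations by arbitrary angles one would need an interpolating rotation routine instead of `np.rot90`, and the rotated images would then only approximate the action of the symmetry.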

Data augmentation is sometimes successful, especially if the number of images in the original training set is small.

Invariant networks

But data augmentation is an unsatisfying solution: the model still has to learn the symmetry from examples rather than having it built in. It is highly desirable to be able to build machine learning models that intrinsically take advantage of symmetries in the data. How best to do this is by no means a solved problem. The goal of this project is to survey the existing literature on invariant and equivariant neural networks and to apply one of the existing models to a data set that exhibits a set of symmetries.

  • You should start with some background reading. There is already a substantial literature on the subject, but this article and book chapter are a start.

  • Find a data set which is naturally invariant under a set of symmetries. Images are a natural choice, but there are others.

  • Build an invariant network for the data set you found above. Assess its performance, accuracy, and invariance. There are several Python packages that implement some of the more popular solutions, so you will not have to code from scratch!
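One simple way to see what "assessing invariance" can mean is to average a model over a finite group and compare its invariance error before and after. The sketch below is only an illustration under stated assumptions: the group is taken to be rotations by multiples of 90 degrees, `toy_model` with its random `weights` is a hypothetical stand-in for a trained classifier, and group averaging is just one elementary construction, not the method of any particular package.

```python
import numpy as np

def c4_orbit(image):
    """All four rotations of a square image by multiples of 90 degrees."""
    return [np.rot90(image, k) for k in range(4)]

def symmetrize(model):
    """Wrap `model` so that its output is averaged over the orbit of the input."""
    def invariant_model(image):
        return np.mean([model(g_image) for g_image in c4_orbit(image)], axis=0)
    return invariant_model

def invariance_error(model, image):
    """Largest change in the output when the input is rotated by a group element."""
    base = model(image)
    return max(np.max(np.abs(model(g_image) - base)) for g_image in c4_orbit(image))

rng = np.random.default_rng(0)
weights = rng.normal(size=(10, 28 * 28))  # hypothetical stand-in for trained parameters

def toy_model(image):
    # A linear map from pixels to 10 class scores; not invariant in general.
    return weights @ image.ravel()

image = rng.random((28, 28))
print(invariance_error(toy_model, image))              # clearly nonzero
print(invariance_error(symmetrize(toy_model), image))  # zero up to rounding error
```

The averaged model is invariant because rotating the input only permutes the terms in the average. The same `invariance_error` check can be applied to any model you build, evaluated over a test set rather than a single image.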

Your paper should include an exposition of the mathematics of equivariant and invariant machine learning models, as well as a detailed account of your empirical experiments.
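As a starting point for the exposition, the two notions are usually stated as follows. The symbol \(f\) for the model is notation introduced here; \(G\) and \(\vec{x}\) are as above. A model \(f\) is invariant under \(G\) if

\[
f(g \vec{x}) = f(\vec{x}) \quad \text{for all } g \in G \text{ and all inputs } \vec{x},
\]

and equivariant under \(G\) if

\[
f(g \vec{x}) = g f(\vec{x}) \quad \text{for all } g \in G \text{ and all inputs } \vec{x},
\]

where \(g\) may act on the output space differently from how it acts on the input space. Invariance is the special case in which every \(g\) acts trivially on the outputs.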