Multivariable calculus: Regression

Regression

Introduction to regression: One method of making predictions from data starts by finding a function \(f(x)\) that “best” fits a data set. We introduce regression and discuss what “best” might mean.

Fitting not wisely, but too well: Can a model fit data too well? It can, and the phenomenon is known as overfitting. We show how it arises for polynomial models.

Training and validation sets: When the available data set is large, there is a simple way to assess whether a machine learning model will make accurate predictions. We introduce the notion of a validation set and use it to build polynomial models.

Modeling the COVID-19 infection: We use a polynomial regression model to predict the rate of early-stage COVID-19 infection using CDC data.

Anscombe’s quartet: A cautionary tale about fitting lines to data sets.
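
As a rough preview of the overfitting and validation ideas summarized above, here is a minimal sketch, not taken from the sections themselves. It assumes that “best” means least squares, uses synthetic data generated from a quadratic plus noise, and splits the points alternately into training and validation sets. The helper name `fit_and_score`, the noise level, and the range of degrees are all illustrative choices.

```python
import numpy as np

def fit_and_score(x_train, y_train, x_val, y_val, degrees):
    """Fit a least-squares polynomial of each degree on the training set.

    Returns a dict mapping degree -> (training MSE, validation MSE).
    """
    scores = {}
    for d in degrees:
        coeffs = np.polyfit(x_train, y_train, d)  # least-squares fit
        train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
        val_mse = float(np.mean((np.polyval(coeffs, x_val) - y_val) ** 2))
        scores[d] = (train_mse, val_mse)
    return scores

# Synthetic data (an assumption for illustration): a quadratic plus noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 40)
y = 1 + 2 * x - 3 * x**2 + rng.normal(scale=0.2, size=x.size)

# Alternate points between the training and validation sets.
x_train, y_train = x[::2], y[::2]
x_val, y_val = x[1::2], y[1::2]

scores = fit_and_score(x_train, y_train, x_val, y_val, range(1, 9))

# Pick the degree with the smallest validation error.
best_degree = min(scores, key=lambda d: scores[d][1])

for d, (train_mse, val_mse) in scores.items():
    print(f"degree {d}: train MSE = {train_mse:.4f}, validation MSE = {val_mse:.4f}")
print("degree chosen by validation error:", best_degree)
```

The training error can only go down as the degree grows, because each higher-degree family contains the lower-degree ones. The validation error carries no such guarantee, which is why it is the more honest guide when choosing a degree.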