Thomas Pietraho

Content: The course from now on will follow a flipped classroom paradigm. Your introduction to the material will come from readings as well as video lectures; the details will depend on the week. I will hold office hours to discuss the material; you should be ready with questions. Each week's work will be detailed as one of the modules appearing below.

Reading and watching mathematics: Learning mathematics is not a spectator sport. Reading mathematics is not like reading a novel; watching mathematics is not like watching an action thriller. Some paragraphs are easy to digest, but you may find yourself looking at one line of text for five or more minutes trying to understand what the author is trying to say. Use the pause button when watching a video. As you read or watch, take notes, just as you do in class. This is crucial! If questions arise, write them down and ask during office hours. In each module, I will let you know how long you should expect to spend reading and watching the material. Some weeks, this will be a substantial commitment of time even before you start the homework.

Office hours: I will hold multiple office hours each week via Zoom. The scheduled times as well as links to join will appear in each week's module. I am also happy to talk with you at a different time; just let me know and we can schedule a meeting. The first two office hours will focus on the week's material broadly; the second two will focus on the homework, although this is not a firm rule.

Homework: Your homework will be graded using [GradeScope]; To begin, you will need to set up an account. I will send you the code for our class by email. Each homework assignment will need to be submitted as a .pdf file. If you typeset your homework using LateX, this is straightforward. If you write-up your homework the old-fashioned way using pencil and paper, use a scanner or a phone scanner app. See the GradeScope help document for a list of suggestions [pdf]. Please let me know if this does not work for you; I will come up with an alternative.

Collaboration: I have set up a Microsoft Teams account for this course. There is a course chat room as well as teleconferencing capability. Let me know if you need help figuring how to access these resources. I would love for you to work together on this material.

Feedback: I expect the nature of instruction in the class to be an evolving paradigm. Let me know what things work for you and which ones do not. The sooner, the better.

Approximations are crucial in applications of modern linear algebra. There are two basic examples we will keep in mind:

Data is often represented as a matrix $A$ . A common example occurs when each row represents statistics describing an individual in some population. There may be many such individuals, so and so $A$ may have many rows. But data collection is a noisy process, and the matrix one actually has to work with: $\widetilde{A}$ , is only an approximation of $A$ . However, the matrices $A$ and $\widetilde{A}$ are usually close to each other, at least in some sense we are yet to specify.

As we saw in our first lab, it is often possible to replace a vector $v$ with another vector $\widetilde{v}$ so that is still a good representation of $v$ , and the former takes a lot less room to store in a computer's memory. For instance, here are two pictures of Albert Einstein:

The second image takes one percent of the space necessary to store the first one. Underlying each image is a matrix whose entries are numbers between 0 and 1 representing a shade of gray for each pixel. Whether one image looks like the other is a question about closeness of the two matrices.

From these examples, it is clear that to be able to talk about closeness among vectors or matrices, we need to be able to measure distance between them.

Goal: Define notions of distance for both vectors and matrices.

The amazing thing is that there many possible ways of measuring distance and which one should be used depends heavily on the application! Our reading will discuss the common properties shared by all reasonable ways of measuring distance, as well as introduce some specific ways this distance can be measured. As you read, keep track of the relationship between norms, inner products, and the measurement of distance.

Assignment: There are three parts. First, complete readings from Meyer's Matrix Analysis; for your convenience, I am including the relevant sections as .pdf files. Then complete the introductory exercises to test your understanding. The solutions are included, so there is no need to turn anything in. The last part is this week's official homework.

Reading 1: Read Section 5.1 of Mayer's book [PDF], ignoring the proof of the CBS inequality and example 5.1.3. Allocate ninety minutes for this, reading slowly. Reading mathematics, although just as exciting (?), is not like reading a novel, and there may be times when it takes a substantial amount of time to get through a single line of text. If you get really stuck, move on and formulate a question for office hours.
Complete exercises 5.1.1 and 5.1.2. Here are their [solutions].
Reading 2: Read Section 5.2 of Mayer's book [PDF] .
Complete exercises 5.2.1, 5.2.2, and 5.2.3 Here are their [solutions].
Official homework: Our formal homework will follow a familiar format. Here is a [PDF]. You should submit your solutions as a single .pdf file; a nice way to do create this document is to use a [phone scanner app]. Homework is due on Wednesday, April 1 at 5pm. Submit it via [GradeScope].

[PDF]

This week's office hours: I will hold office hours via Zoom this week. They are dedicated to this course, so you won't have to share time with pesky analysis students. Click on the following links when you are ready to join:

Thursday 10am	Monday 10am
Friday 1pm	Tuesday 1pm

Although I will be happy to entertain any discussion that comes up, the focus of the first two meetings will be to discuss the reading. The second two will focus on the homework. Let me know if you would like to schedule a meeting at another time.

One idea I would like you to focus on in this week's readings is the relationship between inner products, norms, and distance. From last week's homework, we know that if we have a norm, we can always turn it into a way of measuring distance between vectors: we simply let

$dist(x,y) = ||x-y||$ .

Since there are many different way of defining norms, there are many different ways of measuring distance between vectors. This week you will read about inner products, which are a more general way of thinking about the dot product in $\mathbb{R}^n$ .

It turns out that if we have an inner product, we can always define a norm from it by letting

$||x|| = \sqrt{\langle x , x \rangle}$ .

So inner products beget norms, and norms beget distance. As you will find out at the end of Section 5.3, this does not mean that every norm comes from some inner product, and not every notion of distance comes from a norm. When this happens was a question that stumped mathematicians for a long time. The answer has to do with the parallelogram identity.

The second idea that is crucial this week comes in the second section of the reading. When we have an inner product, we can talk about orthogonality and angles between vectors in any vector space. So we can even talk about angles between different polynomials. As you read, pay special attention to the notion of a Fourier expansion of a vector. Fourier did his work studying functions trying to find solutions to differential equations, and his expansions were in terms of sines and cosines. But in modern linear algebra, his work has a much broader meaning.

Assignment: As last week, there are three parts. First, complete readings from Meyer's Matrix Analysis; for your convenience, I am including the relevant sections as .pdf files. Then complete the introductory exercises to test your understanding. The solutions are included, so there is no need to turn anything in. The last part is this week's official homework.

Reading 1: Read Section 5.3 of Mayer's book [PDF], again allocate ninety minutes for this, reading slowly. If you get really stuck, move on and formulate a question for office hours.
Complete exercises 5.3.1, 5.3.2, and 5.3.7 Here are their [solutions].
Reading 2: Read Section 5.4 of Mayer's book [PDF] .
Complete exercises 5.4.1, 5.4.4, and 5.4.5 Here are their [solutions].
Official homework: The .pdf of our homework [PDF]. You should submit your solutions as a single .pdf file; a nice way to do create this document is to use a [phone scanner app]. Homework is due on Wednesday, April 8 at 5pm. Submit it via [GradeScope].

[PDF]

This week's office hours: I will hold office hours via Zoom this week. They are dedicated to this course, so you won't have to share time with pesky analysis students. Click on the following links when you are ready to join:

Thursday 10am	Monday 10am
Friday 1pm	Tuesday 1pm

Although I will be happy to entertain any discussion that comes up, the focus of the first two meetings will be to discuss the reading. The second two will focus on the homework. Let me know if you would like to schedule a meeting at another time.

This week we will begin what I consider to be the most important section of the course. The results we derive will have wide-ranging applications; I will be especially interested in those focused on data analysis. We will start by building up quite a bit of theory, and as our comfort grows, I will introduce increasingly more complex applications.

The singular value decomposition accomplishes what we have failed to do so far: diagonalize every matrix. This week, I will proceed by "lecturing" via a sequence of short clips. Watch the following at your leisure, but make sure to take notes, pause, and jot down questions for our office hour.

Watch the following clip first to remind yourself of what we have done so far trying to diagonalize symmetric matrices. You should take your own notes as you watch the lecture, but my own version is available in the accompanying .pdf file. I also set up notation that will be used in the later lectures.

Matrix diagonalization, a quick review

[PDF]

The next clip introduces the singular value decomposition theorem for any matrix as well as its proof:

The SVD

[PDF]

Next, is a purely practical clip where I go through how to compute the singular value decomposition of a matrix using some software:

Computing the SVD

[PDF]

Assignment: As last week, there are three parts. First, watch this week's lectures above. Make sure to take your own notes, but you can also refer to the ones listed above. Second, work through the introduction to computational linear algebra in the COLAB and computational linear algebra tab above. Once you are ready, revisit the sound compression lab in the Lab: Sound compression tab. I will go through both exercises in the office hour on Thursday and Friday. You will need the results of the lab to complete this week's homework.

Official homework: The .pdf of our homework [PDF]. You should submit your solutions as a single .pdf file; a nice way to do create this document is to use a [phone scanner app]. Homework is due on Wednesday, April 15 at 5pm. Submit it via [GradeScope]. Here are the solutions [PDF]

This week's office hours: I will hold office hours via Zoom this week. They are dedicated to this course, so you won't have to share time with pesky analysis students. Click on the following links when you are ready to join:

Thursday 10am	Monday 10am
Friday 1pm	Tuesday 1pm

Although I will be happy to entertain any discussion that comes up, the focus of the first two meetings will be to discuss the reading. The second two will focus on the homework. Let me know if you would like to schedule a meeting at another time.

There are a number of valuable applications to the SVD of a matrix; we will only have time to discuss a few of them in this course. My main goal for this week is to develop some intuition and basis properties of this decomposition so that we will be ready for the applications.

The first clip gives some geometric intuition for the SVD. The executive summary is that you get something like eigenvalues and eigenvectors, but not quite. It will also turn out that this geometry will help with computing matrix norms.

Geometry of the SVD

[PDF]

A short clip relating the eigenvalues and singular values of a matrix:

Eigenvalues vs. singular values

[PDF]

Another short clip about the SVD for rectangular matrices:

Rectangular SVD

[PDF]

It will turn out that the rank of a matrix will play a crucial role in almost everything we do for the rest of the semester. So a brief review of what rank is, how to think about it, as well as what rank might mean if your matrix consists of data:

Rank of a matrix

[PDF]

Finally for this week, it is often desirable to approximate matrices with other matrices that have lower rank. Unsurprisingly, this can be done using the SVD. The first part of the lecture gives some motivation and context for what we are about to do.

Low rank approximations

[PDF]

Assignment: Nothing fancy. Watch the videos above and complete the homework problems. One of the problems asks you to denoise an image, you will have to complete a short lab first.

Official homework: The .pdf of our homework [PDF]. You should submit your solutions as a single .pdf file; a nice way to do create this document is to use a [phone scanner app]. Homework is due on Wednesday, April 22 at 5pm. Submit it via [GradeScope].Here are the solutions [PDF]

This week's office hours: I will hold office hours via Zoom this week. They are dedicated to this course, so you won't have to share time with pesky analysis students. Click on the following links when you are ready to join:

Thursday 10am	Monday 10am
Friday 1pm	Tuesday 1pm

Although I will be happy to entertain any discussion that comes up, the focus of the first two meetings will be to discuss the reading. The second two will focus on the homework. Let me know if you would like to schedule a meeting at another time.

The first mini-lecture this week addresses the Eckhart-Young Theorem which tells us that the SVD provides us with the best possible low-rank approximations to a matrix. This is an amazing result; it is rare that we know how to find an approximation like this without relying on iterative optimization methods like gradient descent.

The Eckhart-Young Theorem

[PDF]

A very important application of the SVD allows us to fit polynomials models to data. The following two mini-lectures provide the details. The first lecture shows that given any reasonable set of points in the plane, we can find a polynomial that fits them exactly. It also shows that using this polynomial to make real-world predictions is probably a terrible idea.

Polynomial fitting, Part I

[PDF]

The second shows how to use the SVD to make better polynomial models. Also a story about John von Neumann and elephant trunks.

Polynomial fitting, Part II

[PDF]

Assignment: First, I have prepared a Python lab about image compression for this week under the Lab: Image compression tab. Have fun with it and let me know if you run into trouble. I ask a question about it on this week's homework. Second, some old-fashioned homework problems: [PDF]. If you would like to plot polynomials using CoLab, here is a notebook that leads you through the process:

[Notebook]

You should submit your solutions as a single .pdf file; a nice way to do create this document is to use a [phone scanner app]. Homework is due on Wednesday, April 29 at 5pm. Submit it via [GradeScope]. Here are the solutions [PDF].

This week's office hours: I will hold office hours via Zoom this week. They are dedicated to this course, so you won't have to share time with pesky analysis students. Click on the following links when you are ready to join:

Thursday 10am	Monday 10am
Friday 1pm	Tuesday 1pm

Although I will be happy to entertain any discussion that comes up, the focus of the first two meetings will be to discuss the reading. The second two will focus on the homework. Let me know if you would like to schedule a meeting at another time.

The main lecture this week is about the use of the singular value decomposition when visualizing high-dimensional data.

Data visualization

[PDF]

After watching this lecture, complete the data visualization lab. It concerns linguistics!

Assignment: Some old-fashioned homework problems: [PDF]. You should submit your solutions as a single .pdf file; a nice way to do create this document is to use a [phone scanner app]. Homework is due on Wednesday, May 6 at 5pm. Submit it via [GradeScope].

This week's office hours: I will hold office hours via Zoom this week. They are dedicated to this course, so you won't have to share time with pesky analysis students. Click on the following links when you are ready to join:

Thursday 10am	Monday 10am
Friday 1pm	Tuesday 1pm

Although I will be happy to entertain any discussion that comes up, the focus of the first two meetings will be to discuss the reading. The second two will focus on the homework. Let me know if you would like to schedule a meeting at another time.

The computational aspect of this class will be done in Google's Colaboratory. By way of an introduction, please read Google's own blurb about it: [link]. It is a wonderful platform to get your feet wet doing machine learning using Python. We will use it as the computational platform for our linear algebra course.

You will need a Google account. Computations occur in a notebook which can be simply saved as a Google Doc. Start by working through the following notebook which illustrates how do use Python for some basic computations in linear algebra:

You will need to save your own copy of the notebook obtained from the following link: go to File and then Save a copy to Drive. Modify it as you see fit!

Introduction to matrix computations

If you have not done so already, work through the COLAB and computational linear algebra tab above for an introduction to our software. You have seen a verion of this lab done in MATLAB at the beginning of the course. I have streamlined it and adapted it to Python. I like this setup a lot better! For your reference, here is a copy of the original lab document: [PDF]. The modern version of the lab is on Google's Colaboratory. Work through the following:

Sound compression using the Haar basis

Potential project: At the end of the new version of the lab, I introduce yet another basis, the discrete cosine basis. If you a looking for a fun final course project, work through this part, see whether it gives a better compression, and figure out where this basis comes from. Talk to me if you are interested; I can provide some more insight and resources.

We will use the SVD to denoise images. First watch this short video on low rank denoising

Low rank denoising
[PDF]

Then work through the following Python notebook: Image denoising using the SVD. I will ask you a question on an upcoming homework about this exercise, so keep your results handy!

A grayscale image composed of pixels can readily be interpreted as a matrix. For instance, consider the following 3 x 3 image:

If each shade of gray is represented by a number between 0 and 1 with the latter corresponding to white, the matrix on the right represents the image on the left. In general, images contain a certain level of structural redundancy. There are many ways in which this is manifested, but in the example above, all three columns of our matrix are multiples of one column vector:

$u = (2,1,2)^T$ .

In other words, the rank of our image is one, and we could recreate the picture by remembering the scalar factor by which $u$ needs to be multiplied by to recreate each of our columns. In this lab, we will focus on precisely this type of structural redundancy. While very few images have rank exactly equal to one, for many the rank is much smaller than the full rank of a matrix of the same size.

In this lab, we will focus on precisely this type of structural redundancy. Work thorough the following COLAB notebook. Have fun, and let me know if you run into trouble.

Image Compression Lab

Consider the following plot of height versus weight for the 2013 NFL roster:

Each point represents one player and is colored by his position; it is a beautiful summary of the physical requirements of each role. This data set is also fairly simple: each point has exactly two coordinates height and weight, and graphing it is straightforward.

Most data we have available today is high-dimensional, even thousands of coordinates for every point. To be able to visualize it is an important task and enlightening examples abound. Below is a visualization of the genetic makeup of Europeans from a famous study of John Novembre et. al.:

Properly represented, the distribution of these points convincingly recapitulates geography. Another example summarizes due to Olivier H. Beauchesne summarizes the voting records of Quebec politicians:

Each politician's voting record in a given year was used to generate a plot clearly distinguishing the political parties. So how do you visualize intinsically high-dimensional data in a two-dimensional plot?

There are a number of approaches, the most recent being t-SNE and UMAP. This goal of this lab is to introduce you to one of the most traditional and effective methods that follows directly from the singular value decomposition. For a longer introduction, read the lab handout: [PDF]. The details of the lab are in the following CoLab notebook:

Data Visualization Lab

As part of your final evaluation for the course, I would like you to complete a project that shows your mastery of some of the concepts covered in the course. You will hand in a written report of your work, the goal of which is to show

your mastery of knowledge of some component of linear algebra that we have studied this semester, as well as

your ability to build upon what we have learned, either by learning additional material beyond what we have covered or by applying techniques of linear algebra to pursue some further goal.

The following tabs suggest some projects you can pursue. However, you should not feel limited to choose only from the projects presented here, nor should you view the project descriptions as instructions that must be followed to the letter. This is your chance to follow an interest in linear algebra!

Instructions

The projects will be due on May 16th. Although you should work on these projects as part of a larger group, your final submission must be completed individually. You are encouraged to use any external resources that you deem appropriate (both animate as well as inanimate), but you must make sure to cite them.

A reasonable written project is expected to be in the neighborhood of ten single-spaced pages. Written work should demonstrate your mastery of the theoretical material as well as your ability to apply it to solve a problem. In particular, a significant fraction of your written work should include an exposition of the relevant theory that we have learned in this course, and in certain cases, that you have learned from other sources during your research.

As a general rule of thumb, you should aim to write a paper that could be read and understood by an outstanding student who has just completed a first course in linear algebra.

Suggested projects

First, a handout detailing a few possible projects, including a few non-computational ideas for you to consider [PDF] . Below are a few investigations that require a bit of computational machinery.

The motivation for this project is a question I posed for myself a long time ago and never had time to thoroughly investigate, so I am very interested in its outcome. Here's the basic idea:

Question: There are a number of different types of images we commonly find online: natural images, scans of documents, logos and art generated using software, cats, etc. Are some of them easier to compress than others?

Image compression relies on structural redundancy within images and it is successful if we have the ability to exploit it. An answer to this question would let us know whether different image types exhibit different levels of such redundancy. One limiting factor is that I have only taught you one method of compression, using low-rank approximations via the SVD so this project will not answer my question completely, but I suspect the results will nevertheless be interesting.

As with all projects, feel free to formulate your questions to investigate and use the above just as a starting point. To get you started, I enclose a Python notebook with a more comprehensive outline of the project.

Image compression

Superficially, this project is about sports. However, the mathematics involved can be used to answer questions about ranking in many different contexts, including politics, consumer preferences, and sociology. But let’s start with the language of football. Perennially each fall, arguments “rage” about the rankings of the best college football teams. While this is attractive from the perspective of sports radio talk show hosts, for whom this discussion supplies a steady source of employment, it is unsettling for mathematicians. A classical paper by James P. Kenner proposed a scheme for ranking college football teams using Perron-Frobenius theory. Your job is to investigate his work and apply it to a data set of your choice.

Kenner's paper is available here [PDF]. If you look around, there are more recent articles describing similar approaches, including this one [PDF]. Read them with a focus on the linear algebra involved.

Choose a set of data to analyze. It may be based on records of sports teams, but you can think more broadly. Any data resulting from uneven paired competition (where not all pairs compete) requiring ranking the competitors is ripe for analysis by these methods. Be creative! Your ultimate goal will be a ranking of the competitors.

Your write up should include a detailed discussion of the underlying mathematics, including a discussion of the Perron-Frobenius theory central in this work.

For reference, here is our original lab: [PDF]. I also wrote a short COLAB notebook with some useful commands and a bare-bones outline of the project: Ranking and the Perron-Frobenius Theorem.

The goal of this project is to analyze a set of three games using Markov chains. The results will be surprising. Although our approach superficially uses the games for the less-than-honorable purpose of gambling, they form the basis of the biological phenomenon of Brownian ratchets. Without further ado, consider the following two games:

Game A: We will toss an unfair coin

I win with probability 0.5 + $\epsilon.$ You win with probability 0.5 – $\epsilon.$

Winner gets a $1 from the loser.

Game B: Let M be the amount of money in your pocket and assume this is an integer. We will toss two coins: If M is divisible by 3, we will flip Coin 1, otherwise we will flip Coin 2.

Coin 1: I win with probability 0.9 + $\epsilon.$

Coin 2: I win with probability 0.25 + $\epsilon.$

Again, at each turn the loser pays the winner $1.

The final game is an innocuous sounding combination of the previous two games. Simply stated:

Game C: Randomly alternate playing Game A and Game B.

Follow this outline in analyzing this sequence of games. We first focus on the first two:

Simulate both Game A and Game B in CoLab. For each trial, flip a coin 100 times and keep track of your aggregate winnings. Repeat this for 10,000 trials, average your results, and plot them. If someone proposes to play these games with you, should you accept?
Use basic probability to analyze the two games. How much money do you expect to win or lose at each turn? Does this agree with your simulations?
Game B can be modeled by a Markov chain. There are three states, each corresponding to the possible remainders when M, the amount of money in your pocket, is divided by 3. Find the transition probabilities of your Markov chain and its steady-state vector. Do your results change your analysis of Game B above?

Of course, the next step is to analyze Game C. Follow the same procedure as above, first simulating the game and then using a three-state Markov chain to analyze it. What is going on?

In your write up, include clear and detailed explanation of the mathematical underpinnings of your analysis. Included in your work should be a discussion of Markov chains, the Perron-Frobenius Theorem, and discrete dynamical systems. There are a number of references available that you can consult, including the original work of Juan Parrondo. You are encouraged to expand the scope of this project. It spawns a variety of further questions. To ease the pain, sample code for this project on CoLab.

Parrondo's Games.

1.	[PROBLEMS]	[SOLUTIONS]
2.	[PROBLEMS]	[SOLUTIONS]
3.	[PROBLEMS]	[SOLUTIONS]
4.	[PROBLEMS]	[SOLUTIONS]
5.	[PROBLEMS]	[SOLUTIONS]
6.	[PROBLEMS]	[SOLUTIONS]
7.	[PROBLEMS]	[SOLUTIONS]
8.	[PROBLEMS]	[SOLUTIONS]
9.	[PROBLEMS]	[SOLUTIONS]
10.	[PROBLEMS]	[SOLUTIONS]
11.	[PROBLEMS]	[SOLUTIONS]
12.	[PROBLEMS]	[SOLUTIONS]
13.	[PROBLEMS]	[SOLUTIONS]

Introduction to MATLAB	[HANDOUT]
Lab 1: The Haar Wavelet Basis	[HANDOUT]	[MATLAB]
Lab 2: The PageRank Algorithm	[HANDOUT]

Organization
Syllabus		[HANDOUT]
Cover Sheet		[HANDOUT]
Motivation
Course Themes		[SLIDES]