# Random Fact #3

## Today’s Fact: Linear Discriminant Analysis

Linear Discriminant Analysis, or LDA for short, is an ML model used in classification. Specifically, for a model with `d` features and `n` training points, it assumes that each class's `d` features are jointly Normally (Gaussian) distributed, with the same covariance shared across all classes. More formally, taken from my professor's cs189 notes, we get:
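In symbols, the standard LDA modeling assumption (my paraphrase of the usual statement, not a verbatim quote from the notes) is:

```latex
X \mid y = c \;\sim\; \mathcal{N}(\mu_c, \Sigma), \qquad c = 1, \dots, K
```

Here `μ_c` is class `c`'s mean and `Σ` is the single covariance matrix shared by all `K` classes; the shared `Σ` is what makes the decision boundaries linear.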

This doesn't help much for those who are unfamiliar with Machine Learning, so he also included helpful diagrams. Below is a two-class classification (imagine your input is either of class `A` or class `B`, and you're deciding which class it's supposed to be in). The decision function (the breaking point between choosing `A` versus `B`) is the black line; values less than 0.5 indicate a stronger guess for the left class, and values greater than 0.5 indicate a stronger guess for the right class. Below that, we have a multi-class problem, where each class's boundaries are marked off by black lines:
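As an aside on why that boundary is a straight line (this is the general LDA result, not quoted from the notes): with a shared covariance, the posterior probability of class `A` works out to a logistic function of a *linear* score in `x`, so the 0.5 level set is exactly a hyperplane:

```latex
P(y = A \mid x) = \frac{1}{1 + e^{-(w^{\top}x + b)}},
\qquad w = \Sigma^{-1}(\mu_A - \mu_B),
\qquad b = \ln\frac{\pi_A}{\pi_B} - \tfrac{1}{2}\!\left(\mu_A^{\top}\Sigma^{-1}\mu_A - \mu_B^{\top}\Sigma^{-1}\mu_B\right)
```

where `π_A` and `π_B` are the class priors.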

When Andi saw this, he immediately thought of k-means clustering, which is a type of unsupervised learning meant to cluster similar data without knowing their actual labels. K-means often uses similar Voronoi diagrams to represent its output, so he raised a question: how does k-means compare to LDA, given that they are fundamentally quite different (supervised vs. unsupervised, classification vs. grouping)? And so we simulated it.

### Simulation

First, we examined the case of Normally distributed clusters that are decently spread out from each other, like so:
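A minimal sketch of how data like this could be generated, assuming scikit-learn (the centers and spread below are illustrative guesses, not the exact values we used):

```python
from sklearn.datasets import make_blobs

# Two well-separated 2D Gaussian clusters (illustrative parameters).
X, y = make_blobs(
    n_samples=500,
    centers=[(-5.0, 0.0), (5.0, 0.0)],
    cluster_std=1.0,
    random_state=0,
)
```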

Here, we see that LDA and k-means (somewhat surprisingly!) performed about the same:
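In code, the comparison can be run along these lines (a sketch that assumes the `X, y` generated above; since k-means assigns cluster IDs in an arbitrary order, we measure agreement with the adjusted Rand index rather than raw accuracy):

```python
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import adjusted_rand_score

# LDA is supervised: it fits one Gaussian per class (shared covariance)
# using the true labels y.
lda = LinearDiscriminantAnalysis().fit(X, y)

# k-means is unsupervised: it only ever sees the points themselves.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
km_labels = km.fit_predict(X)

# Adjusted Rand index: 1.0 means perfect agreement with the true
# classes, values near 0 mean no better than chance.
print("LDA vs truth:    ", adjusted_rand_score(y, lda.predict(X)))
print("k-means vs truth:", adjusted_rand_score(y, km_labels))
```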

Next, we looked at the case where the data overlap, like so:

We see that the results are different this time (!):
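The same sketch covers the overlapping case if we simply push the clusters together (again with illustrative parameters, re-using the imports from above):

```python
# Heavily overlapping clusters: close centers, large spread.
X2, y2 = make_blobs(
    n_samples=500,
    centers=[(-1.0, 0.0), (1.0, 0.0)],
    cluster_std=2.5,
    random_state=0,
)

# LDA still gets the true labels during fitting; k-means still does not.
lda2 = LinearDiscriminantAnalysis().fit(X2, y2)
km2_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X2)

print("LDA vs truth:    ", adjusted_rand_score(y2, lda2.predict(X2)))
print("k-means vs truth:", adjusted_rand_score(y2, km2_labels))
```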

Notice that in the second case, where the data overlap, *LDA still learns a distinction between class Yellow and class Purple*. K-means completely lost it there, since it has no idea *what the labels are supposed to be*. This is expected, since k-means is meant to group similar data, not to recover predefined class labels. We thought it was quite interesting that k-means performed as well as it did in part 1.