Gaussian Mixture Models
K-means uses Euclidean distance and assumes that clusters are spherical (isotropic), with the same variance across all dimensions.
In effect, it places a circle or sphere around each centroid.
- What if the clusters are elliptical? K-Means fails with elliptical clusters.
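
A minimal sketch of this failure, using made-up cluster parameters (the means, variances, and query point below are illustrative assumptions, not taken from any dataset). It compares a K-means-style nearest-centroid rule with a Gaussian density that accounts for per-feature variance:

```python
import math

def euclidean(p, q):
    """Straight-line distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def axis_aligned_gaussian_pdf(p, mean, var):
    """Density of a 2D Gaussian whose features are independent (diagonal covariance)."""
    density = 1.0
    for x, m, v in zip(p, mean, var):
        density *= math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
    return density

# Hypothetical clusters: A is stretched along the x-axis, B is small and round.
clusters = {
    "A": {"mean": (0.0, 0.0), "var": (9.0, 0.25)},
    "B": {"mean": (5.0, 0.0), "var": (0.25, 0.25)},
}

# A point that lies well inside A's long axis but is closer to B's centroid.
point = (3.0, 0.0)

# K-means-style rule: pick the closest centroid, ignoring cluster shape.
nearest = min(clusters, key=lambda k: euclidean(point, clusters[k]["mean"]))

# Variance-aware rule: pick the cluster under which the point is most probable.
most_likely = max(
    clusters,
    key=lambda k: axis_aligned_gaussian_pdf(point, clusters[k]["mean"], clusters[k]["var"]),
)

print("nearest centroid:", nearest)
print("highest density:", most_likely)
```

The two rules disagree for this point: the nearest centroid belongs to the round cluster, while the density that knows about the stretched shape favors the elongated one.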

GMM: ‘Probabilistic evolution’ of K-Means
GMM provides soft assignments and can model elliptical clusters by accounting for variance and correlation between features.
Note: GMM assumes that all data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters.
GMM can Model Elliptical Clusters.

A mixture is a ‘combination of probability distributions’.
Soft Assignment: Instead of a simple ‘yes’ or ‘no’ for cluster membership, a data point is assigned a set of probabilities, one for each cluster.
e.g., a data point might have a 60% probability of belonging to cluster ‘A’, a 30% probability for cluster ‘B’, and a 10% probability for cluster ‘C’.
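
A minimal sketch of how such membership probabilities can be computed for a one-dimensional point, assuming the mixture parameters are already known. The weights, means, and standard deviations below are hypothetical values chosen for illustration; in practice they would be estimated from data.

```python
import math

def gaussian_pdf(x, mean, std):
    """Univariate Gaussian density at x."""
    coeff = 1.0 / math.sqrt(2 * math.pi * std ** 2)
    return coeff * math.exp(-((x - mean) ** 2) / (2 * std ** 2))

def responsibilities(x, components):
    """Return one membership probability per component for the point x.

    components: list of (weight, mean, std) tuples, with weights summing to 1.
    """
    # Weighted density of x under each component.
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(weighted)
    # Normalize so the probabilities across components sum to 1.
    return [v / total for v in weighted]

# Hypothetical clusters 'A', 'B', 'C' as (weight, mean, std).
components = [(0.5, 0.0, 1.0), (0.3, 3.0, 1.0), (0.2, 6.0, 1.5)]

probs = responsibilities(1.2, components)
print([round(p, 3) for p in probs])
```

Each entry of `probs` is the soft assignment of the point to one cluster, and the entries always sum to 1.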
Gaussian Mixture Model Example:

Note: The term \(1/(\sqrt{2\pi \sigma ^{2}})\) is a normalization constant to ensure the total area under the curve = 1.
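
For reference, the univariate Gaussian density that this constant belongs to, and the mixture density built from it, can be written in standard notation as follows. The symbols \(\pi_k\), \(\mu_k\), \(\sigma_k\) and \(K\) (mixing weight, mean, standard deviation, and number of components) are assumed here, not taken from the text above.

\[
\mathcal{N}(x \mid \mu, \sigma^{2}) = \frac{1}{\sqrt{2\pi \sigma^{2}}} \exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)
\]

\[
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \sigma_k^{2}), \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1
\]

Because each component integrates to 1 and the weights sum to 1, the mixture density also integrates to 1.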
Multivariate Gaussian Example:

In a multivariate Gaussian, the variables may be independent or correlated.
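
In standard notation (the symbols \(\boldsymbol{\mu}\), \(\Sigma\) and \(d\) are assumed here for the mean vector, covariance matrix, and number of dimensions), the multivariate Gaussian density is:

\[
\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \Sigma) = \frac{1}{(2\pi)^{d/2} \, |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^{\top} \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})\right)
\]

For two features, the covariance matrix has the form:

\[
\Sigma = \begin{pmatrix} \sigma_1^{2} & \sigma_{12} \\ \sigma_{12} & \sigma_2^{2} \end{pmatrix}
\]

When \(\sigma_{12} = 0\), the two variables are independent and the contours of equal density are axis-aligned ellipses (circles if \(\sigma_1 = \sigma_2\)). A nonzero \(\sigma_{12}\) means the variables are correlated, which tilts the ellipses.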
Feature Covariance:

Gaussian Mixture with PDF

Gaussian Mixture (2D)
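
A minimal sketch of evaluating a 2D Gaussian mixture density with a full covariance matrix, using only the standard library. The component weights, means, and covariances are hypothetical values chosen for illustration.

```python
import math

def gaussian_2d_pdf(point, mean, cov):
    """Density of a 2D Gaussian with covariance [[sxx, sxy], [sxy, syy]]."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy  # determinant of the covariance matrix
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form d^T Sigma^{-1} d, using the closed-form 2x2 inverse.
    quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

def mixture_pdf(point, components):
    """Weighted sum of component densities; components are (weight, mean, cov)."""
    return sum(w * gaussian_2d_pdf(point, m, c) for w, m, c in components)

# Hypothetical mixture: one tilted (correlated) component, one round component.
components = [
    (0.6, (0.0, 0.0), [[2.0, 1.5], [1.5, 2.0]]),  # positively correlated features
    (0.4, (6.0, 0.0), [[1.0, 0.0], [0.0, 1.0]]),  # independent features
]

# Two points at the same Euclidean distance from the first component's mean.
along = (1.0, 1.0)    # lies along the direction of positive correlation
across = (1.0, -1.0)  # lies against it

print("density along the ellipse: ", mixture_pdf(along, components))
print("density across the ellipse:", mixture_pdf(across, components))
```

Although both points are equally far from the first mean, the density is higher along the correlated direction, which is the elliptical shape that a distance-only rule cannot express.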
