Clustering Quality Metrics
How to Evaluate Quality of Clustering?
- 👉 Elbow Method: Quickest to compute; good for initial EDA (Exploratory Data Analysis).
- 👉 Dunn Index: Focuses on the ‘gap’ between the closest clusters.
- 👉 Silhouette Score: Balances compactness and separation.
- 👉 Domain-specific knowledge and system constraints.
Elbow Method
⭐️A heuristic for choosing the number of clusters (k): plot a measure of clustering quality, typically the within-cluster sum of squares (WCSS), against ‘k’ and see how it improves as ‘k’ increases.
🎯The goal is to find the value of ‘k’ beyond which adding more clusters yields diminishing returns in variance reduction. That bend in the curve is the ‘elbow’.
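The idea can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a production k-means: the helper names (`farthest_first_centers`, `kmeans_wcss`), the farthest-first initialisation, and the toy `points` are all assumptions made for this example. It computes the WCSS for several values of ‘k’ so the curve can be inspected for a bend.

```python
from math import dist

def farthest_first_centers(points, k):
    """Deterministic init (an assumption of this sketch): start at points[0],
    then repeatedly add the point farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))

def kmeans_wcss(points, k, max_iter=100):
    """Run a toy Lloyd's k-means and return the within-cluster sum of squares."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = mean(members)
    return sum(min(dist(p, c) for c in centers) ** 2 for p in points)

# Hypothetical toy data: three well-separated 2x2 blobs.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]

wcss_by_k = {k: kmeans_wcss(points, k) for k in range(1, 7)}
for k, wcss in wcss_by_k.items():
    print(f"k={k}: WCSS={wcss:.2f}")
```

On this toy data the WCSS should fall steeply up to k = 3 and change little afterwards, which is the elbow. On real data the bend is often less pronounced, so the plot is best read alongside the other metrics in this section.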

Dunn Index \([0, \infty)\)
⭐️A clustering quality metric that compares separation (between clusters) with compactness (within clusters).
Note: A higher Dunn Index value indicates better clustering, meaning clusters are well-separated from each other and compact.
👉Dunn Index Formula:
\[DI = \frac{\text{Minimum Inter-Cluster Distance (between different clusters)}}{\text{Maximum Intra-Cluster Distance (within a cluster)}}\]
\[DI = \frac{\min_{1 \le i < j \le k} \delta(C_i, C_j)}{\max_{1 \le l \le k} \Delta(C_l)}\]
👉Let’s understand the terms in the above formula:
\(\delta(C_i, C_j)\) (Inter-Cluster Distance):
- Measures how ‘far apart’ the clusters are.
- Distance between the two closest points of different clusters (Single-Linkage distance). \[\delta(C_i, C_j) = \min_{x \in C_i, y \in C_j} d(x, y)\]
\(\Delta(C_l)\) (Intra-Cluster Diameter):
- Measures how ‘spread out’ a cluster is.
- Distance between the two furthest points within the same cluster (Complete-Linkage distance). \[\Delta(C_l) = \max_{x, y \in C_l} d(x, y)\]
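The two terms and their ratio translate directly into code. The following is a small sketch using only the Python standard library and Euclidean distance for \(d(x, y)\); the function names and the example clusters are hypothetical, chosen so that the arithmetic is easy to check by hand.

```python
from itertools import combinations
from math import dist

def inter_cluster_distance(a, b):
    """delta(C_i, C_j): distance between the two closest points of two clusters."""
    return min(dist(x, y) for x in a for y in b)

def diameter(cluster):
    """Delta(C_l): distance between the two farthest points inside one cluster.
    A single-point cluster has diameter 0."""
    return max((dist(x, y) for x, y in combinations(cluster, 2)), default=0.0)

def dunn_index(clusters):
    """Minimum inter-cluster distance divided by maximum cluster diameter."""
    if len(clusters) < 2:
        raise ValueError("the Dunn Index needs at least two clusters")
    min_separation = min(
        inter_cluster_distance(a, b) for a, b in combinations(clusters, 2)
    )
    max_diameter = max(diameter(c) for c in clusters)
    if max_diameter == 0:
        # Assumption of this sketch: treat the all-singleton case as undefined.
        raise ValueError("all clusters have zero diameter; the ratio is undefined")
    return min_separation / max_diameter

# Hypothetical clusters: each has diameter 5, and the closest pair
# across clusters, (3, 4) and (13, 4), is 10 apart.
cluster_a = [(0, 0), (3, 4)]
cluster_b = [(13, 4), (16, 8)]

di = dunn_index([cluster_a, cluster_b])
print(di)  # 10 / 5 = 2.0
```

Both helpers compare every pair of points, so the cost grows quadratically with the number of points. That is fine for a sketch, but it is one reason the Dunn Index is expensive on large datasets.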
Measure of Closeness
- Single Linkage (MIN): Uses the minimum distance between any two points in different clusters.
- Complete Linkage (MAX): Uses the maximum distance between any two points. In the Dunn Index it is applied within a single cluster, which gives that cluster's diameter.