Unsupervised learning: group similar data points into clusters
Using K-Means++ initialization
K-Means++ is an initialization algorithm that tends to select initial centroids far apart from each other. This leads to faster convergence and better final clusters.
Algorithm: The first centroid is chosen uniformly at random from the data points. Each subsequent centroid is a data point selected with probability proportional to the square of its distance from the nearest already-chosen centroid.
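The seeding steps described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the names `squared_distance` and `kmeans_pp_init` are hypothetical, points are assumed to be tuples of numbers, and only the standard library is used.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_init(points, k, rng=None):
    """Pick k initial centroids from `points` using K-Means++ seeding.

    Hypothetical helper for illustration; `rng` is an optional
    random.Random instance so that runs can be made reproducible.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = rng or random.Random()

    # Step 1: the first centroid is a data point chosen uniformly at random.
    centroids = [rng.choice(points)]

    # d2[i] = squared distance from points[i] to its nearest chosen centroid.
    d2 = [squared_distance(p, centroids[0]) for p in points]

    while len(centroids) < k:
        total = sum(d2)
        if total == 0:
            # Every point coincides with a chosen centroid; fall back to uniform.
            nxt = rng.choice(points)
        else:
            # Step 2: sample a point with probability d2[i] / total.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centroids.append(nxt)
        # Refresh nearest-centroid distances to account for the new centroid.
        d2 = [min(old, squared_distance(p, nxt)) for old, p in zip(d2, points)]

    return centroids
```

A call such as `kmeans_pp_init(points, 3, random.Random(0))` returns three of the input points. Points that are already centroids have weight zero, so they are not picked again unless every point coincides with a centroid.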
Find a good k by plotting inertia against k and looking for the "elbow", the point where adding more clusters stops reducing inertia sharply
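One way to produce the inertia curve is to run K-Means for a range of k values and record the final inertia of each run. The sketch below assumes a basic Lloyd's iteration with random initialization for brevity (K-Means++ seeding could be swapped in). The function names `kmeans_inertia` and `inertia_curve`, the `restarts` parameter, and the toy data are all hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_inertia(points, k, iters=50, seed=0):
    """Run a basic Lloyd's K-Means and return the final inertia."""
    rng = random.Random(seed)
    # Random initialization: k distinct data points serve as starting centroids.
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    # Inertia: sum of squared distances to the nearest centroid.
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def inertia_curve(points, k_values, restarts=5):
    """Lowest inertia found over a few restarts, for each k."""
    return {
        k: min(kmeans_inertia(points, k, seed=s) for s in range(restarts))
        for k in k_values
    }


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical toy data: three separated 2-D blobs of 30 points each.
    centers = [(0, 0), (10, 0), (5, 10)]
    points = [
        (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
        for cx, cy in centers
        for _ in range(30)
    ]
    for k, inertia in inertia_curve(points, range(1, 7)).items():
        print(f"k={k}: inertia={inertia:.1f}")
```

Plotting the printed values against k and picking the point where the curve flattens gives a candidate k. The elbow is a heuristic and can be ambiguous on data without clear cluster structure.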
Type: Unsupervised Learning
Goal: Minimize within-cluster variance
Init Methods: Random, K-Means++
Complexity: O(n × k × i × d) where n = points, k = clusters, i = iterations, d = dimensions
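K-Means++ Advantage: Typically reduces the iterations needed and yields lower final inertia than random initialization. The size of the gain depends on the data; the seeding guarantees an expected inertia within an O(log k) factor of the optimum.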
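The goal listed above, minimizing within-cluster variance, is usually written as the inertia (within-cluster sum of squares), which is the quantity the elbow curve tracks. The symbols below are notation assumed here for illustration: C_j is the set of points assigned to cluster j, and \mu_j is its centroid.

```latex
J = \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x \in C_j} x
```

Each K-Means iteration alternates between assigning every point to its nearest centroid and recomputing each \mu_j as the mean of its cluster. Neither step can increase J.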