Algorithm steps of K-means
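First, choose K initial centroids, where K is a user-specified parameter giving the desired number of clusters. Each point is then assigned to its nearest centroid, and the set of points assigned to one centroid forms a cluster. Next, the centroid of each cluster is updated from the points assigned to it. The assignment and update steps are repeated until the clusters no longer change.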
- Step 1:
- Initialize: Choose the number of clusters K and randomly initialize K cluster centroids.
- Step 2:
- Assign instances: Assign each instance (data point) to the nearest centroid according to the distance metric (usually Euclidean distance).
- Step 3:
- Update centroids: Recompute each centroid as the mean of all instances assigned to it.
- Step 4:
- Repeat: Repeat steps 2 and 3 until the centroids no longer change.
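The four steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation. The names `kmeans`, `squared_distance`, and `mean_point` are hypothetical, and several choices are assumptions not stated in the notes: initial centroids are sampled from the data points, an empty cluster keeps its previous centroid, and a `max_iter` cap guards against very long runs.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    """Return (centroids, labels) for a list of points given as tuples."""
    rng = random.Random(seed)

    # Step 1: initialize by picking K distinct data points as centroids
    # (assumption: the notes only say "randomly initialize").
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = []

    for _ in range(max_iter):
        # Step 2: assign each point to its nearest centroid.
        # Squared distance gives the same nearest centroid as Euclidean distance.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]

        # Step 3: recompute each centroid as the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(mean_point(members) if members else centroids[j])

        # Step 4: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    return centroids, labels
```

Calling `kmeans(points, k=2)` on a list of 2-D tuples returns the final centroids together with one cluster index per point. Because the result depends on the random initialization, different `seed` values can lead to different clusterings on less clearly separated data.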