
# Chapter 4: K-Medoids

In order to use any distance measure and still ensure convergence, one can use K-Medoids, which can be viewed as the most general version of K-Means.

K-Medoids will converge regardless of the distance measure used, so in this example, we’ll use cosine distance.
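Cosine distance treats two vectors as close when they point in the same direction, regardless of their magnitude. A minimal sketch of such a function (the actual implementation in `utils.distances` may differ):

```python
import numpy as np

def cosine(a, b):
    # Cosine distance: 1 minus the cosine similarity of the two vectors.
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

Note that two vectors on the same ray (e.g. `[1, 2]` and `[2, 4]`) have a cosine distance of 0, even though they are different points.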

```python
import numpy as np
import copy

from unsupervised.kmeans import KMeans
from utils.distances import pdist, euclidean, manhattan, cosine


class KMedoids(KMeans):

    def _distance(self, a, b):
        return cosine(a, b)
```
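The imported `pdist` computes the full pairwise distance matrix for a dataset given a distance callable. A sketch of what it might look like (the real `utils.distances.pdist` may be implemented differently):

```python
import numpy as np

def pdist(X, distance):
    # Pairwise distance matrix: entry (i, j) holds distance(X[i], X[j]).
    # The matrix is symmetric, so each pair is computed only once.
    n = len(X)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = distance(X[i], X[j])
    return D
```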

Like K-Medians, K-Medoids requires a slightly different way of defining the centroids. K-Medoids defines a centroid, in this case called a medoid, as the most central point in a cluster. In practice, this means that every data point is a potential centroid, and at each step the point that minimizes the inertia (the within-cluster sum of distances) is picked.

```python
    def _compute_centroids(self, X, labels):
        # Cache the full pairwise distance matrix on the first call;
        # the data points never change, only the labels do.
        if not hasattr(self, "distances"):
            self.distances = pdist(X, self._distance)

        centroids = []

        for i in range(self.k):
            # Pairwise distances restricted to the points in cluster i.
            distances = self.distances[np.ix_(labels == i, labels == i)]
            within_cluster_sum_of_distances = np.sum(distances, axis=0)

            # The medoid is the cluster point with the smallest summed
            # distance to every other point in the cluster.
            centroid = X[labels == i][np.argmin(within_cluster_sum_of_distances)]
            centroids.append(centroid)

        return np.array(centroids)
```

And like K-Medians, we don’t need to define the rest of the logic, as it is exactly the same logic used by K-Means!
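To see the whole algorithm in one place, here is a self-contained sketch on toy data. It is standalone rather than built on the `KMeans` base class (whose internals are not shown in this chapter), so the `kmedoids` function and its parameters are illustrative:

```python
import numpy as np

def cosine(a, b):
    # Cosine distance, as used by KMedoids above.
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def kmedoids(X, k, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)

    # Precompute all pairwise distances once.
    D = np.array([[cosine(X[i], X[j]) for j in range(n)] for i in range(n)])

    # Initialize medoids as k distinct data point indices.
    medoids = rng.choice(n, size=k, replace=False)

    for _ in range(n_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = np.argmin(D[:, medoids], axis=1)

        # Update step: each cluster's new medoid is the member point
        # with the smallest summed distance to the rest of the cluster.
        new_medoids = medoids.copy()
        for i in range(k):
            idx = np.flatnonzero(labels == i)
            if len(idx) == 0:
                continue
            within = D[np.ix_(idx, idx)].sum(axis=0)
            new_medoids[i] = idx[np.argmin(within)]

        if np.array_equal(new_medoids, medoids):
            break  # converged: medoids stopped moving
        medoids = new_medoids

    labels = np.argmin(D[:, medoids], axis=1)
    return X[medoids], labels
```

On four points split between two directions, e.g. near `[1, 0]` and near `[0, 1]`, the two direction groups end up in separate clusters, and each returned center is one of the original data points.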