Chapter 3: Summary


  • Easy to implement
  • Can perform well even with limited data
  • Can be used for classification, regression, or as a meta-learner
  • Lazy learner, so easy to update


  • Simplistic (linear boundaries)
  • Computing distances can be expensive
  • Appropriate similarity measure has to be picked
  • Predictions can be slow

Why is KNN called a lazy learner?

A lazy learner is a model that only learns when it is asked to make a prediction. KNN is a lazy learner because as we’ve shown, the inference happens in the predict method as opposed to the fit method.

What is the best value for \(K\)?

\(K\) is a hyperparameter and picking the right one is highly dependent on the task at hand. Typically, only odd numbers are chosen to avoid ties, and because of Occam’s razor, it is often assumed that smaller values of  \(K\) are preferable. Sometimes, a rule of thumb of choosing  \(K=\frac{n}{\sqrt{2}}\) is proposed if there are \(n\) data points, but there is no strong theory behind it.

What distance measure if best?

Again, picking an appropriate distance measure can make a huge difference with KNN. Since it is a lazy learner, and therefore doesn’t require training time, it is often possible
to plainly compare different distance measures and pick the best one.

Why is weighting used?

Weighting the neighbours can turn out to be useful to increase the performance, but it is always wise to experiment with and without weighting to see what performs best. In our case, we gave higher weights to the closest neighbours, but more complex weighting schemes can be used.

Why is KNN a meta-learner?

KNN is a meta-learner in the sense that the voting scheme can be made as complex as you want. If you’d like to replace the majority voting scheme explained in this tutorial, by a deep neural network, it is entirely possible. It is considered a meta-learner because it learns “how to learn” by selecting the instances on which a decision has to then be made.

Are there other ways to find the closest neighbours?

The K-nearest neighbours according to a certain distance measure will always be the same, but there exist many different algorithms to find them efficiently. A famous one involves storing the data in a KD-tree. In this tutorial, we used brute force, but in some cases, this might take too much time.