Chapter 1: Introduction

K-Nearest Neighbours is one of the oldest machine learning algorithms. It can be used for both classification and regression, and can also be used as a meta-learner. The intuition behind K-Nearest Neighbours is one of the core assumptions in machine learning: 

“Data points that are similar to each other will also behave similarly.”

In the real world, you would guess, for instance, that two houses that are very similar to each other will sell for approximately the same price, or that two e-mails that are very similar to each other are equally likely to be spam.

Imagine a scatter plot in which a gray point is surrounded by red points: it seems natural to classify the gray point as red. This reasoning can be summarized as “a point will have the class that most of its neighbours have”. Similarly, if each of these points were assigned some numeric value, it would make sense to assign the gray point a value similar to the values of its neighbours.

This simple intuition is what K-Nearest Neighbours uses for both classification and regression!
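To make this concrete, here is a minimal sketch of the idea in Python. The function name `knn_predict`, the parameter `k`, and the toy data are illustrative choices for this sketch, not part of any specific library:

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3, task="classification"):
    # Euclidean distance from the query point to every training point.
    distances = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k nearest training points.
    nearest = np.argsort(distances)[:k]
    if task == "classification":
        # Majority vote: the class that most of the neighbours have.
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        return labels[np.argmax(counts)]
    # Regression: a value similar to the neighbours' values (their mean).
    return y_train[nearest].mean()

# Toy example: a gray query point surrounded by three "red" points.
X_train = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [5.0, 5.0]])
y_train = np.array(["red", "red", "red", "blue"])
print(knn_predict(X_train, y_train, np.array([1.1, 1.0])))  # -> red
```

Note that this sketch captures only the intuition: with k = 3, the three nearest neighbours of the query point are all red, so the majority vote returns red, exactly as in the scenario described above.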