Big Data Algorithms: The K-Nearest Neighbour Classification Algorithm Explained

Free Webinar | Thursday, March 10th, 2022 | 10:00 AM Singapore

Webinar Description

The K-Nearest Neighbour classification algorithm (abbreviated to k-NN) is one of the most basic and frequently used algorithms for classification purposes. The k-NN classifier is applied to datasets in which a variable (column) indicates a target class, so that a new data point can be assigned to one of these target classes.

k-NN is a non-parametric algorithm, which means that the classifier makes no assumptions about the underlying data distribution. More simply stated, the model structure is determined by the data itself, which allows the k-NN algorithm to be used in a wide variety of situations.

The k-NN algorithm looks at the k nearest neighbours (where k is specified by the user) and classifies a new data point according to the classes of those closest neighbours. To determine which data points are “nearest,” the algorithm calculates the distance (typically Euclidean) from the new data point to every existing data point and selects the k points with the smallest distances.
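To make the idea concrete, here is a minimal sketch of this distance-and-vote procedure in plain Python. The toy data points and labels are invented for illustration; they are not part of the webinar material.

```python
from collections import Counter
import math

# Toy training set: (feature vector, class label). Illustrative data only.
training = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((3.0, 4.0), "B"),
    ((5.0, 7.0), "B"),
    ((3.5, 4.5), "B"),
]

def classify(point, data, k):
    """Label `point` by majority vote among its k nearest neighbours."""
    # Sort stored points by Euclidean distance to the new point.
    nearest = sorted(data, key=lambda item: math.dist(point, item[0]))
    # Count the class labels of the k closest points and take the majority.
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

print(classify((1.2, 1.5), training, k=3))  # → A
```

With k=3, the two nearest neighbours of (1.2, 1.5) are labelled "A" and the third "B", so the majority vote assigns class "A".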

Learn the Algorithm

The Big Data Algorithm webinars consist of two main components. In the first section of this webinar, you will learn the fundamental theory of the K-Nearest Neighbour classification algorithm. We will take a closer look at the inner workings of the algorithm and the mathematical rules the K-Nearest Neighbour classifier follows to make its predictions.

The second part of the webinar is a practical demonstration of the algorithm at work. We will use standardized Python or R scripts to show you how to build a K-Nearest Neighbour classifier yourself. All data, scripts and examples will be provided to participants who sign up for this webinar.