Machine Learning Cheat Sheet: K Nearest Neighbors Algorithm

Shashwat Tiwari
Published in Analytics Vidhya
3 min read · Nov 12, 2020

Hello Everyone 👋,

Machine learning and its applications are advancing every day, and it is becoming hard to keep the basic concepts fresh in memory.

Hence, I am introducing the Machine Learning Algorithm Cheat Sheet series, in which we will recall the core concepts behind machine learning algorithms. It should help you in data science interviews and projects.

Each post is a point-by-point explanation for quick revision and understanding of a machine learning algorithm.

So hold tight...

K Nearest Neighbour Algorithm

“Birds of a feather flock together.”

  • KNN is a simple machine learning algorithm that falls under supervised learning. It can be used for both classification and regression problems, but in industry it is used mostly for classification.
  • KNN assumes that similar data points lie close to each other. It assigns a new data point to the category whose available cases it most resembles.
  • A case is classified by a majority vote of its neighbours: it is assigned to the class most common among its K nearest neighbours, as measured by a distance function.
  • KNN stores all the available data and classifies a new data point based on its similarity to the stored points. When new data appears, it can be placed directly into the best-suited category.
  • For continuous features, a distance function such as Euclidean or Manhattan distance is typically used; for categorical features, Hamming distance is the usual choice.
  • To select the right value of K, run the algorithm several times with different values of K and keep the one that gives the fewest errors on held-out test data, so that the model generalizes well and makes better predictions.
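
To make the points above concrete, here is a minimal sketch in plain Python. It is not a reference implementation: the function names (`euclidean`, `hamming`, `knn_classify`, `leave_one_out_accuracy`) and the toy data are made up for illustration, and leave-one-out is only one of several reasonable ways to compare values of K.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two numeric feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hamming(a, b):
    """Number of positions at which two categorical vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_classify(train_X, train_y, query, k, distance=euclidean):
    """Predict the label of `query` by majority vote among its k nearest points."""
    # Sort training indices by distance to the query and keep the k closest.
    nearest = sorted(range(len(train_X)),
                     key=lambda i: distance(train_X[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

def leave_one_out_accuracy(X, y, k, distance=euclidean):
    """Fraction of points classified correctly when each is held out in turn."""
    correct = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if knn_classify(rest_X, rest_y, X[i], k, distance) == y[i]:
            correct += 1
    return correct / len(X)

# Toy data (made up for illustration): two well-separated clusters.
X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
y = ["red", "red", "red", "blue", "blue", "blue"]

# Classify a new point that sits next to the "red" cluster.
prediction = knn_classify(X, y, (2.0, 2.0), k=3)

# Try several values of K and keep the one with the best held-out accuracy.
scores = {k: leave_one_out_accuracy(X, y, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
```

On this tiny dataset a K that is too large lets the other cluster outvote the true neighbours, which is why K is tuned on held-out data. For categorical features, pass `distance=hamming` instead of the default.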

Advantages of KNN Algorithm

  • Simple and easy to implement.
  • KNN handles both classification and regression, and nearest-neighbour search is also used in recommendation systems.
  • Requires no training before making predictions, so new data can be added seamlessly without retraining a model.
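
As a rough illustration of the last two points, the sketch below uses KNN for regression by averaging the targets of the K nearest points, and then adds a new observation with a plain list append. The function name `knn_regress` and the data are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math

def knn_regress(train_X, train_y, query, k):
    """Predict a numeric target as the mean target of the k nearest points."""
    by_distance = sorted(zip(train_X, train_y),
                         key=lambda pair: math.dist(pair[0], query))
    nearest_targets = [target for _, target in by_distance[:k]]
    return sum(nearest_targets) / k

# Toy data (made up): house size in 100s of square feet -> price in thousands.
sizes = [(10.0,), (12.0,), (15.0,), (20.0,), (25.0,)]
prices = [100.0, 120.0, 150.0, 200.0, 250.0]

before = knn_regress(sizes, prices, (14.0,), k=2)

# "Training" is just storing the data, so adding an observation is an append.
sizes.append((14.0,))
prices.append(145.0)

after = knn_regress(sizes, prices, (14.0,), k=2)
```

The new point is used by the very next prediction with no fitting step in between. Note that it can also change that prediction, since it may become one of the nearest neighbours.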

Disadvantages of KNN Algorithm

  • KNN does not scale well to large datasets: in its basic form, every prediction computes the distance to all stored points.
  • Feature standardization or normalization is important before fitting a KNN model, because it relies on distance measures.
  • KNN is sensitive to missing values and outliers in the dataset.
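
The effect of feature scaling is easy to see on a small example. The sketch below is only an illustration under assumed data: the helper names (`standardize`, `nearest_index`) are made up, and z-score standardization is one common choice among several scaling methods.

```python
import math
import statistics

def standardize(rows):
    """Rescale each column to zero mean and unit standard deviation (z-scores)."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    # Assumes no column is constant; a zero deviation would divide by zero.
    stdevs = [statistics.pstdev(col) for col in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stdevs))
            for row in rows]

def nearest_index(rows, query_index):
    """Index of the row closest (Euclidean) to rows[query_index], excluding itself."""
    others = [i for i in range(len(rows)) if i != query_index]
    return min(others, key=lambda i: math.dist(rows[i], rows[query_index]))

# Toy data (made up): (age in years, income in dollars).
people = [(25, 50_000), (60, 50_500), (27, 52_000), (45, 80_000)]

# Without scaling, income dominates the distance because of its larger range.
raw_neighbor = nearest_index(people, 0)

# After scaling, age and income contribute on a comparable footing.
scaled_neighbor = nearest_index(standardize(people), 0)
```

On the raw data the 25-year-old's nearest neighbour is the 60-year-old with almost the same income; after standardization it is the 27-year-old with a similar income, which is closer to what we would intuitively call "similar".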

Applications

  • Recommendation systems
  • Credit card rating prediction
  • Handwriting detection (like OCR), image and video recognition
  • Loan default prediction

A scikit-learn implementation of the KNN algorithm can be found here.

An R implementation of the KNN algorithm can be found here.

With the above info, I hope you now have a better understanding of the KNN algorithm and can handle interview questions about it.

For the next ML algorithm cheat sheet, please refer to this link.

Do check out my other blogs related to ML/DL here.

If you like this post, please follow me. If you notice any mistakes in the reasoning, formulas, animations, or code, please let me know.

Cheers!

Shashwat Tiwari
Analytics Vidhya

Senior Applied Data Scientist at EY || Machine Learning and Deep Learning Ardent ||