K-Nearest Neighbors (KNN)

  • A non-parametric supervised learning algorithm used for both classification and regression
  • Steps (a minimal from-scratch sketch follows this list)
    1. Choose a K value - typically an odd number for classification, to avoid tie votes
    2. Take the query (given) point whose label or value is to be predicted
    3. Calculate the distance (typically Euclidean) between the query point and every point in the training set
    4. Sort the distances in increasing order and keep the K nearest neighbors
    5. Predict
      1. Classification - majority vote of the K neighbors' labels
      2. Regression - average of the K neighbors' target values
    6. Choose the best K value
      1. Use cross-validation and plot the score against K (a validation curve); see the K-selection sketch after this list
    7. Effect of K on overfitting/underfitting
      1. Small K - Low Bias, High Variance (Overfitting)
      2. High K - High Bias, Low Variance (Underfitting)
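A minimal from-scratch sketch of steps 2-5 in Python, assuming NumPy arrays for the training data; the function name `knn_predict` and the toy data are illustrative and not part of the original notes.

```python
import numpy as np

def knn_predict(X_train, y_train, query, k=5, task="classification"):
    # Step 3: Euclidean distance from the query point to every training point
    distances = np.sqrt(((X_train - query) ** 2).sum(axis=1))
    # Step 4: sort in increasing order and keep the indices of the K nearest points
    nearest = np.argsort(distances)[:k]
    neighbor_targets = y_train[nearest]
    # Step 5: majority vote for classification, average for regression
    if task == "classification":
        labels, counts = np.unique(neighbor_targets, return_counts=True)
        return labels[np.argmax(counts)]
    return neighbor_targets.mean()

# Toy usage: the query point sits near the class-0 cluster
X_train = np.array([[1.0, 2.0], [2.0, 3.0], [8.0, 9.0], [9.0, 8.0]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.5, 2.5]), k=3))  # -> 0
```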
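For step 6, one possible way to pick K with cross-validation, assuming scikit-learn is available; the Iris dataset, the odd-K search range, and the 5-fold split are illustrative choices only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

k_values = list(range(1, 22, 2))  # odd K values only, to avoid tie votes
scores = [cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
          for k in k_values]

best_k = k_values[int(np.argmax(scores))]
print("best K:", best_k, "CV accuracy:", round(max(scores), 3))
# Plotting scores against k_values gives the validation curve used to pick K.
```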
| Pros | Cons |
| --- | --- |
| Easy to understand and simple to implement | Suffers from the curse of dimensionality |
| Suitable for small datasets | Requires high memory storage |