K-Nearest Neighbors (KNN)
- A non-parametric supervised learning algorithm used for both classification and regression
- Steps
- Choose a K value - typically an odd number to avoid ties in binary classification
- Take the query (given) point that needs a prediction
- Calculate the distance (commonly Euclidean) between the query point and every point in the training set
- Sort the distances in increasing order and keep the K nearest neighbors
- Type - aggregate the K nearest neighbors by task (see the sketch after this list)
- Classification - Majority Vote
- Regression - Average
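A minimal from-scratch sketch of these steps in Python, assuming the training data fits in a NumPy array; `knn_predict`, the toy dataset, and the default `k=5` are illustrative choices, not part of the notes above.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=5, task="classification"):
    """Predict the label/value for one query point with plain KNN.

    X_train: (n_samples, n_features) array, y_train: (n_samples,) array.
    The value of k and the Euclidean metric are assumptions for this sketch.
    """
    # Step 1: Euclidean distance from the query point to every training point
    distances = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))

    # Step 2: sort by distance and keep the indices of the k nearest neighbors
    nearest_idx = np.argsort(distances)[:k]
    nearest_labels = y_train[nearest_idx]

    # Step 3: aggregate by task type
    if task == "classification":
        # majority vote among the k neighbors
        return Counter(nearest_labels).most_common(1)[0][0]
    # regression: average of the k neighbors' target values
    return nearest_labels.mean()

# Toy usage: two 2-D classes; an odd k avoids ties in the vote
X = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.9]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([1.1, 0.9]), k=3))  # expected: 0
```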
- Best K value
- Use cross-validation (and a learning curve) to compare candidate K values and pick the one with the best validation performance - see the sketch below
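One way to run that search with scikit-learn, assuming a classification task; the iris dataset, 5-fold CV, and the odd-K range 1-29 are placeholder choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # placeholder dataset for illustration

# Score each candidate K with 5-fold cross-validation and keep the best one
scores_by_k = {}
for k in range(1, 31, 2):  # odd values of K; the range is an assumption
    model = KNeighborsClassifier(n_neighbors=k)
    scores_by_k[k] = cross_val_score(model, X, y, cv=5).mean()

best_k = max(scores_by_k, key=scores_by_k.get)
print(best_k, scores_by_k[best_k])
```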
- K value and overfitting/underfitting (see the comparison sketch below)
- Small K - Low Bias, High Variance (Overfitting)
- Large K - High Bias, Low Variance (Underfitting)
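A quick illustration of the trade-off, assuming the same scikit-learn setup as above: K=1 scores perfectly on the training split (it memorizes it) but worse on the test split, while a very large K smooths predictions toward the overall majority class and drags both scores down.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # placeholder dataset for illustration
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# K=1 -> high variance (overfits); K near the training-set size -> high bias (underfits)
for k in (1, len(X_train) - 1):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    print(k, model.score(X_train, y_train), model.score(X_test, y_test))
```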
| Pros | Cons |
| --- | --- |
| Easy to understand and simple to implement | Suffers from the curse of dimensionality |
| Suitable for small datasets | Requires high memory storage |
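The curse-of-dimensionality drawback can be made concrete with a small simulation (uniform random points are an assumption here): as the number of features grows, the nearest and farthest distances from a query point converge, so the "nearest" neighbors carry less information.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ratio of nearest to farthest distance approaches 1 as dimensionality grows,
# which is why KNN degrades in high-dimensional feature spaces
for dim in (2, 10, 100, 1000):
    points = rng.random((1000, dim))
    query = rng.random(dim)
    dists = np.linalg.norm(points - query, axis=1)
    print(dim, dists.min() / dists.max())
```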