Linear Regression

  • A supervised learning algorithm used for regression problems
  • Models an output variable as a linear combination of input features; finds the line (or hyperplane, in higher dimensions) that best fits the data
  • Formula: y_hat = W^T·X
    • y_hat: predicted value of the dependent/response variable (target)
    • W^T: transposed vector of weights/coefficients (the intercept/bias term is usually absorbed into W by appending a constant-1 feature to X)
    • X: independent/predictor variable(s), features
  • Polynomial Regression: add polynomial features (e.g., x^2, x^3) so a model that is still linear in its weights can fit nonlinear relationships (see the sketch after the assumptions list)
  • Assumptions
    1. Linear Relationship - a linear relationship between each predictor variable and the response variable
    2. No Multicollinearity - none of the predictor variables are highly correlated with each other
    3. Independence - each observation in the dataset is independent
    4. Homoscedasticity - residuals have constant variance at every level of the predictors
    5. Multivariate Normality - residuals of the model are normally distributed
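
The prediction formula above as a minimal runnable sketch, assuming NumPy; the data and weight values are invented for illustration, and the final x^2 column shows the polynomial-features idea:

```python
import numpy as np

# Toy data: 5 samples, 1 feature (values invented for illustration)
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])  # roughly y = 2x

# Absorb the intercept into W by prepending a constant-1 column
X_b = np.hstack([np.ones((X.shape[0], 1)), X])  # shape (5, 2)

W = np.array([0.1, 2.0])  # [bias, slope], illustrative values
y_hat = X_b @ W           # y_hat = W^T·X for every sample at once

# Polynomial regression: still linear in W, just with extra columns
X_poly = np.hstack([X_b, X**2])  # adds an x^2 feature
```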
| Pros | Cons |
| --- | --- |
| Highly interpretable | Sensitive to outliers |
| Fast to train | Can overfit small, high-dimensional data; underfits nonlinear patterns |
  • Approaches
    1. Ordinary Least Squares via the Normal Equation (closed-form, non-iterative): W = (X^T·X)^(-1)·X^T·y; see the first sketch after this list
    2. Gradient Descent (iterative approach): repeatedly update W in the direction of the negative gradient of the cost; see the second sketch after this list
      • Feature Scaling - standardize features so all weights can share one learning rate and the descent converges faster
      • Squared Error Cost Function - J(W) = (1/2m)·Σ(y_hat_i - y_i)^2 over the m training samples
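
A minimal sketch of approach 1, the Normal Equation, assuming NumPy; the toy data is invented, and np.linalg.pinv stands in for the explicit inverse for numerical stability:

```python
import numpy as np

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])  # invented toy data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
X_b = np.hstack([np.ones((len(X), 1)), X])  # add intercept column

# Normal Equation: W = (X^T·X)^(-1)·X^T·y, solved in one shot
W = np.linalg.pinv(X_b.T @ X_b) @ X_b.T @ y
print(W)  # [intercept, slope]
```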
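
And a minimal sketch of approach 2, batch gradient descent with feature scaling and the squared-error cost; same invented toy data, and the learning rate and iteration count are illustrative choices, not tuned values:

```python
import numpy as np

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])  # invented toy data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Feature scaling: standardize so one learning rate suits every weight
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
X_b = np.hstack([np.ones((len(X_std), 1)), X_std])  # intercept column

W = np.zeros(X_b.shape[1])
lr, m = 0.1, len(y)  # learning rate and sample count (illustrative)

for _ in range(1000):
    y_hat = X_b @ W
    # Gradient of the cost J(W) = (1/2m)·Σ(y_hat_i - y_i)^2
    grad = (X_b.T @ (y_hat - y)) / m
    W -= lr * grad

print(W)  # weights in the scaled feature space
```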
  • Model Performance Evaluation
    • R^2 (coefficient of determination) - proportion of variance in the target explained by the model
    • MAE (mean absolute error), RMSE (root mean squared error), MAPE (mean absolute percentage error) - average-error metrics in the target's units (MAPE in percent); see the sketch below
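
A minimal sketch of these metrics computed by hand, assuming NumPy; the arrays are invented, and in practice sklearn.metrics provides equivalents such as r2_score and mean_absolute_error:

```python
import numpy as np

y_true = np.array([2.1, 3.9, 6.2, 8.1, 9.8])  # illustrative targets
y_pred = np.array([2.0, 4.1, 6.0, 8.3, 9.5])  # illustrative predictions

resid = y_true - y_pred
mae  = np.mean(np.abs(resid))                  # mean absolute error
rmse = np.sqrt(np.mean(resid**2))              # root mean squared error
mape = np.mean(np.abs(resid / y_true)) * 100   # assumes no zero targets

# R^2 = 1 - SS_res / SS_tot
ss_res = np.sum(resid**2)
ss_tot = np.sum((y_true - y_true.mean())**2)
r2 = 1 - ss_res / ss_tot
```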