Neural Networks

A supervised learning algorithm that can be used for regression and classification problems
Non-linear model
Components
- Neurons
- Input Layer
- Hidden Layer(s)
- Output Layer
Optimizer Function
- Adam (Adaptive Moment Estimation)
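A minimal sketch of a single Adam update for one scalar parameter, assuming the commonly used defaults (beta1 = 0.9, beta2 = 0.999, eps = 1e-8). The name `adam_step` and the toy objective are illustrative only, not taken from any library.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients (first moment)
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients (second moment)
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero initialization
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective (assumed for illustration): f(w) = (w - 3)^2, gradient 2 * (w - 3)
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * (w - 3), m, v, t, lr=0.05)
```

On the first step the update has magnitude close to `lr` regardless of the gradient's scale, which is one reason Adam is less sensitive to gradient scaling than plain gradient descent.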
Non-linear Activation Functions
- ReLU, Sigmoid, Tanh
- Softmax - often used in the output layer for multiclass classification
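The activation functions listed above can be sketched in plain Python as follows. This is a scalar illustration under the usual textbook definitions; real frameworks apply them element-wise to tensors.

```python
import math

def relu(x):
    """ReLU: passes positive values through and clamps negatives to 0."""
    return max(0.0, x)

def sigmoid(x):
    """Sigmoid: squashes x into (0, 1); branch on sign to avoid overflow in exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def tanh(x):
    """Tanh: squashes x into (-1, 1)."""
    return math.tanh(x)

def softmax(logits):
    """Softmax: turns a list of scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]
```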
Regularization: Dropout (randomly zeroes a fraction of activations during training to reduce overfitting)
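A minimal sketch of dropout, assuming the "inverted" variant in which surviving activations are scaled by 1 / (1 - rate) during training so that nothing needs to change at inference time. The function name and parameters are hypothetical.

```python
import random

def dropout(activations, rate=0.5, training=True, rng=random):
    """Inverted dropout over a list of activations.

    During training, each activation is zeroed with probability `rate`
    and the survivors are scaled by 1 / (1 - rate). At inference time
    the activations pass through unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

The scaling keeps the expected value of each activation the same in training and inference.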
Loss Function
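- MSE (Mean Squared Error) - typically used for regression
- Binary Cross Entropy (Log Loss) - typically used for binary classification
- Categorical Cross Entropy - typically used for multiclass classification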
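The three losses above can be sketched as follows. This is an illustration on plain Python lists, assuming predictions are probabilities for the cross-entropy losses and that labels are one-hot encoded for the categorical case; the `eps` clipping is an added safeguard against log(0).

```python
import math

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Log loss; y_true holds 0/1 labels, y_pred holds P(label == 1)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """y_true: one-hot rows; y_pred: rows of class probabilities."""
    total = 0.0
    for t_row, p_row in zip(y_true, y_pred):
        total += -sum(t * math.log(max(p, eps)) for t, p in zip(t_row, p_row))
    return total / len(y_true)
```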
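Forward Propagation: passes the input through the layers to compute the prediction (used for inference and as the first step of each training iteration)
Backward Propagation (Chain Rule): computes the derivatives of the cost function with respect to every parameter (weights and biases) by applying the chain rule from the output layer back to the input layer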
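A minimal sketch of both passes for a tiny network, assuming one tanh hidden layer, a linear output, and the loss L = 0.5 * (y_hat - y)^2 on a single example. The weights and inputs are made-up toy values.

```python
import math

def forward(x, W1, b1, W2, b2):
    """Forward propagation: input -> tanh hidden layer -> linear output."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y_hat = sum(w * hi for w, hi in zip(W2, h)) + b2
    return h, y_hat

def backward(x, y, h, y_hat, W2):
    """Backward propagation for L = 0.5 * (y_hat - y)^2 via the chain rule."""
    d_y = y_hat - y                               # dL/dy_hat
    dW2 = [d_y * hi for hi in h]                  # dL/dW2
    db2 = d_y                                     # dL/db2
    # Through the hidden layer: tanh'(z) = 1 - tanh(z)^2 = 1 - h^2
    d_z = [d_y * w * (1 - hi * hi) for w, hi in zip(W2, h)]
    dW1 = [[dz * xi for xi in x] for dz in d_z]   # dL/dW1
    db1 = list(d_z)                               # dL/db1
    return dW1, db1, dW2, db2

# Hypothetical toy values: 2 inputs, 2 hidden units, 1 output
x, y = [0.5, -1.0], 0.25
W1, b1 = [[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1]
W2, b2 = [0.7, -0.5], 0.05

h, y_hat = forward(x, W1, b1, W2, b2)
dW1, db1, dW2, db2 = backward(x, y, h, y_hat, W2)
```

A common sanity check is to compare each analytic gradient with a finite-difference estimate of the loss.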

Pros	Cons
Can be used for both regression and classification problems	Black box (hard to interpret)
Can solve problems that are not linearly separable	Requires a significant amount of data
	Computationally expensive to train

Training Approaches
- Gradient Descent
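  - Batch Gradient Descent: one parameter update per pass, using the gradient over the entire training set
  - Stochastic Gradient Descent: one parameter update per training example
  - Mini-Batch Gradient Descent: one parameter update per small batch of examples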
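The three variants differ only in how many examples feed each update. The sketch below uses a single hypothetical `batch_size` parameter to switch between them, assuming a simple linear model y = w * x + b trained with MSE on made-up noise-free data rather than a full neural network.

```python
import random

def gradient_descent(xs, ys, batch_size, lr=0.05, epochs=200, seed=0):
    """Fit y = w * x + b by minimizing MSE.

    batch_size == len(xs) -> batch gradient descent
    batch_size == 1       -> stochastic gradient descent
    anything in between   -> mini-batch gradient descent
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a new order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the batch MSE with respect to w and b
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Hypothetical noise-free data generated from y = 2x + 1
xs = [i / 10 for i in range(-10, 11)]
ys = [2 * x + 1 for x in xs]

w_batch, b_batch = gradient_descent(xs, ys, batch_size=len(xs))
w_sgd, b_sgd = gradient_descent(xs, ys, batch_size=1)
w_mini, b_mini = gradient_descent(xs, ys, batch_size=4)
```

Batch updates are the smoothest but the most expensive per step, stochastic updates are cheap but noisy, and mini-batches sit between the two.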