MSE [Regression]
Description
Mean Squared Error (MSE) is the most commonly used loss function for regression. The loss is the mean, over the seen data, of the squared differences between true and predicted values.
- Use case: Regression problems
- When to use: Typically used when solving regression problems where the goal is to predict continuous numerical values. Squaring ensures that positive and negative differences contribute equally to the loss.
- Key Property: Sensitive to outliers, because squaring penalizes large errors disproportionately.
Example applications:
- Predicting house prices
- Estimating sales revenue
- Forecasting temperature
 
 
Formula
\[
J(\vec{w}, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right)^2
\]
where \(m\) is the number of training examples, \(f_{\vec{w},b}(\vec{x}^{(i)}) = \vec{w} \cdot \vec{x}^{(i)} + b\) is the prediction for example \(i\), and \(y^{(i)}\) is the true value. The extra \(\frac{1}{2}\) is a common convention that cancels with the \(2\) produced by differentiation, as shown below; plain MSE omits it.
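To make the formula concrete, here is a minimal NumPy sketch of the cost computation, assuming \(X\) is an \(m \times n\) feature matrix; the function name `compute_cost` and the toy data are illustrative, not from the original notes.

```python
import numpy as np

def compute_cost(X, y, w, b):
    """Cost J(w, b): squared errors averaged with the 1/(2m) convention.
    X is (m, n), w is (n,), y is (m,)."""
    m = X.shape[0]
    errors = X @ w + b - y          # f_{w,b}(x^(i)) - y^(i) for every example
    return np.sum(errors ** 2) / (2 * m)

# Toy check: with w and b chosen so predictions match y exactly, J is 0.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([5.0, 11.0, 17.0])
print(compute_cost(X, y, w=np.array([1.0, 2.0]), b=0.0))  # 0.0
```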
Gradient Descent
Repeat until convergence, updating \(\vec{w}\) and \(b\) simultaneously, where \(\alpha\) is the learning rate:
\[
\vec{w} := \vec{w} - \alpha \frac{\partial J(\vec{w}, b)}{\partial \vec{w}}, \qquad b := b - \alpha \frac{\partial J(\vec{w}, b)}{\partial b}
\]
Sample Derivative Calculation for Linear Regression
The goal is to find the partial derivatives of the cost function \(J(\vec{w}, b)\) with respect to \(\vec{w}\) and \(b\).
Substituting \(f_{\vec{w},b}(\vec{x}^{(i)}) = \vec{w} \cdot \vec{x}^{(i)} + b\):
\[
J(\vec{w}, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( \vec{w} \cdot \vec{x}^{(i)} + b - y^{(i)} \right)^2
\]
We use the chain rule. Let \(u = \vec{w} \cdot \vec{x}^{(i)} + b - y^{(i)}\). We are finding \(\frac{\partial}{\partial \vec{w}} [u^2]\).
- Apply Chain Rule: \(\frac{\partial (u^2)}{\partial \vec{w}} = 2u \cdot \frac{\partial u}{\partial \vec{w}}\)
- Part 1: \(2u = 2(\vec{w} \cdot \vec{x}^{(i)} + b - y^{(i)})\)
- Part 2: \(\frac{\partial u}{\partial \vec{w}} = \frac{\partial}{\partial \vec{w}} (\vec{w} \cdot \vec{x}^{(i)} + b - y^{(i)}) = \vec{x}^{(i)}\)
 
Combine:
\[
\frac{\partial J(\vec{w}, b)}{\partial \vec{w}} = \frac{1}{2m} \sum_{i=1}^{m} 2\left( \vec{w} \cdot \vec{x}^{(i)} + b - y^{(i)} \right) \vec{x}^{(i)}
\]
Simplify: The \(\frac{1}{2}\) and \(2\) cancel.
\[
\frac{\partial J(\vec{w}, b)}{\partial \vec{w}} = \frac{1}{m} \sum_{i=1}^{m} \left( \vec{w} \cdot \vec{x}^{(i)} + b - y^{(i)} \right) \vec{x}^{(i)}
\]
Or, substituting \(f_{\vec{w},b}\) back in:
\[
\frac{\partial J(\vec{w}, b)}{\partial \vec{w}} = \frac{1}{m} \sum_{i=1}^{m} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right) \vec{x}^{(i)}
\]
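As a sanity check, the scalar (single-feature) case of this result can be verified symbolically. This SymPy sketch is illustrative and assumes `sympy` is available; the variable names are arbitrary.

```python
import sympy as sp

# Scalar version of the chain-rule step: d/dw (w*x + b - y)^2.
w, b, x, y = sp.symbols('w b x y')
u = w * x + b - y                   # the inner function u

dw = sp.diff(u**2, w)               # SymPy applies the chain rule for us
print(sp.simplify(dw - 2 * u * x))  # 0 -> matches Part 1 times Part 2
```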
We use the same chain rule setup. Let \(u = \vec{w} \cdot \vec{x}^{(i)} + b - y^{(i)}\). We are finding \(\frac{\partial}{\partial b} [u^2]\).
- Apply Chain Rule: \(\frac{\partial (u^2)}{\partial b} = 2u \cdot \frac{\partial u}{\partial b}\)
- Part 1: \(2u = 2(\vec{w} \cdot \vec{x}^{(i)} + b - y^{(i)})\)
- Part 2: \(\frac{\partial u}{\partial b} = \frac{\partial}{\partial b} (\vec{w} \cdot \vec{x}^{(i)} + b - y^{(i)}) = 1\)
 
Combine:
\[
\frac{\partial J(\vec{w}, b)}{\partial b} = \frac{1}{2m} \sum_{i=1}^{m} 2\left( \vec{w} \cdot \vec{x}^{(i)} + b - y^{(i)} \right)
\]
Simplify: The \(\frac{1}{2}\) and \(2\) cancel.
\[
\frac{\partial J(\vec{w}, b)}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left( \vec{w} \cdot \vec{x}^{(i)} + b - y^{(i)} \right)
\]
Or, substituting \(f_{\vec{w},b}\) back in:
\[
\frac{\partial J(\vec{w}, b)}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right)
\]
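Putting the two derived partials to work, here is a minimal gradient descent sketch in NumPy, following the update rules from the Gradient Descent section above. The function names, learning rate, and toy data are illustrative, not from the original notes.

```python
import numpy as np

def compute_gradients(X, y, w, b):
    """Vectorized form of the derived partials:
    dJ/dw = (1/m) * sum_i (f(x^(i)) - y^(i)) * x^(i)
    dJ/db = (1/m) * sum_i (f(x^(i)) - y^(i))"""
    m = X.shape[0]
    errors = X @ w + b - y
    return X.T @ errors / m, np.sum(errors) / m

def gradient_descent(X, y, w, b, alpha, iters):
    """Simultaneously update w and b with learning rate alpha."""
    for _ in range(iters):
        dj_dw, dj_db = compute_gradients(X, y, w, b)
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
    return w, b

# Toy fit: data generated from y = 2x + 1, so w should approach [2], b -> 1.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])
w, b = gradient_descent(X, y, w=np.zeros(1), b=0.0, alpha=0.05, iters=5000)
print(w, b)  # roughly [2.0] 1.0
```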