Curve fitting
Curve fitting is the process of determining (fitting) a function (curve) that best approximates the relationship between dependent and independent variables.
- Underfitting is when models are too simple to capture the relationship in the data.
- Overfitting is when models are too complex, which may lead to incorrect predictions for values outside of the training data.
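The two failure modes above can be illustrated with a minimal sketch. The data set, the linear ground truth `truth`, the fixed alternating noise, and the three fitting helpers are all assumptions chosen for this illustration: a constant model stands in for an overly simple fit, and a polynomial that passes through every training point stands in for an overly complex one.

```python
def truth(x):
    """Assumed ground-truth relationship for this demo."""
    return 2 * x + 1


def fit_constant(xs, ys):
    """Too simple: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolating_polynomial(xs, ys):
    """Too complex: Lagrange polynomial through every training point."""
    def g(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return g


# Training data: the linear truth plus a fixed, alternating "noise" term.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
train_y = [truth(x) + e for x, e in zip(train_x, noise)]

x_new = 7.0  # a value outside the range of the training data

for name, fit in [("constant", fit_constant),
                  ("line", fit_line),
                  ("interpolating polynomial", fit_interpolating_polynomial)]:
    g = fit(train_x, train_y)
    worst_train_error = max(abs(g(x) - y) for x, y in zip(train_x, train_y))
    new_error = abs(g(x_new) - truth(x_new))
    print(f"{name}: worst training error {worst_train_error:.3f}, "
          f"error at x={x_new} is {new_error:.3f}")
```

In this setup the interpolating polynomial reproduces the training data exactly but misses badly at `x_new`, the constant model is poor everywhere, and the straight line stays close to the truth in both places.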
Underfitting
To diagnose underfitting, visualize the fit of the model on the test data and consider the bias-variance tradeoff.
If a model has high bias and low variance, the model underfits the data.
Error of the model
$$\text{Error of the model} = \text{Bias}^2 + \text{Variance} + \text{Irreducible Error}$$
A model with low bias and high variance, by contrast, overfits the data.
Bias Variance Tradeoff

Bias is the error between the average model prediction and the ground truth:
$$\text{Bias}^2 = E\left[\left(E[g(x)] - f(x)\right)^2\right]$$
Variance is the error between the average model prediction and an individual model's prediction, that is, how much the prediction changes from one training set to another:
$$\text{Variance} = E\left[\left(E[g(x)] - g(x)\right)^2\right]$$
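Bias and variance typically have an inverse relationship: making a model more complex tends to lower its bias and raise its variance, and making it simpler tends to do the opposite.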
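The following sketch estimates both quantities empirically by refitting a model on many randomly drawn training sets and looking at its predictions at a single point. Everything specific here is an assumption made for illustration: the ground truth `f` (a sine curve), the noise level, the evaluation point `x0`, and the two model families (a constant model and a 1-nearest-neighbour model, used as stand-ins for a rigid and a flexible fit). The expectation over training sets is approximated by an average over `trials` simulated sets.

```python
import math
import random


def f(x):
    """Assumed ground-truth function f(x)."""
    return math.sin(2 * math.pi * x)


def sample_training_set(rng, n=20, noise_sd=0.3):
    """Draw n noisy observations of f at uniformly random x in [0, 1)."""
    xs = [rng.random() for _ in range(n)]
    ys = [f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def fit_constant(xs, ys):
    """Rigid model: predict the mean training target everywhere."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_nearest_neighbor(xs, ys):
    """Flexible model: predict the target of the closest training point."""
    def g(x):
        i = min(range(len(xs)), key=lambda j: abs(xs[j] - x))
        return ys[i]
    return g


def bias_variance(fit, x0, trials=500, seed=0):
    """Estimate Bias^2 and Variance of g(x0) over many training sets."""
    rng = random.Random(seed)
    preds = []
    for _ in range(trials):
        xs, ys = sample_training_set(rng)
        preds.append(fit(xs, ys)(x0))
    avg = sum(preds) / trials                       # approximates E[g(x0)]
    bias_sq = (avg - f(x0)) ** 2                    # (E[g(x0)] - f(x0))^2
    variance = sum((avg - p) ** 2 for p in preds) / trials
    # Mean squared error against the noise-free truth f(x0).
    error = sum((p - f(x0)) ** 2 for p in preds) / trials
    return bias_sq, variance, error


x0 = 0.25
for name, fit in [("constant", fit_constant),
                  ("nearest neighbour", fit_nearest_neighbor)]:
    bias_sq, variance, error = bias_variance(fit, x0)
    print(f"{name}: bias^2={bias_sq:.4f} variance={variance:.4f} "
          f"bias^2+variance={bias_sq + variance:.4f} error={error:.4f}")
```

Because `error` is measured against the noise-free `f(x0)`, it contains no irreducible error and should match `bias_sq + variance` up to rounding. Under these assumptions the constant model shows the larger bias and the nearest-neighbour model the larger variance.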
Regression Models
Mean squared error (MSE) measures fit as the mean of the squared differences between the true values $y_i$ and the predictions $\hat{y}_i$:
$$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
It can be used to measure the fit of the model on the training and test data.
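As a minimal sketch, MSE can be computed directly from its definition. The function name `mean_squared_error`, the hypothetical fitted model `model`, and the small training and test sets below are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between truth and prediction."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


def model(x):
    """Hypothetical already-fitted model, assumed for this example."""
    return 2 * x + 1


# Illustrative data: (x, y) pairs split into training and test sets.
train = [(0.0, 1.5), (1.0, 2.5), (2.0, 5.5), (3.0, 6.5)]
test = [(4.0, 9.0), (5.0, 12.0)]

train_mse = mean_squared_error([y for _, y in train],
                               [model(x) for x, _ in train])
test_mse = mean_squared_error([y for _, y in test],
                              [model(x) for x, _ in test])

print(train_mse)  # 0.25
print(test_mse)   # 0.5
```

Comparing the two numbers is the usual check: a test MSE far above the training MSE suggests overfitting, while high values on both suggest underfitting.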
