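Maximum likelihood estimation (MLE) is one method for finding the coefficients of a linear regression model. It chooses the coefficients that maximize the likelihood of observing the training data given the model. Under the Gaussian noise assumption used below, these are the same coefficients that minimize the residual sum of squares (RSS).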

Background

Consider the linear model

<math>y = w_0 x_0 + w_1 x_1 + \ldots + w_m x_m + \epsilon = g(x) + \epsilon</math>

where <math>y = g(x)</math> is the true relationship and <math>\epsilon</math> is the residual error (noise).

We assume that <math>x_0 = 1</math> (so that <math>w_0</math> acts as the intercept), that the y values are independent of each other, and that <math>\epsilon \sim N(0, \sigma^2)</math>.

Likelihood function

The likelihood function gives the probability density of observing the data given the parameters of the model. A higher likelihood means the parameters explain the observed data better.

<math>L(w_0, \ldots, w_m, \sigma^2 \mid x, y) = \prod_{i} \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left( - \frac{(y_i - g(x_i))^2}{2 \sigma^2} \right)</math>

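Because the y values are assumed to be independent, the likelihood of observing the data is the product of the densities of the individual data points. Each density is given by the probability density function of the normal distribution with mean <math>g(x_i)</math> and variance <math>\sigma^2</math>.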
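The product above can be evaluated directly for a small data set. The following is a minimal sketch rather than a reference implementation: the function names `g` and `likelihood`, the toy data, and the two candidate weight vectors are all assumptions made up for illustration.

```python
import math

def g(w, x):
    """Linear prediction w_0*x_0 + ... + w_m*x_m for one feature vector x (x[0] == 1)."""
    return sum(wj * xj for wj, xj in zip(w, x))

def likelihood(w, sigma2, xs, ys):
    """Product over data points of the normal density of y_i around g(x_i)."""
    total = 1.0
    for x, y in zip(xs, ys):
        resid = y - g(w, x)
        density = math.exp(-resid ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
        total *= density
    return total

# Toy data (invented for this example), roughly following y = 1 + 2*x_1.
xs = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0), (1.0, 3.0)]
ys = [1.1, 2.9, 5.2, 6.8]

# Two hypothetical candidate weight vectors, compared at the same variance.
good = likelihood((1.0, 2.0), 0.25, xs, ys)
bad = likelihood((0.0, 1.0), 0.25, xs, ys)
print(good > bad)  # weights closer to the data should score a higher likelihood
```

For larger data sets the raw product tends to underflow to zero in floating point, which is one practical reason to work with logarithms instead.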

The weights are then adjusted to increase the likelihood, and the process repeats until the likelihood stops improving.
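The text does not say how the weights are updated. One possible reading is gradient ascent on the logarithm of the likelihood, sketched below. The update rule, the fixed variance `sigma2`, the `learning_rate`, the number of `steps`, and the toy data are all assumptions chosen for illustration, not details taken from the article.

```python
def g(w, x):
    """Linear prediction for one feature vector x (x[0] == 1)."""
    return sum(wj * xj for wj, xj in zip(w, x))

def fit_by_gradient_ascent(xs, ys, sigma2=1.0, learning_rate=0.05, steps=2000):
    """Repeatedly move the weights uphill on the log-likelihood (sigma2 held fixed)."""
    w = [0.0] * len(xs[0])
    for _ in range(steps):
        # d(log L)/d(w_j) = sum_i (y_i - g(x_i)) * x_ij / sigma2
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            resid = y - g(w, x)
            for j, xj in enumerate(x):
                grad[j] += resid * xj / sigma2
        w = [wj + learning_rate * gj for wj, gj in zip(w, grad)]
    return w

# Toy data (invented for this example), roughly following y = 1 + 2*x_1.
xs = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0), (1.0, 3.0)]
ys = [1.1, 2.9, 5.2, 6.8]

w_hat = fit_by_gradient_ascent(xs, ys)
print(w_hat)  # should approach the least-squares intercept and slope
```

The learning rate here is tuned to this tiny data set; a step that is too large makes the iteration diverge, so other data would need a different value or a line search.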

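The computation can be simplified by taking the logarithm of the likelihood, which turns the product into a sum:

<math>\ln L(w_0, \ldots, w_m, \sigma^2 \mid x, y) = -\frac{n}{2} \ln(2 \pi \sigma^2) - \frac{1}{2 \sigma^2} \sum_{i=1}^{n} (y_i - g(x_i))^2</math>

where n is the number of data points. The sum is the RSS, so for any fixed <math>\sigma^2</math>, maximizing the likelihood over the weights is the same as minimizing the RSS.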
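A common simplification is to take the logarithm of the likelihood, so that the product of normal densities becomes a constant term minus the RSS divided by twice the variance. The sketch below checks numerically that the two forms agree. The function names, toy data, and parameter values are hypothetical and chosen only for illustration.

```python
import math

def g(w, x):
    """Linear prediction for one feature vector x (x[0] == 1)."""
    return sum(wj * xj for wj, xj in zip(w, x))

def log_likelihood_from_densities(w, sigma2, xs, ys):
    """Sum of the log normal densities, one term per data point."""
    total = 0.0
    for x, y in zip(xs, ys):
        resid = y - g(w, x)
        total += -0.5 * math.log(2 * math.pi * sigma2) - resid ** 2 / (2 * sigma2)
    return total

def log_likelihood_from_rss(w, sigma2, xs, ys):
    """Same quantity written as -(n/2) * ln(2*pi*sigma2) - RSS / (2*sigma2)."""
    n = len(ys)
    rss = sum((y - g(w, x)) ** 2 for x, y in zip(xs, ys))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)

# Toy data (invented for this example).
xs = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0), (1.0, 3.0)]
ys = [1.1, 2.9, 5.2, 6.8]

direct = log_likelihood_from_densities((1.0, 2.0), 0.25, xs, ys)
via_rss = log_likelihood_from_rss((1.0, 2.0), 0.25, xs, ys)
print(math.isclose(direct, via_rss, rel_tol=1e-9))
```

Because only the RSS term depends on the weights, comparing two weight vectors at the same variance reduces to comparing their RSS values.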
