Stochastic Gradient Descent
How it works
First, an initial weight vector is selected. This is the starting point from which the solution is iteratively improved.
For each datapoint in the dataset, the gradient of the loss function with respect to the weights is computed on that single datapoint. Together with a chosen learning rate, the gradient determines the direction and step size of each update.
Then, the GD update rule (see below) is applied to move the weights toward values that minimize the loss, based on the gradient and the learning rate.
After every datapoint has been processed, one epoch is complete, and the MSE is measured to track convergence.
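As a concrete illustration of the steps above, here is a minimal sketch of SGD for a linear model trained with squared error. The function name, the synthetic data, and the learning rate of 0.01 are assumptions made for this example, not details from the original description.

import numpy as np

def sgd(X, y, lr=0.01, epochs=10, seed=0):
    """Fit linear weights w to minimize mean squared error via SGD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(size=d)                # step 1: select a starting weight vector
    for epoch in range(epochs):
        for i in rng.permutation(n):      # visit each datapoint once per epoch
            # gradient of (x_i . w - y_i)^2 with respect to w
            grad = 2 * (X[i] @ w - y[i]) * X[i]
            w -= lr * grad                # GD update rule: step against the gradient
        mse = np.mean((X @ w - y) ** 2)   # measure MSE after each epoch
        print(f"epoch {epoch + 1}: MSE = {mse:.4f}")
    return w

# Usage on synthetic data where the true weights are (3, -2):
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = X @ np.array([3.0, -2.0]) + 0.1 * rng.normal(size=100)
w = sgd(X, y)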
GD Update Rule
The GD update rule is used to update the weights at each iteration.
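The original text does not write the rule out; in standard notation, the update for a sampled datapoint $(x_i, y_i)$ takes the form

$$w \leftarrow w - \eta \, \nabla_w L(w; x_i, y_i)$$

where $\eta$ is the learning rate and $\nabla_w L$ is the gradient of the loss with respect to the weights, computed on that single datapoint. Subtracting the gradient moves the weights in the direction of steepest decrease of the loss, with $\eta$ controlling the step size.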