Rice: Created page with " = How it works = First, a weight $\bf{w}$ is selected. This is the starting point from which we iteratively improve the solution. For ''each datapoint'' in the dataset, the ''gradient'' of the loss function with respect to weights is computed and a learning rate is selected. These two statistics determine the speed and direction the model $\bf{w}$ converges to. Then, a '''GD update rule''' is used to converge the weights to the desired outcome ba..."

2024-04-15T18:25:30Z

= How it works =
First, an initial weight vector <math>\mathbf{w}</math> is selected. This is the starting point from which the solution is iteratively improved.

For ''each data point'' in the dataset, the ''gradient'' of the loss function with respect to the weights is computed and a learning rate is selected. Together these determine how <math>\mathbf{w}</math> moves: the negative gradient gives the direction of the step, and the learning rate scales its size.

Then, a '''GD update rule''' uses the gradient and the learning rate to move the weights toward values that reduce the loss.

Once every data point has been processed and the weights updated, one '''epoch''' is complete. The mean squared error (MSE) is then measured to track progress.
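
The steps above can be sketched in Python. This is a minimal illustration rather than a reference implementation: it assumes a one-dimensional linear model <math>y \approx wx + b</math> with a squared-error loss, and the function name <code>sgd_linear_regression</code>, the parameters <code>lr</code>, <code>epochs</code> and <code>seed</code>, and the toy data are all hypothetical choices made for this example.

```python
import random


def sgd_linear_regression(xs, ys, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w*x + b with per-sample gradient steps on squared error.

    All names and default values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0                 # initial weights: the starting point
    history = []                    # MSE measured after each epoch
    indices = list(range(len(xs)))

    for _ in range(epochs):
        rng.shuffle(indices)        # visit the data points in random order
        for i in indices:
            err = (w * xs[i] + b) - ys[i]
            # Gradient of the per-sample loss err**2 with respect to w and b.
            grad_w = 2.0 * err * xs[i]
            grad_b = 2.0 * err
            # Update: step against the gradient, scaled by the learning rate.
            w -= lr * grad_w
            b -= lr * grad_b
        # One epoch is complete: measure the MSE over the whole dataset.
        mse = sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
        history.append(mse)

    return w, b, history


# Toy data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b, history = sgd_linear_regression(xs, ys)
```

On this noise-free toy data the fitted <code>w</code> and <code>b</code> should approach 2 and 1, and the values in <code>history</code> should shrink over the epochs. On real data a fixed learning rate generally leaves some residual fluctuation.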

== GD Update Rule ==
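The '''GD update rule''' specifies how the weights change at each iteration: they are moved a step against the gradient of the loss, with the step size set by the learning rate.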
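
In its usual form the rule can be written as below. The symbols are assumptions, since this page does not define them: <math>\eta</math> stands for the learning rate and <math>L_i</math> for the loss on the <math>i</math>-th data point.

:<math>\mathbf{w} \leftarrow \mathbf{w} - \eta \, \nabla_{\mathbf{w}} L_i(\mathbf{w})</math>

The minus sign means the weights move opposite to the gradient, which is the direction in which the loss decreases most steeply.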
[[Category:Machine Learning]]

Stochastic Gradient Descent - Revision history