Batch Gradient Descent

From Rice Wiki
Revision as of 18:23, 15 April 2024 by Rice

In batch gradient descent, the unit of data is the entire dataset, in contrast to Stochastic Gradient Descent, whose unit of data is one data point. It computes the gradient for every data point in the batch and uses the average of those gradients to update the model's weights, so each pass over the dataset produces exactly one update.

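  • Faster per pass over the data than Stochastic Gradient Descent, since it makes one weight update per pass instead of one per data point
  • Can give a less accurate model (not always), since far fewer weight updates are made for the same number of passes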
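The update described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the model (a one-variable linear fit `y = w*x + b`), the squared-error loss, the function name `batch_gradient_descent`, and the values of `lr` and `epochs` are all assumptions chosen for the example.

```python
def batch_gradient_descent(xs, ys, lr=0.1, epochs=1000):
    """Fit y = w*x + b with full-batch gradient descent (illustrative sketch)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        # Accumulate the squared-error gradient over every data point.
        for x, y in zip(xs, ys):
            error = (w * x + b) - y
            grad_w += 2 * error * x
            grad_b += 2 * error
        # One update per pass, using the average of the per-point gradients.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = batch_gradient_descent(xs, ys)
```

On this noiseless toy data the fitted `w` and `b` should approach 2 and 1. A stochastic variant would instead move the weight update inside the inner loop, so that each data point triggers its own update.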