Batch Gradient Descent: Difference between revisions

Revision as of 04:01, 25 April 2024

In batch gradient descent, the unit of data is the entire dataset, in contrast to stochastic gradient descent, whose unit of data is a single data point. Each update averages the gradients computed over all data points in the batch and uses that average to update the weights.

* Faster per pass over the data (one weight update per pass)
* Often less precise in the resulting model, though not always
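
As a rough illustration of the update rule described above, here is a minimal sketch in plain Python. The model (a line y = w*x + b), the mean squared error loss, the toy data, and the names `batch_gradient_descent`, `lr`, and `epochs` are all assumptions made for this example and do not come from the page itself.

```python
def batch_gradient_descent(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w*x + b by minimizing mean squared error with full-batch updates."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Accumulate the gradient over every data point in the dataset.
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = (w * x + b) - y
            grad_w += 2 * error * x
            grad_b += 2 * error
        # One weight update per pass, using the averaged gradient.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = batch_gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))
```

On this noiseless toy data the fitted values should end up close to w = 2 and b = 1. Note that the weights change only once per pass over the data, however many points the dataset holds.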

A variation, mini-batch gradient descent, uses smaller batches rather than the entire dataset, which mitigates the loss of precision.
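
The mini-batch variation can be sketched in the same spirit. This is an illustrative example only: the linear model, the mean squared error loss, the toy data, the per-epoch shuffling, and the names `minibatch_gradient_descent`, `batch_size`, `lr`, `epochs`, and `seed` are assumptions chosen for the sketch, not details taken from the page.

```python
import random


def minibatch_gradient_descent(xs, ys, batch_size=2, lr=0.02, epochs=2000, seed=0):
    """Fit y = w*x + b with one weight update per mini-batch."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        # Shuffle so each pass splits the data into different batches.
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Average the gradient over this mini-batch only.
            grad_w = sum(2 * ((w * xs[i] + b) - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * ((w * xs[i] + b) - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Hypothetical toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = minibatch_gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))
```

With five points and a batch size of 2, each pass makes three updates (the last batch holds a single point) instead of one, and the fitted values should again land close to w = 2 and b = 1 on this toy data.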

@@ Line 3: / Line 3: @@
 * Faster
 * Less performing/precise (not always)
-A variation is to use smaller batches (not the entire dataset). It mitigates the lack in precision.
+A variation, '''mini batch GD,''' uses smaller batches (not the entire dataset). It mitigates the lack in precision.
