<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://ricefriedegg.com:80/mediawiki/index.php?action=history&amp;feed=atom&amp;title=Back_propagation</id>
	<title>Back propagation - Revision history</title>
	<link rel="self" type="application/atom+xml" href="http://ricefriedegg.com:80/mediawiki/index.php?action=history&amp;feed=atom&amp;title=Back_propagation"/>
	<link rel="alternate" type="text/html" href="http://ricefriedegg.com:80/mediawiki/index.php?title=Back_propagation&amp;action=history"/>
	<updated>2026-04-10T00:33:28Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>http://ricefriedegg.com:80/mediawiki/index.php?title=Back_propagation&amp;diff=645&amp;oldid=prev</id>
	<title>Rice: Created page with &quot;&#039;&#039;&#039;Back propagation&#039;&#039;&#039; is an error calculation technique. It consists of passing the gradient of a loss function &#039;&#039;backwards&#039;&#039; through a neural network layer by layer to update its weights.  = Procedure = Consider the following residual sum of squares (RSS) loss function (halved so its derivative is cleaner).  &lt;math&gt; E = \frac{1}{2}\sum_i (y_i - \mathbf{w}^\top \mathbf{x}_i)^2 &lt;/math&gt;  After each feed-forward pass, in which one data point is passed through the neural network, the gradient of the loss function is computed.  We compute the gra...&quot;</title>
		<link rel="alternate" type="text/html" href="http://ricefriedegg.com:80/mediawiki/index.php?title=Back_propagation&amp;diff=645&amp;oldid=prev"/>
		<updated>2024-05-01T03:18:36Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;&amp;#039;&amp;#039;&amp;#039;Back propagation&amp;#039;&amp;#039;&amp;#039; is an error calculation technique. It consists of passing the gradient of a loss function &amp;#039;&amp;#039;backwards&amp;#039;&amp;#039; through a &lt;a href=&quot;/mediawiki/index.php/Neural_network&quot; title=&quot;Neural network&quot;&gt;neural network&lt;/a&gt; layer by layer to update its weights.  = Procedure = Consider the following residual sum of squares (RSS) loss function (halved so its derivative is cleaner).  &amp;lt;math&amp;gt; E = \frac{1}{2}\sum_i (y_i - \mathbf{w}^\top \mathbf{x}_i)^2 &amp;lt;/math&amp;gt;  After each feed-forward pass, in which one data point is passed through the neural network, the gradient of the loss function is computed.  We compute the gra...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;Back propagation&amp;#039;&amp;#039;&amp;#039; is an error calculation technique. It consists of passing the gradient of a loss function &amp;#039;&amp;#039;backwards&amp;#039;&amp;#039; through a [[neural network]] layer by layer to update its weights.&lt;br /&gt;
&lt;br /&gt;
= Procedure =&lt;br /&gt;
Consider the following residual sum of squares (RSS) loss function (halved so its derivative is cleaner).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
E = \frac{1}{2}\sum_i (y_i - \mathbf{w}^\top \mathbf{x}_i)^2&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
After each feed-forward pass, in which one data point is passed through the neural network, the gradient of the loss function is computed.&lt;br /&gt;
&lt;br /&gt;
We compute the gradient separately at each layer, because layers may differ in their [[activation function]], by computing the loss for each neuron &amp;lt;math&amp;gt;j&amp;lt;/math&amp;gt; of that layer and summing the results.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
E = \sum_j \left[ \frac{1}{2}\sum_i (y_{ij} - \mathbf{w}_j^\top \mathbf{x}_i)^2 \right]&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
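As a minimal NumPy sketch for a single linear layer (the values, shapes, and names here are illustrative), the summed layer loss for one data point can be computed as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang="python"&amp;gt;&lt;br /&gt;
import numpy as np&lt;br /&gt;
&lt;br /&gt;
# One data point x, per-neuron targets y, and a weight matrix W whose&lt;br /&gt;
# rows are the weight vectors w_j of the neurons in this layer.&lt;br /&gt;
x = np.array([0.5, 1.0, -0.3])&lt;br /&gt;
y = np.array([1.0, 0.0])&lt;br /&gt;
W = np.array([[0.1, 0.2, 0.3],&lt;br /&gt;
              [0.4, -0.1, 0.2]])&lt;br /&gt;
&lt;br /&gt;
# Feed-forward pass of the linear layer, then the halved squared&lt;br /&gt;
# error per neuron, summed over the layer.&lt;br /&gt;
y_hat = W @ x&lt;br /&gt;
E = np.sum(0.5 * (y - y_hat) ** 2)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;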
Then we compute the gradient of the loss function at that layer with respect to the weights feeding into that layer; by the chain rule, this reuses the gradients already computed at the layers above, which is what makes the backward pass efficient.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\frac{\partial E}{\partial \mathbf{w}_j}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
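Continuing the sketch above (the learning rate value is again illustrative), the gradient with respect to the weights of each neuron, followed by a plain gradient-descent update, would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang="python"&amp;gt;&lt;br /&gt;
# dE/dw_j = -(y_j - y_hat_j) * x for each neuron j, which stacks&lt;br /&gt;
# into the outer product of the per-neuron error with the input.&lt;br /&gt;
grad = -np.outer(y - y_hat, x)&lt;br /&gt;
&lt;br /&gt;
# Gradient-descent update of the layer weights.&lt;br /&gt;
lr = 0.01&lt;br /&gt;
W = W - lr * grad&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;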
&lt;br /&gt;
[[Category:Machine Learning]]&lt;/div&gt;</summary>
		<author><name>Rice</name></author>
	</entry>
</feed>