Diffusion model
= Sources =
# https://erdem.pl/2023/11/step-by-step-visual-introduction-to-diffusion-models
# https://www.youtube.com/watch?v=fbLgFrlTnGU
[[Category:Computer Science]]
[[Category:Machine Learning]]
Latest revision as of 23:08, 5 July 2024
In a nutshell, diffusion models work by making data progressively more random/noisy, and then learning the inverse of that process — making noisy things less random — thus generating new data.
Diffusion models are frequently used for computer vision tasks such as text-to-image. This page will focus exclusively on that use case.
Principle
More specifically, diffusion models consist of two processes: forward diffusion, which turns an image into noise, and backward diffusion, which turns noise back into an image.
Forward diffusion
Mathematically, each step of the forward diffusion process is defined as <math>q(x_t|x_{t-1})=\mathcal{N}(x_t;\sqrt{1-\beta_t}\,x_{t-1},\beta_t I)</math>, where ''x<sub>t</sub>'' is the process output at step ''t'', ''q'' is the distribution of the current step given the previous step, ''N'' is a multivariate normal distribution, <math>\sqrt{1-\beta_t}\,x_{t-1}</math> is its mean, and <math>\beta_t I</math> is its variance.
Explained more textually, the process defines a probability distribution for the current step (''x<sub>t</sub>'') given the previous step. The distribution is normal, with randomness injected according to a schedule. The image starts as a structured, non-random sample, and through this process we destroy that structure into noise. In real scenarios the noise is not added iteratively — a closed-form shortcut exists — but this is the principle.
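As a concrete illustration of one forward step, the sketch below samples ''x<sub>t</sub>'' from the normal distribution above using NumPy. The array shape and the β value are made-up choices for the example, not taken from this page:

```python
import numpy as np

def forward_step(x_prev, beta_t, rng):
    """One forward diffusion step: sample x_t ~ N(sqrt(1 - beta_t) * x_prev, beta_t * I)."""
    mean = np.sqrt(1.0 - beta_t) * x_prev
    noise = rng.standard_normal(x_prev.shape)
    # Reparameterization: a normal sample is mean + std * standard_normal
    return mean + np.sqrt(beta_t) * noise

rng = np.random.default_rng(0)
x0 = np.ones((4, 4))                       # toy "image" of all ones
x1 = forward_step(x0, beta_t=0.02, rng=rng)
```

With a small β the output stays close to the input; only after many accumulated steps does the sample drift toward pure noise.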
Schedule
''β<sub>t</sub>'' is a hyperparameter called the ''schedule''. It increases with the time step ''t'' and controls the variance of the distribution during the forward process. Each ''β<sub>t</sub>'' lies between 0 and 1, so repeated scaling by <math>\sqrt{1-\beta_t}</math> brings the mean of the Gaussian closer to 0 while the accumulated noise brings the variance closer to ''I'', which is the goal of the forward process.
We want the schedule values to stay small so that learning how to undo each step isn't too difficult.
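To make the schedule's effect concrete, here is a small sketch. The linear schedule and its endpoint values are illustrative choices (a common convention, not prescribed by this page); it shows how repeated scaling by <math>\sqrt{1-\beta_t}</math> drives the signal toward zero:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)          # increasing schedule, each value in (0, 1)

# The mean of x_t given x_0 is scaled by prod(sqrt(1 - beta_s)) = sqrt(alpha_bar_t):
alpha_bar = np.cumprod(1.0 - betas)
signal_scale = np.sqrt(alpha_bar[-1])       # near 0 after T steps: mean collapses to 0
noise_scale = np.sqrt(1.0 - alpha_bar[-1])  # near 1 after T steps: variance approaches I
```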
Noising process
Thus, the distribution of the entire noising process up to time step ''T'' is defined to be <math>q(x_{1:T}|x_0)=\prod^T_{t=1}q(x_t|x_{t-1})</math>, which follows from the chain rule of probability together with the Markov property of the process (each step depends only on the previous one).
With the derivation in source 1, this iterative process can be simplified to a single closed-form equation, significantly speeding up the algorithm.
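That single equation — derived in source 1 and standard in the diffusion-model literature — is <math>q(x_t|x_0)=\mathcal{N}\!\left(x_t;\sqrt{\bar\alpha_t}\,x_0,(1-\bar\alpha_t)I\right)</math> with <math>\bar\alpha_t=\prod_{s=1}^t(1-\beta_s)</math>. A minimal sketch checking that the iterative and closed-form routes agree statistically (array size and schedule values are made up for the example):

```python
import numpy as np

def q_sample_iterative(x0, betas, rng):
    """Apply each q(x_t | x_{t-1}) step in turn."""
    x = x0
    for beta in betas:
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * rng.standard_normal(x.shape)
    return x

def q_sample_closed_form(x0, betas, rng):
    """Sample x_T in one shot: q(x_T | x_0) = N(sqrt(abar) * x_0, (1 - abar) * I)."""
    alpha_bar = np.prod(1.0 - betas)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * rng.standard_normal(x0.shape)

rng = np.random.default_rng(0)
x0 = np.ones(100_000)                  # many scalar "pixels" for stable statistics
betas = np.linspace(1e-4, 0.02, 100)
a = q_sample_iterative(x0, betas, rng)
b = q_sample_closed_form(x0, betas, rng)
# a and b are draws from the same distribution, so their moments should match.
```

This is what "a single equation instead of an iterative process" buys: training can jump straight to any noise level ''t'' without simulating all the intermediate steps.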