Continuous Random Variable

From Rice Wiki
[[Category:Distribution (Statistics)]]
Continuous random variables can take an infinite number of values in any given interval. While many of the same quantities are defined, the approach to analysis differs from that of discrete variables:
* Summation becomes integration
* Probability becomes area under a curve
 
= Probability Density Function =
 
The probability density function (pdf) maps a continuous variable to a
probability density.
 
As the name "density" suggests, the area under the pdf curve over a range is the probability that the variable falls in that range.
 
<math>
P(c \leq X \leq d) = \int_c^d f(x) dx = F(d) - F(c)
</math>

where <math>F</math> is the cumulative distribution function (cdf).
 
 
The total area under the curve must be <math> 1 </math>, since the probability that the variable takes ''some'' value is 100% when the range covers all possible values.
 
<math>
\int_{-\infty}^\infty f(x) dx = 1
</math>
 
There is no area under a single point, so the probability of any single exact value is zero:
 
<math>
P(X = a) = 0
</math>
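As a quick sketch (not from the original article), the following Python snippet checks these properties numerically for a made-up pdf <math>f(x) = 2x</math> on <math>[0, 1]</math>; the function names are arbitrary.

<syntaxhighlight lang="python">
from scipy.integrate import quad

# Hypothetical pdf for illustration: f(x) = 2x on [0, 1], 0 elsewhere
def f(x):
    return 2 * x if 0 <= x <= 1 else 0.0

total_area, _ = quad(f, 0, 1)   # integral over the full support; should be 1
prob, _ = quad(f, 0.5, 1)       # P(0.5 <= X <= 1) = F(1) - F(0.5) = 1 - 0.25

print(total_area)  # ~1.0
print(prob)        # ~0.75
</syntaxhighlight>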
 
= Mean and Variance =
 
The mean and variance calculations are essentially the same as those for [[Discrete Random Variable|discrete random variables]], except the summations are replaced by integrals.
 
<math>
E(X) = \mu_X = \int_{-\infty}^\infty x f(x) dx
</math>
 
<math>
Var(X) = \sigma^2_X = \int_{-\infty}^\infty (x - \mu_X)^2 f(x) dx
= \int_{-\infty}^\infty x^2 f(x) dx - \mu_X^2
</math>
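Continuing the made-up example pdf <math>f(x) = 2x</math> on <math>[0, 1]</math> from above, here is a short Python sketch of these integrals:

<syntaxhighlight lang="python">
from scipy.integrate import quad

# Hypothetical pdf: f(x) = 2x on [0, 1], 0 elsewhere
def f(x):
    return 2 * x if 0 <= x <= 1 else 0.0

mean, _ = quad(lambda x: x * f(x), 0, 1)        # E(X) = 2/3
ex2, _ = quad(lambda x: x ** 2 * f(x), 0, 1)    # E(X^2) = 1/2
var = ex2 - mean ** 2                           # Var(X) = 1/2 - (2/3)^2 = 1/18

print(mean, var)  # ~0.667, ~0.056
</syntaxhighlight>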
 
= Median and Percentile =
 
The a-th percentile is the point below which a percent of the area under the curve lies. That is, we want the point <math>x</math> with <math> P(X \leq x) = a\% </math>, computed as described above.
 
By the same logic, the quartiles are the points at 25%, 50%, and 75%, respectively.
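For instance, with the made-up pdf <math>f(x) = 2x</math> on <math>[0, 1]</math>, the cdf is <math>F(x) = x^2</math>, so the median <math>m</math> solves

<math>
F(m) = m^2 = 0.5 \implies m = \sqrt{0.5} \approx 0.707
</math>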
 
= Uniform Distribution <math> X \sim Uniform(a, b) </math> =
A uniform random variable is described by two parameters: <math> a </math>, the minimum, and <math> b </math>, the maximum. It has a rectangular distribution, where every point in <math>[a, b]</math> has the same probability density.
 
==== PDF ====
 
<math>
f(x) = \begin{cases}
    \frac{ 1 }{ b - a } & a \leq x \leq b \\
    0 & \text{otherwise}
\end{cases}
</math>
 
==== CDF ====
 
<math>
F(x) = \begin{cases}
    0 & x < a \\
    \frac{ x - a }{ b - a } & a \leq x \leq b \\
    1 & x > b
\end{cases}
</math>
 
==== Mean ====
 
<math>
\mu_X = \frac{ a + b }{ 2 }
</math>
 
==== Variance ====
 
<math>
\sigma^2 = \frac{ 1 }{ 12 } (b - a)^2
</math>
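A minimal Python sketch using SciPy (note that <code>scipy.stats.uniform</code> is parameterized by <code>loc = a</code> and <code>scale = b - a</code>); the values of <math>a</math> and <math>b</math> here are made up:

<syntaxhighlight lang="python">
from scipy.stats import uniform

a, b = 2, 10                      # hypothetical minimum and maximum
X = uniform(loc=a, scale=b - a)   # Uniform(2, 10)

print(X.cdf(6))    # P(X <= 6) = (6 - a) / (b - a) = 0.5
print(X.mean())    # (a + b) / 2 = 6.0
print(X.var())     # (b - a)^2 / 12 ≈ 5.33
</syntaxhighlight>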
 
= Exponential Distribution =
 
The exponential distribution models events that occur
* Continuously
* Independently
* At a constant average rate
 
It takes one parameter, <math>\lambda</math>, the '''rate parameter''': the ''average rate of events per unit time/space.'' It is the reciprocal of the mean given below.
 
The exponential distribution has the '''memoryless property''': the probability of waiting a further amount of time for an event does not depend on how long we have already waited.

In probability terms, the probability that we must wait an additional <math>t</math> units, given that we have already waited <math>s</math> units, equals the unconditional probability of waiting <math>t</math> units:
 
<math>
P(T > t + s | T > s) = P(T > t) = e^{-\lambda t}
</math>
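A small Python check of the memoryless property using SciPy's exponential distribution (SciPy's <code>expon</code> uses <code>scale = 1/lambda</code>); the numbers are arbitrary:

<syntaxhighlight lang="python">
from scipy.stats import expon

lam = 0.5                    # hypothetical rate parameter
T = expon(scale=1 / lam)     # SciPy's exponential uses scale = 1 / lambda

t, s = 3.0, 4.0
lhs = T.sf(t + s) / T.sf(s)  # P(T > t + s | T > s); sf(x) = P(T > x)
rhs = T.sf(t)                # P(T > t)

print(lhs, rhs)              # both ≈ exp(-0.5 * 3) ≈ 0.223
</syntaxhighlight>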
 
Notably, it models the time until some event happens, in contrast to the [[Discrete Random Variable#Poisson|Poisson distribution]], which counts the number of events in a unit of time.
 
==== PDF ====
 
<math>
f(x) = \begin{cases}
    \lambda e ^{ - \lambda x } & x \geq 0 \\
    0 & \text{otherwise}
\end{cases}
</math>
 
==== CDF ====
 
<math>
F(x) = 1 - e^{- \lambda x}, \quad x \geq 0
</math>
 
==== Mean ====
 
The mean follows from integration by parts; a sketch of the calculation is given after the result.
 
<math>
\mu_X = \frac{1}{\lambda}
</math>
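A sketch of that integration by parts, with <math>u = x</math> and <math>dv = \lambda e^{-\lambda x} dx</math>:

<math>
\mu_X = \int_0^\infty x \lambda e^{-\lambda x} dx
= \left[ -x e^{-\lambda x} \right]_0^\infty + \int_0^\infty e^{-\lambda x} dx
= 0 + \frac{1}{\lambda}
</math>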
 
==== Variance ====
 
The variance also follows from integration by parts.
 
<math>
\sigma^2 = \frac{ 1 }{ \lambda^2 }
</math>
 
== Exponential and Poisson ==
 
The exponential and Poisson random variables are related (see the short derivation after this list):
* <math>X \sim Poisson(\lambda)</math>: the number of events in a unit time
* <math>X \sim Exp(\lambda)</math>: waiting time until an event
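The connection can be seen from the Poisson pmf: if events occur at rate <math>\lambda</math>, the number of events in <math>(0, t]</math> is <math>Poisson(\lambda t)</math>, so the waiting time <math>T</math> until the first event satisfies

<math>
P(T > t) = P(\text{no events in } (0, t]) = \frac{(\lambda t)^0 e^{-\lambda t}}{0!} = e^{-\lambda t}
</math>

which is exactly the exponential survival function <math>1 - F(t)</math>.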
 
= Normal Random Variable =
[[File:Z score table.png|thumb|Z score table]]
'''Normal random variables''' (aka Gaussian RVs) are the most widely used continuous RVs in
statistics, characterizing many natural phenomena. The pdf is the famous
bell curve.
 
They are characterized by two parameters: mean and variance.
 
<math>
Y \sim N(\mu_Y, \sigma^2_Y)
</math>
 
Normal random variables are perfectly symmetric about the mean.
 
==== Standardizing Normal Distribution ====
 
Standardizing a random variable means transforming it so that its mean is 0 and its
standard deviation is 1. We do this by subtracting the mean and dividing by the
standard deviation:
 
<math>
Z = \frac{Y - \mu}{\sigma}
</math>
 
Intuitively, this shifts and rescales the distribution. We do this so that probabilities can be computed from the standard normal distribution.
 
==== Z score ====
 
The z-score is the number of standard deviations a value lies above or below the
mean. A positive z-score is above the mean, and a negative one is below.
 
<math>
z = \frac{y - \mu}{\sigma}
</math>
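A short Python sketch of standardizing and computing a probability with SciPy; the mean and standard deviation here are made up:

<syntaxhighlight lang="python">
from scipy.stats import norm

mu, sigma = 100, 15     # hypothetical mean and standard deviation
y = 120

z = (y - mu) / sigma    # z-score: standard deviations above the mean
print(z)                # ≈ 1.33

# Both give the same probability P(Y <= 120)
print(norm.cdf(z))                        # via the z-score and the standard normal
print(norm.cdf(y, loc=mu, scale=sigma))   # directly on the original scale
</syntaxhighlight>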
 
==== PDF ====
 
The pdf of a normal random variable is the following.
 
<math>
f(y) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{1}{2} \frac{(y -
\mu)^2}{\sigma^2}}
</math>
 
Substituting the z-score from the last section, this can be rewritten as
 
<math>
f(y) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{1}{2} z^2}
</math>
 
The pdf of the standard normal random variable <math>Z</math> itself is
 
<math>
f(z) = \frac{1}{\sqrt{2 \pi}} e^{-\frac{1}{2} z^2}
</math>
 
==== Quantiles ====
 
Quantiles are points that divide the range of a probability distribution into intervals of equal probability. Quartiles and percentiles are types of quantiles.
 
For normal distributions, there are special points (critical values) that correspond to particular probabilities: <math>z_a</math>, the point with probability <math>a</math> in the right tail, i.e. <math>P(Z > z_a) = a</math>.
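For example, <math>z_{0.05} \approx 1.645</math>. A minimal SciPy sketch (the inverse cdf <code>ppf</code> gives a lower-tail quantile, so the right-tail critical value is the <math>1 - a</math> quantile):

<syntaxhighlight lang="python">
from scipy.stats import norm

a = 0.05
z_a = norm.ppf(1 - a)   # critical value with area a in the right tail
print(z_a)              # ≈ 1.645
print(norm.sf(z_a))     # right-tail area P(Z > z_a) ≈ 0.05
</syntaxhighlight>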
 
==== Standard Normal Table ====
 
The standard normal table gives lower-tail probabilities for the standard normal distribution (i.e. the area under the curve to the left of a point).
 
==== Linear Combinations of Independent Normal RVs ====

If <math>X</math> and <math>Y</math> are independent normal random variables and <math>a, b</math> are constants, then the linear combination

<math>
W = aX + bY
</math>

is also normally distributed:
 
<math>
W \sim N(a\mu_X + b\mu_Y,\; a^2 \sigma^2_X + b^2 \sigma^2_Y)
</math>
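As a made-up numeric example: if <math>X \sim N(1, 4)</math> and <math>Y \sim N(2, 9)</math> are independent and <math>W = 3X + 2Y</math>, then

<math>
W \sim N(3 \cdot 1 + 2 \cdot 2,\; 3^2 \cdot 4 + 2^2 \cdot 9) = N(7, 72)
</math>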
 
= Other distributions =
 
[[Two Numerical RVs]]
 
[[Category:Statistics]]
