Continuous Random Variable
Continuous random variables can take infinitely many values within any given interval. Although the concepts are similar, the analysis differs from that of discrete variables:
- Summation becomes integration
- Probability becomes area under a curve
Probability Density Function
The probability density function (pdf) maps a continuous variable to a probability density.
As the name "density" suggests, the area under the pdf curve over a range is the probability of the variable falling in that range: $P(a \le X \le b) = \int_a^b f(x)\,dx$.
The total area under the curve must be 1, since the probability that some event happens is 100% when the range includes all possible events.
There is no area under a single point, so the probability of any one exact value is 0: $P(X = x) = 0$.
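A minimal sketch of these three properties, using numerical integration. The density $f(x) = 2x$ on $[0, 1]$ and the helper `integrate` are hypothetical choices made for illustration, not part of the original notes.

```python
def integrate(f, lo, hi, steps=10_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

def example_pdf(x):
    # Hypothetical density for illustration: f(x) = 2x on [0, 1], 0 elsewhere.
    return 2 * x if 0 <= x <= 1 else 0.0

total_area = integrate(example_pdf, 0.0, 1.0)   # whole range: close to 1
prob_range = integrate(example_pdf, 0.5, 1.0)   # P(0.5 <= X <= 1): close to 0.75
prob_point = integrate(example_pdf, 0.5, 0.5)   # zero-width range: no area

print(total_area, prob_range, prob_point)
```

The last value is 0 because a range of zero width encloses no area, no matter how high the density is at that point.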
Mean and Variance
The mean and variance are calculated in the same way as for discrete random variables, except that the summations are replaced by integrals.
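Written out for a continuous random variable $X$ with pdf $f(x)$, the standard definitions are as follows. The symbol $\mu$ for the mean is a notational assumption here.

```latex
\begin{aligned}
E[X] &= \mu = \int_{-\infty}^{\infty} x \, f(x) \, dx \\
\operatorname{Var}(X) &= \int_{-\infty}^{\infty} (x - \mu)^2 \, f(x) \, dx = E[X^2] - \mu^2
\end{aligned}
```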
Median and Percentile
The a-th percentile is the point at which a percent of the area under the curve lies to its left. You want the cumulative probability $P(X \le x)$ to be a%, the calculation of which is in the page above.
By the same logic, the quartiles are at 25%, 50%, and 75% respectively, and the median is the 50th percentile.
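One way to find a percentile numerically is to search for the point where the cumulative probability reaches the target. The sketch below uses bisection; the function `percentile` and the example CDF $F(x) = x^2$ on $[0, 1]$ are hypothetical, chosen only for illustration.

```python
def percentile(cdf, a, lo, hi, tol=1e-10):
    """Find x in [lo, hi] with cdf(x) = a / 100 by bisection.

    Assumes cdf is non-decreasing and that the answer lies in [lo, hi].
    """
    target = a / 100
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < target:
            lo = mid   # not enough area to the left yet: move right
        else:
            hi = mid   # enough area to the left: move left
    return (lo + hi) / 2

def example_cdf(x):
    # Hypothetical CDF for illustration: F(x) = x^2 on [0, 1].
    return min(max(x, 0.0), 1.0) ** 2

q1, median, q3 = (percentile(example_cdf, a, 0.0, 1.0) for a in (25, 50, 75))
print(q1, median, q3)
```

For this CDF the exact answers are the square roots of 0.25, 0.5, and 0.75, which gives a quick check on the search.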
Uniform Distribution
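A uniform random variable is described by two parameters: $a$ is the minimum, and $b$ is the maximum. It has a rectangular distribution, where every point in $[a, b]$ has the same probability density, $f(x) = \frac{1}{b - a}$.
CDF
$$F(x) = \frac{x - a}{b - a}, \quad a \le x \le b$$
Mean
$$E[X] = \frac{a + b}{2}$$
Variance
$$\operatorname{Var}(X) = \frac{(b - a)^2}{12}$$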
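As a small sketch, the standard closed forms for a uniform variable on $[a, b]$ can be coded directly. The function names and the values of `a` and `b` below are hypothetical.

```python
def uniform_cdf(x, a, b):
    """P(X <= x) for a uniform variable on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

def uniform_mean(a, b):
    return (a + b) / 2

def uniform_variance(a, b):
    return (b - a) ** 2 / 12

a, b = 2.0, 10.0   # hypothetical minimum and maximum
# The probability of a range is the difference of two CDF values.
prob_4_to_7 = uniform_cdf(7, a, b) - uniform_cdf(4, a, b)

print(prob_4_to_7, uniform_mean(a, b), uniform_variance(a, b))
```

Because the density is flat, the probability of a range depends only on its width: here 3 out of a total width of 8.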
Exponential Distribution
The exponential distribution models events that occur
- Continuously
- Independently
- At a constant average rate
It takes one parameter: $\lambda$, the rate parameter, which is the reciprocal of the mean ($\lambda = 1 / E[X]$). Its pdf is $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$.
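The exponential distribution has the memoryless property: the probability of an event occurring within the next stretch of time does not change no matter how much time has already passed.
In probability terms, the probability that we must wait an additional $t$ units given that we have already waited $s$ units is the same as the probability of waiting $t$ units from the start: $P(X > s + t \mid X > s) = P(X > t)$.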
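The memoryless property can be checked numerically from the tail probability $P(X > x) = e^{-\lambda x}$ of an exponential variable. The rate and waiting times below are hypothetical values chosen for illustration.

```python
import math

def exp_survival(x, rate):
    """P(X > x) for an exponential variable with the given rate, x >= 0."""
    return math.exp(-rate * x)

rate = 0.5          # hypothetical rate parameter
s, t = 3.0, 2.0     # already waited s units; asking about t more

# Conditional probability: P(X > s + t | X > s) = P(X > s + t) / P(X > s)
conditional = exp_survival(s + t, rate) / exp_survival(s, rate)
# Probability of waiting t units starting from scratch
unconditional = exp_survival(t, rate)

print(conditional, unconditional)
```

The two values agree up to floating-point rounding, since $e^{-\lambda (s + t)} / e^{-\lambda s} = e^{-\lambda t}$.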
Notably, it models the time until some event happens, in contrast to the Poisson distribution, which models the number of events in a unit of time.
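CDF
$$F(x) = P(X \le x) = 1 - e^{-\lambda x}, \quad x \ge 0$$
where $\lambda$ is the rate parameter.
Mean
$$E[X] = \int_0^\infty x \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda}$$
Integration by parts with $u = x$ and $dv = \lambda e^{-\lambda x}\,dx$:
$$E[X] = \left[-x e^{-\lambda x}\right]_0^\infty + \int_0^\infty e^{-\lambda x}\,dx = 0 + \frac{1}{\lambda}$$
Variance
$$\operatorname{Var}(X) = E[X^2] - (E[X])^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$$
Integration by parts with $u = x^2$ and $dv = \lambda e^{-\lambda x}\,dx$:
$$E[X^2] = \left[-x^2 e^{-\lambda x}\right]_0^\infty + \int_0^\infty 2x e^{-\lambda x}\,dx = \frac{2}{\lambda} E[X] = \frac{2}{\lambda^2}$$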
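As a sanity check on the integrals, the mean and variance of an exponential variable with rate $\lambda$ (known to be $1/\lambda$ and $1/\lambda^2$) can be approximated numerically. This is a sketch only: the rate, the cutoff `upper`, and the helper `integrate` are assumptions made for illustration.

```python
import math

def integrate(f, lo, hi, steps=200_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

rate = 2.0             # hypothetical rate parameter
upper = 40.0 / rate    # the density beyond this point is negligible

def pdf(x):
    return rate * math.exp(-rate * x)

mean = integrate(lambda x: x * pdf(x), 0.0, upper)
second_moment = integrate(lambda x: x * x * pdf(x), 0.0, upper)
variance = second_moment - mean ** 2

print(mean, 1 / rate)            # both close to 0.5
print(variance, 1 / rate ** 2)   # both close to 0.25
```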
Exponential and Poisson
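The exponential distribution and Poisson RVs are related. For events occurring at rate $\lambda$:
- $N \sim \text{Poisson}(\lambda)$: the number of events in a unit time
- $X \sim \text{Exp}(\lambda)$: waiting time until an event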
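The relationship can be illustrated by simulation: if the waiting times between events are exponential with a given rate, then the number of events per unit of time should behave like a Poisson variable, whose mean and variance both equal that rate. The function name, rate, and seed below are hypothetical.

```python
import random

def count_events_per_unit(rate, n_units, seed=0):
    """Generate exponential waiting times and count events in each unit interval."""
    rng = random.Random(seed)
    counts = [0] * n_units
    t = rng.expovariate(rate)          # time of the first event
    while t < n_units:
        counts[int(t)] += 1            # event falls in interval [int(t), int(t) + 1)
        t += rng.expovariate(rate)     # wait for the next event
    return counts

rate = 3.0   # hypothetical average number of events per unit time
counts = count_events_per_unit(rate, n_units=20_000)
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / len(counts)

print(mean_count, var_count)   # both should be near the rate
```

The match is approximate because the counts come from random samples; a longer simulation tightens it.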
Normal Random Variable
Normal random variables are the most widely used continuous RVs in statistics, characterizing many natural phenomena. Their pdf is the famous bell curve.
They are characterized by two parameters: the mean $\mu$ and the variance $\sigma^2$.
Normal random variables are perfectly symmetric about the mean.
Standardizing Normal Distribution
Standardizing data means transforming it so that its mean is 0 and its standard deviation is 1. We do this by subtracting the mean and dividing by the standard deviation: $Z = \frac{X - \mu}{\sigma}$.
Intuitively, this moves the dataset and changes the scale. We do this to simplify probability calculations.
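A short sketch of standardizing a dataset with the Python standard library. The data values are hypothetical, and the population standard deviation (`pstdev`) is an assumption about which standard deviation is meant.

```python
import statistics

data = [62.0, 70.0, 74.0, 81.0, 88.0]   # hypothetical measurements

mu = statistics.fmean(data)       # mean of the data
sigma = statistics.pstdev(data)   # population standard deviation

# Subtract the mean (shift), then divide by the standard deviation (rescale).
standardized = [(x - mu) / sigma for x in data]

print(statistics.fmean(standardized))    # close to 0
print(statistics.pstdev(standardized))   # close to 1
```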
Z score
The z-score is the number of standard deviations a value lies above or below the mean. A positive z-score is above the mean, and a negative one is below.
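The pdf of a normal random variable with mean $\mu$ and standard deviation $\sigma$ is the following.
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$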
After standardizing the normal RV, we can use the following instead.
$$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}$$
where $z$ is the z-score covered in the last section.
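The two forms of the normal pdf can be compared in code. This is a sketch with hypothetical parameter values; `statistics.NormalDist` from the standard library serves as an independent reference. One detail to keep in mind: the density at $x$ equals the standard normal density at $z$ divided by $\sigma$, because standardizing rescales the horizontal axis.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of a normal variable with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def standard_normal_pdf(z):
    """Density of the standard normal (mean 0, standard deviation 1)."""
    return math.exp(-(z ** 2) / 2) / math.sqrt(2 * math.pi)

mu, sigma, x = 100.0, 15.0, 130.0   # hypothetical values
z = (x - mu) / sigma                # z-score of x

density_direct = normal_pdf(x, mu, sigma)
density_via_z = standard_normal_pdf(z) / sigma
density_reference = NormalDist(mu, sigma).pdf(x)

print(z, density_direct, density_via_z, density_reference)
```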
Quantiles
Quantiles are cut points that divide the range of a probability distribution into intervals of equal probability. Quartiles and percentiles are types of quantiles.
For normal distributions, there are special points (critical values) that correspond to particular probabilities: $z_\alpha$, where $\alpha$ is the probability in the right tail, i.e. $P(Z > z_\alpha) = \alpha$.
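Critical values of the standard normal can be computed with the inverse CDF. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library; the function name `critical_value` and the chosen tail probabilities are hypothetical.

```python
from statistics import NormalDist

def critical_value(alpha):
    """Point with right-tail probability alpha under the standard normal.

    A right tail of alpha means the area to the left is 1 - alpha.
    """
    return NormalDist().inv_cdf(1 - alpha)

z_05 = critical_value(0.05)     # about 1.645
z_025 = critical_value(0.025)   # about 1.960

print(round(z_05, 3), round(z_025, 3))
```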
Standard Normal Table
The standard normal table lists lower-tail probabilities for the standard normal distribution (i.e. the area under the curve to the left of a given point).
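The lower-tail values in such a table can be reproduced with the error function, using the identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z / \sqrt{2})\right)$. A minimal sketch follows; the function name and the z values are chosen for illustration.

```python
import math

def standard_normal_cdf(z):
    """Lower-tail probability P(Z <= z) for the standard normal."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# A few table-style entries: z value and the area to its left.
for z in (0.0, 1.0, 1.96):
    print(f"{z:4.2f}  {standard_normal_cdf(z):.4f}")

lower_tail = standard_normal_cdf(1.96)   # about 0.9750
upper_tail = 1 - lower_tail              # about 0.0250
```

Upper-tail and between-two-points probabilities follow by subtraction, since the total area is 1.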
Linear Combinations of Independent Normal RV