Bivariate: Difference between revisions
No edit summary |
No edit summary |
||
Line 10: | Line 10: | ||
Cor(X, Y) = \rho = \frac{Cov(X,Y)}{sd(X) sd(Y)} | Cor(X, Y) = \rho = \frac{Cov(X,Y)}{sd(X) sd(Y)} | ||
</math> | </math> | ||
Correlation is always between -1 and 1 | |||
= Bivariate Normal = | = Bivariate Normal = | ||
Line 42: | Line 44: | ||
<math>E(X), sd(X), E(Y), sd(Y), \rho</math> | <math>E(X), sd(X), E(Y), sd(Y), \rho</math> | ||
If <math>X, Y</math> follows | If <math>X, Y</math> follows bivariate normal distribution, then we | ||
have | have | ||
Line 49: | Line 51: | ||
E(X)}{sd(X)} \right) | E(X)}{sd(X)} \right) | ||
</math> | </math> | ||
The left side is the ''predicted Z-score for Y'', and the right side is | |||
''the product of correlation and Z-score of X = x'' | |||
The variance is given by | |||
<math> | |||
Var(Y | X = x) = (1 - \rho^2) Var(Y) | |||
</math> | |||
Due to the range of <math>\rho</math>, the variance of Y given X is | |||
always smaller than the actual variance. The standard deviation is just | |||
rooted that. |
Revision as of 18:08, 18 March 2024
Consider two numerica random variables and . We can measure their covariance.
The correlation of two random variables measures the line dependent between and
Correlation is always between -1 and 1
Bivariate Normal
The bivariate normal (aka. bivariate gaussian) is one special type of continuous random variable.
is bivariate normal if
- The marginal PDF of both X and Y are normal
- For any , the condition PDF of given is Normal
- Works the other way around: Bivariate gaussian means that condition is satisfied
Predicting Y given X
Given bivariate normal, we can predict one variable given another. Let us try estimating the expected Y given X is x
There are three main methods
- Scatter plot approximation
- Joint PDF
- 5 statistics
5 Parameters
We need to know 5 parameters about and
If follows bivariate normal distribution, then we have
The left side is the predicted Z-score for Y, and the right side is the product of correlation and Z-score of X = x
The variance is given by
Due to the range of , the variance of Y given X is always smaller than the actual variance. The standard deviation is just rooted that.