Bivariate: Difference between revisions

From Rice Wiki
No edit summary
No edit summary
Line 10: Line 10:
Cor(X, Y) = \rho = \frac{Cov(X,Y)}{sd(X) sd(Y)}
Cor(X, Y) = \rho = \frac{Cov(X,Y)}{sd(X) sd(Y)}
</math>
</math>
Correlation is always between -1 and 1


= Bivariate Normal =
= Bivariate Normal =
Line 42: Line 44:
<math>E(X), sd(X), E(Y), sd(Y), \rho</math>
<math>E(X), sd(X), E(Y), sd(Y), \rho</math>


If <math>X, Y</math> follows bivariatte normal distribution, then we
If <math>X, Y</math> follows bivariate normal distribution, then we
have
have


Line 49: Line 51:
E(X)}{sd(X)} \right)
E(X)}{sd(X)} \right)
</math>
</math>
The left side is the ''predicted Z-score for Y'', and the right side is
''the product of correlation and Z-score of X = x''
The variance is given by
<math>
Var(Y | X = x) = (1 - \rho^2) Var(Y)
</math>
Due to the range of <math>\rho</math>, the variance of Y given X is
always smaller than the actual variance. The standard deviation is just
rooted that.

Revision as of 18:08, 18 March 2024

Consider two numerica random variables and . We can measure their covariance.

The correlation of two random variables measures the line dependent between and

Correlation is always between -1 and 1

Bivariate Normal

The bivariate normal (aka. bivariate gaussian) is one special type of continuous random variable.

is bivariate normal if

  1. The marginal PDF of both X and Y are normal
  2. For any , the condition PDF of given is Normal
    • Works the other way around: Bivariate gaussian means that condition is satisfied

Predicting Y given X

Given bivariate normal, we can predict one variable given another. Let us try estimating the expected Y given X is x

There are three main methods

  • Scatter plot approximation
  • Joint PDF
  • 5 statistics

5 Parameters

We need to know 5 parameters about and

If follows bivariate normal distribution, then we have

The left side is the predicted Z-score for Y, and the right side is the product of correlation and Z-score of X = x

The variance is given by

Due to the range of , the variance of Y given X is always smaller than the actual variance. The standard deviation is just rooted that.