Proportion Estimation: Difference between revisions

From Rice Wiki
(One intermediate revision by the same user not shown)
Line 21: Line 21:
SE = \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}}
SE = \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}}
</math>
</math>
= Assumptions =
We assume that
* A random sample was taken
* <math>y \geq 5</math> and <math>n - y \geq 5</math>
** this condition is rooted in the normal approximation of the binomial distribution


= Wilson-Adjusted CI for p =
Line 28: Line 36:


<math>
\widetilde{p} = \frac{y + 2}{n + 4}
</math>

This essentially adds four observations to the sample: two that have the
trait and two that do not. The adjustment has little effect on large
samples but works "well" with small samples.

The adjusted estimate has standard error

<math>
SE(\widetilde{p}) = \sqrt{\frac{\widetilde{p} (1 - \widetilde{p})}{n + 4}}
</math>
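Remember that the confidence interval is the point estimate plus or
minus the margin of error, where the margin of error is the z score
multiplied by the standard error.
<math>\widetilde{p}</math> is pulled slightly towards <math>0.5</math>,
but results in better CIs for <math>p</math>. The unadjusted interval
tends to contain <math>p</math> less often than the stated confidence
level when <math>n</math> is small or <math>p</math> is near 0 or 1, and
the adjustment counteracts this.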
= Confidence Interval =
We use the ''normal distribution'', since <math>p</math> is bounded
between 0 and 1 and we don't have extra error from estimating extra
parameters, such as those needed for sample means.
Remember that the confidence interval is just the point estimate plus or
minus the margin of error, and the margin of error is just the z score
multiplied by the standard error (since we are using the normal
distribution).
Notably, it is possible to get a bound ''above 1 or below 0''. This
usually happens when the point estimate is close to 0 or 1. In this
case, instead of listing the impossible bounds, we report that they have
been cut off at 0 or 1.


[[Category:Sample Statistics]]

Latest revision as of 17:43, 18 March 2024

Proportion estimation is another common task for sample statistics.

We have sample proportion

<math>
\hat{p} = \frac{y}{n}
</math>

where <math>y</math> is the number of subjects in the sample with a particular trait, and <math>n</math> is the sample size.

We have

and standard error

<math>
SE = \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}}
</math>
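As a rough illustration, the sample proportion and its standard error can be computed as below. This is a minimal Python sketch: the function names and the example counts (12 of 40) are hypothetical and not from the source, and it takes the sample proportion to be the number of subjects with the trait divided by the sample size.

```python
import math


def sample_proportion(y: int, n: int) -> float:
    """Sample proportion: subjects with the trait (y) over sample size (n)."""
    if n <= 0 or not 0 <= y <= n:
        raise ValueError("need n > 0 and 0 <= y <= n")
    return y / n


def standard_error(p_hat: float, n: int) -> float:
    """Standard error sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical data: 12 of 40 sampled subjects have the trait.
p_hat = sample_proportion(12, 40)
se = standard_error(p_hat, 40)
print(p_hat, se)
```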

Assumptions

We assume that

  • A random sample was taken
  • <math>y \geq 5</math> and <math>n - y \geq 5</math>
    • this condition is rooted in the normal approximation of the binomial distribution

Wilson-Adjusted CI for p

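Correcting the sample proportion gives a better-behaved confidence interval, particularly for small samples. We do this with the Wilson-adjusted estimate for <math>p</math>:

<math>
\widetilde{p} = \frac{y + 2}{n + 4}
</math>

This essentially adds four observations to the sample: two that have the trait and two that do not. The adjustment has little effect on large samples but works "well" with small samples.

The adjusted estimate has standard error

<math>
SE(\widetilde{p}) = \sqrt{\frac{\widetilde{p} (1 - \widetilde{p})}{n + 4}}
</math>

Remember that the confidence interval is the point estimate plus or minus the margin of error, where the margin of error is the z score multiplied by the standard error.

<math>\widetilde{p}</math> is pulled slightly towards <math>0.5</math>, but results in better CIs for <math>p</math>. The unadjusted interval tends to contain <math>p</math> less often than the stated confidence level when <math>n</math> is small or <math>p</math> is near 0 or 1, and the adjustment counteracts this.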
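The Wilson-adjusted interval could be sketched in Python as follows. The function name, the example counts (3 of 10), and the default z score of 1.96 (a 95% confidence level) are assumptions made for illustration, not part of the source. The adjustment adds 2 to the count and 4 to the sample size before the usual calculation.

```python
import math


def wilson_adjusted_ci(y: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson-adjusted CI for p: adjusted estimate plus or minus z * SE.

    z = 1.96 is an assumed default corresponding to 95% confidence.
    """
    p_tilde = (y + 2) / (n + 4)                          # adjusted point estimate
    se = math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))    # adjusted standard error
    margin = z * se                                      # margin of error
    return p_tilde - margin, p_tilde + margin


# Hypothetical small sample: 3 of 10 subjects have the trait.
low, high = wilson_adjusted_ci(3, 10)
print(low, high)
```

The interval is centred on the adjusted estimate (5/14 here) rather than on the raw sample proportion (3/10).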

Confidence Interval

We use the normal distribution, since <math>p</math> is bounded between 0 and 1 and we don't have extra error from estimating extra parameters, such as those needed for sample means.

Remember that the confidence interval is just the point estimate plus or minus the margin of error, and the margin of error is just the z score multiplied by the standard error (since we are using the normal distribution).

Notably, it is possible to get a bound above 1 or below 0. This usually happens when the point estimate is close to 0 or 1. In this case, instead of listing the impossible bounds, we report that they have been cut off at 0 or 1.
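A minimal Python sketch of the unadjusted interval with the cut-off applied is given below. The function name, the returned flag, the default z score of 1.96 (95% confidence), and the example counts (1 of 20) are all hypothetical choices for illustration. The example deliberately uses a count below 5, which is the kind of sample where an impossible bound tends to appear.

```python
import math


def proportion_ci(y: int, n: int, z: float = 1.96) -> tuple[float, float, bool]:
    """Normal-approximation CI for p, cut off to the range [0, 1].

    Returns (low, high, was_cut), where was_cut is True if either raw
    bound fell outside [0, 1]. z = 1.96 is an assumed 95% default.
    """
    p_hat = y / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * se                      # margin of error = z score * SE
    raw_low, raw_high = p_hat - margin, p_hat + margin
    low, high = max(raw_low, 0.0), min(raw_high, 1.0)
    was_cut = (low != raw_low) or (high != raw_high)
    return low, high, was_cut


# Hypothetical sample: 1 of 20 subjects has the trait, so the raw lower
# bound is negative and gets cut off at 0.
low, high, was_cut = proportion_ci(1, 20)
print(low, high, was_cut)
```

Returning the flag alongside the bounds makes it possible to report that a bound was cut off instead of silently presenting 0 or 1 as if it were computed.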