Proportion Estimation: Difference between revisions

From Rice Wiki
<math>
SE = \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}}
</math>
= Assumptions =
We assume that
* A random sample was taken
* <math>y \geq 5</math> and <math>n - y \geq 5</math>
** rooted in normal approximation of binomial


= Wilson-Adjusted CI for p =

<math>
\widetilde{p} = \frac{y + 2}{n + 4}
</math>
with standard error
<math>
SE(\widetilde{p}) = \sqrt{\frac{\widetilde{p} (1 - \widetilde{p})}{n + 4}}
</math>
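Remember that the confidence interval is calculated as the point
estimate plus or minus the margin of error, here using
<math>\widetilde{p}</math> and <math>SE(\widetilde{p})</math>.
<math>\widetilde{p}</math> is pulled slightly towards <math>0.5</math>,
but gives CIs for <math>p</math> whose actual coverage is closer to the
stated confidence level, especially when <math>n</math> is small or
<math>p</math> is near 0 or 1. Adding 2 successes and 2 failures
approximately recentres the interval the way the Wilson score interval
does at 95% confidence, where <math>z^2 \approx 4</math>.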
= Confidence Interval =
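We use the ''normal distribution'' (a z score rather than a t score)
because the sample proportion is approximately normal when the
assumptions above hold, and the standard error depends only on the
estimated proportion, so no separate variance parameter has to be
estimated.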
Remember that the confidence interval is just the point estimate plus or
minus the margin of error, and the margin of error is just the z score
multiplied by the standard error (since we are using the normal
distribution).
Notably, it is possible to get a bound ''above 1 or below 0''. This
usually happens when the point estimate is close to 0 or 1. In this
case, instead of listing the impossible bounds, we report that they have
been cut off.


[[Category:Sample Statistics]]

Revision as of 02:32, 16 March 2024

Proportion estimation is another common task for sample statistics.

We have sample proportion

<math>\hat{p} = \frac{y}{n}</math>

where <math>y</math> is the number of subjects in the sample with a particular trait, and <math>n</math> is the sample size.

We have

and standard error

<math>SE = \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}}</math>
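
As a minimal sketch, the sample proportion y / n and its standard error sqrt(p-hat (1 - p-hat) / n) could be computed as follows. The function names and the counts (12 of 40) are made up for illustration and are not part of the original notes.

```python
import math

def sample_proportion(y: int, n: int) -> float:
    """Return p-hat = y / n, where y counts subjects with the trait."""
    if n <= 0 or not 0 <= y <= n:
        raise ValueError("need n > 0 and 0 <= y <= n")
    return y / n

def standard_error(p_hat: float, n: int) -> float:
    """Return sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical data: 12 of 40 subjects have the trait
p_hat = sample_proportion(12, 40)
se = standard_error(p_hat, 40)
print(p_hat, se)
```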

Assumptions

We assume that

  • A random sample was taken
  • <math>y \geq 5</math> and <math>n - y \geq 5</math>
    • rooted in normal approximation of binomial
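
The count condition (at least 5 subjects with the trait and at least 5 without) can be checked before computing an interval. This is only a sketch; the function name and the threshold parameter are assumptions for illustration, and random sampling itself cannot be checked from the counts.

```python
def normal_approx_ok(y: int, n: int, threshold: int = 5) -> bool:
    """Check y >= threshold and n - y >= threshold.

    y is the number of subjects with the trait, n the sample size.
    """
    return y >= threshold and (n - y) >= threshold

# Hypothetical counts
print(normal_approx_ok(12, 40))  # both counts are at least 5
print(normal_approx_ok(3, 40))   # too few subjects with the trait
```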

Wilson-Adjusted CI for p

Adjusting the sample proportion improves the coverage of the confidence interval. We do this with the Wilson-adjusted estimate for <math>p</math>

<math>\widetilde{p} = \frac{y + 2}{n + 4}</math>

with standard error

<math>SE(\widetilde{p}) = \sqrt{\frac{\widetilde{p} (1 - \widetilde{p})}{n + 4}}</math>

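Remember that the confidence interval is calculated as the point estimate plus or minus the margin of error, here using <math>\widetilde{p}</math> and <math>SE(\widetilde{p})</math>.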

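<math>\widetilde{p}</math> is pulled slightly towards <math>0.5</math>, but gives CIs for <math>p</math> whose actual coverage is closer to the stated confidence level, especially when <math>n</math> is small or <math>p</math> is near 0 or 1. Adding 2 successes and 2 failures approximately recentres the interval the way the Wilson score interval does at 95% confidence, where <math>z^2 \approx 4</math>.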
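
A minimal sketch of the Wilson-adjusted estimate (y + 2) / (n + 4) and its standard error, which uses n + 4 in the denominator. The function name and the counts are hypothetical; the example shows the adjusted estimate landing between the raw proportion and 0.5.

```python
import math

def wilson_adjusted(y: int, n: int) -> tuple[float, float]:
    """Return (p_tilde, se) for the Wilson-adjusted estimate."""
    # Add 2 successes and 2 failures before taking the proportion
    p_tilde = (y + 2) / (n + 4)
    se = math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde, se

# Hypothetical data: 12 of 40 subjects have the trait
p_hat = 12 / 40
p_tilde, se_tilde = wilson_adjusted(12, 40)
print(p_hat, p_tilde, se_tilde)
```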

Confidence Interval

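We use the normal distribution (a z score rather than a t score) because the sample proportion is approximately normal when the assumptions above hold, and the standard error depends only on the estimated proportion, so no separate variance parameter has to be estimated.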

Remember that the confidence interval is just the point estimate plus or minus the margin of error, and the margin of error is just the z score multiplied by the standard error (since we are using the normal distribution).

Notably, it is possible to get a bound above 1 or below 0. This usually happens when the point estimate is close to 0 or 1. In this case, instead of listing the impossible bounds, we report that they have been cut off.
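
The interval calculation, including the cut-off at 0 and 1, might be sketched as below. The function name, the returned flag, and the counts are assumptions for illustration. The z score comes from Python's standard-library statistics.NormalDist, and the second example uses a Wilson-adjusted estimate close to 1 so that the upper bound is cut off.

```python
import math
from statistics import NormalDist

def confidence_interval(estimate: float, se: float, confidence: float = 0.95):
    """Return (lower, upper, was_clipped) for estimate +/- z * se."""
    # Two-sided z score, roughly 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * se
    lower, upper = estimate - margin, estimate + margin
    # Record whether an impossible bound had to be cut off at 0 or 1
    was_clipped = lower < 0 or upper > 1
    return max(0.0, lower), min(1.0, upper), was_clipped

# Hypothetical data: 12 of 40 subjects, unadjusted estimate
p_hat = 12 / 40
se = math.sqrt(p_hat * (1 - p_hat) / 40)
lower, upper, clipped = confidence_interval(p_hat, se)

# Hypothetical data: 39 of 40 subjects, Wilson-adjusted estimate near 1
p_tilde = (39 + 2) / (40 + 4)
se_tilde = math.sqrt(p_tilde * (1 - p_tilde) / (40 + 4))
lower2, upper2, clipped2 = confidence_interval(p_tilde, se_tilde)
print(lower, upper, clipped)
print(lower2, upper2, clipped2)
```

Returning a flag rather than silently truncating keeps enough information to report that a bound was cut off.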