Revision as of 01:22, 16 March 2024

A hypothesis test is a technique in which sample data is used to determine whether the data support a particular claim. Hypothesis tests quantify how likely our data is given a particular claim.

This page will focus on usage of hypothesis tests in the context of mean comparison.

Procedure (Mean Comparison)

1. Null and Alternative Hypothesis

To perform a hypothesis test for mean comparison, we need two things:

The null hypothesis $H_0$ is the statement which we assume to be true.
The alternative hypothesis $H_{A}$ is the complement of the null hypothesis.

Mean comparison works with the difference in means

$$\mu_1 - \mu_2$$

As such, there are three sets of hypotheses:

$H_0: \mu_1 - \mu_2 = 0$ vs $H_A: \mu_1 - \mu_2 \neq 0$
$H_0: \mu_1 - \mu_2 \geq 0$ vs $H_A: \mu_1 - \mu_2 < 0$
$H_0: \mu_1 - \mu_2 \leq 0$ vs $H_A: \mu_1 - \mu_2 > 0$

2. Test-Statistic

Next, we need to calculate a test-statistic $t_s$. This measures how much our sample data differ from what $H_0$ predicts. It summarizes our data as a single number on which the hypothesis test is performed.

For mean comparison, the hypothesized difference is 0 (i.e. the means are the same). Therefore, the test-statistic is calculated as follows:

$$t_s = \frac{\bar{y}_1 - \bar{y}_2 - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

In the numerator, 0 is subtracted from the difference in sample means because 0 is the comparison point; all three sets of hypotheses in mean comparison compare against 0.

In the denominator, the difference is divided by its estimated standard error, which is built from the sample standard deviations $s_1$ and $s_2$. This is a surprise tool that will help us later (end of Distribution in 3.)

The larger the magnitude of $t_s$, the more our data differs from $H_0$. Notice that the magnitude increases with the sample mean difference and decreases with the sample variances.
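The formula for $t_s$ can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `welch_t_statistic` and the two data lists are made up for this example and are not part of the original page.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(sample1, sample2):
    """Return t_s = (ybar1 - ybar2 - 0) / sqrt(s1^2/n1 + s2^2/n2)."""
    n1, n2 = len(sample1), len(sample2)
    ybar1, ybar2 = mean(sample1), mean(sample2)
    # statistics.variance is the sample variance (divides by n - 1).
    s1_sq, s2_sq = variance(sample1), variance(sample2)
    standard_error = sqrt(s1_sq / n1 + s2_sq / n2)
    # The hypothesized difference under H_0 is 0.
    return (ybar1 - ybar2 - 0) / standard_error


# Hypothetical measurements, invented purely for illustration.
group1 = [5.1, 4.9, 5.6, 5.8, 6.0]
group2 = [4.2, 4.8, 4.4, 5.0, 4.6]

t_s = welch_t_statistic(group1, group2)
print(f"t_s = {t_s:.3f}")
```

Swapping the two groups flips the sign of $t_s$ but not its magnitude, which matters for the one-tailed hypotheses.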

3. Find P-value

The p-value is the probability of observing data at least as extreme as ours if $H_0$ is in fact true. To find this, we first need to know the sampling distribution of our random variable.

Distribution

In the case of mean comparison, because each sample mean is normally distributed and the two samples are independent, the rule for linear combinations of random variables gives the sampling distribution of $\bar{Y}_1 - \bar{Y}_2$ as

$$(\bar{Y}_1 - \bar{Y}_2) \sim N\left(\mu_1 - \mu_2, \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$
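A quick simulation can make this sampling distribution concrete. The sketch below assumes two independent normal populations with made-up parameters (`mu1`, `sigma1`, `n1`, and so on are illustrative choices, not values from this page) and checks that the simulated differences of sample means have roughly the stated mean and variance.

```python
import random
from statistics import fmean, variance

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population parameters and sample sizes.
mu1, sigma1, n1 = 10.0, 2.0, 10
mu2, sigma2, n2 = 8.0, 3.0, 15


def difference_of_sample_means():
    """Draw one sample from each population and return ybar1 - ybar2."""
    sample1 = [random.gauss(mu1, sigma1) for _ in range(n1)]
    sample2 = [random.gauss(mu2, sigma2) for _ in range(n2)]
    return fmean(sample1) - fmean(sample2)


differences = [difference_of_sample_means() for _ in range(10000)]

theoretical_mean = mu1 - mu2
theoretical_variance = sigma1**2 / n1 + sigma2**2 / n2
empirical_mean = fmean(differences)
empirical_variance = variance(differences)

print(f"mean:     theory {theoretical_mean:.3f}, simulated {empirical_mean:.3f}")
print(f"variance: theory {theoretical_variance:.3f}, simulated {empirical_variance:.3f}")
```

The simulated values will not match the theoretical ones exactly; they should only get closer as the number of repetitions grows.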

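Because the population variances $\sigma_1^2$ and $\sigma_2^2$ are unknown and are estimated by the sample variances $s_1^2$ and $s_2^2$, the standardized difference $t_s$ follows (approximately) a t-distribution rather than a normal distribution.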

We are not going to derive it, but the degrees of freedom in this case are

$$df = \upsilon = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}}$$

where $\upsilon$ is rounded down when using the t-table.
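The degrees-of-freedom formula is tedious by hand, so here is a small sketch of it in Python. The function name `welch_degrees_of_freedom` and the data are hypothetical, chosen only to show the arithmetic and the rounding-down step.

```python
from math import floor
from statistics import variance


def welch_degrees_of_freedom(sample1, sample2):
    """Return the (unrounded) degrees of freedom from the formula above."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # s_1^2 / n_1
    v2 = variance(sample2) / n2  # s_2^2 / n_2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


# Hypothetical measurements, invented purely for illustration.
group1 = [5.1, 4.9, 5.6, 5.8, 6.0]
group2 = [4.2, 4.8, 4.4, 5.0, 4.6]

nu = welch_degrees_of_freedom(group1, group2)
nu_table = floor(nu)  # round down before looking up the t-table
print(f"df = {nu:.3f}, use {nu_table} in the t-table")
```

The result always lies between the smaller of $n_1 - 1$ and $n_2 - 1$ and the pooled value $n_1 + n_2 - 2$, which is a handy sanity check.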

Remember how the test-statistic has the sample standard deviations in the denominator? This is so that we can use the t-distribution to calculate the probability! Now that we know the degrees of freedom and the test-statistic to compare against, we can calculate the p-value.

P-value

In the case of mean comparison, we have the following p-values:

For $H_A: \mu_1 - \mu_2 \neq 0$, the p-value is $2P(t > |t_s|)$
- Two tails
For $H_A: \mu_1 - \mu_2 > 0$, the p-value is $P(t > t_s)$
- Upper tail
For $H_A: \mu_1 - \mu_2 < 0$, the p-value is $P(t < t_s)$
- Lower tail
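The three cases above can be sketched in Python without a t-table. To stay within the standard library, this example approximates the t-distribution's cumulative probability by numerically integrating its density with Simpson's rule; that integration is a stand-in chosen for this sketch, and in practice a statistics library would normally supply it. The function names and the example inputs are hypothetical.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Approximate P(t <= x) with Simpson's rule (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= t <= |x|)
    # The density is symmetric around 0, so half the mass lies on each side.
    return 0.5 + area if x >= 0 else 0.5 - area


def p_value(t_s, df, alternative):
    """p-value for H_A: 'two-sided' (!= 0), 'greater' (> 0) or 'less' (< 0)."""
    if alternative == "two-sided":
        return 2 * (1 - t_cdf(abs(t_s), df))  # 2 P(t > |t_s|)
    if alternative == "greater":
        return 1 - t_cdf(t_s, df)  # P(t > t_s)
    if alternative == "less":
        return t_cdf(t_s, df)  # P(t < t_s)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


# Hypothetical test-statistic and degrees of freedom.
for alternative in ("two-sided", "greater", "less"):
    print(alternative, round(p_value(2.1, 7, alternative), 4))
```

For a positive $t_s$, the two-tailed p-value is exactly twice the upper-tail one, and the upper-tail and lower-tail p-values add up to 1.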

The smaller the p-value, the less likely it is to observe data at least as extreme as ours if $H_0$ is true, meaning that our data is hard to reconcile with $H_0$.

4. Conclusion

We decide a cutoff point for our p-values, typically $\alpha = 0.1$, $0.05$, or $0.01$, called the level of significance.

If $p < \alpha$, our data supports $H_A$, therefore $H_0$ is rejected. Otherwise, we fail to reject $H_0$.
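The decision rule is simple enough to write down directly. The sketch below uses a hypothetical helper `conclude` with a default of $\alpha = 0.05$ (an assumption for this example) to show that the same p-value can lead to different conclusions at different levels of significance.

```python
def conclude(p_value, alpha=0.05):
    """Apply the rule: reject H_0 only when p < alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    if p_value < alpha:
        return "reject H_0"
    return "fail to reject H_0"


# Hypothetical p-value checked against the three usual cutoffs.
for alpha in (0.1, 0.05, 0.01):
    print(f"alpha = {alpha}: {conclude(0.03, alpha)}")
```

Note that the level of significance should be chosen before looking at the p-value, not picked afterwards to get the desired conclusion.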

A $(1 - \alpha)$ CI for $\mu_1 - \mu_2$ that covers $0$ implies that there is no significant difference at level $\alpha$ in the two-tailed test, as it is plausible for the population means to be equal.

