Hypothesis Tests

Reading

Practice Problems

4.6.3 (Page 209)
4.17, 4.18, 4.19, 4.20, 4.21, 4.22, 4.24, 4.25, 4.26, 4.27, 4.28, 4.31, 4.32

Notes

Hypothesis Tests in general

Hypothesis testing is ubiquitous in applied statistics and related disciplines. Almost every published study reports the results of one or more hypothesis tests.

We start by describing some standard terminology associated with hypothesis tests.

Null and Alternative Hypotheses

In every hypothesis test there are two statements, the null hypothesis and the alternative hypothesis.

It is vital to keep in mind that the conclusion from the test must be one of the following: either we reject the null hypothesis in favor of the alternative, or we fail to reject the null hypothesis.

You can never prove the null hypothesis. You can only reject it in favor of the alternative.

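For example, say we want to test a new diet pill to see if it is effective. The null hypothesis would be that the pill has no effect, and the alternative hypothesis would be that the pill is effective.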

Types of Errors

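In every hypothesis test there are two kinds of errors that we might make: a Type I error, where we reject the null hypothesis even though it is true, and a Type II error, where we fail to reject the null hypothesis even though it is false.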

The probabilities of making those errors are related to how we set up our test and what parameters and bounds we use. We will see some of this in a bit.

We have seen these types of errors before, when we were looking at medical tests.

As a general principle, we want to keep both of those errors to a minimum. Sometimes, however, one of the two types is more critical, and we might be willing to accept a higher chance of the other type of error in order to make the more critical one less likely.

Hypothesis Test for population mean

We will now focus specifically on tests for the population mean \(\mu\). We assume that we are in the IID setting, and that \(n\) is sufficiently large for us to use the Central Limit Theorem. Therefore we are working under the assumption that:

\[\bar x \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)\]
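As a quick sanity check of this assumption, here is a minimal simulation sketch. The uniform population, the seed, the sample size, and the number of repetitions are all illustrative choices that do not come from these notes. It draws many samples of size \(n\) and compares the spread of the sample means with \(\sigma/\sqrt{n}\).

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable (arbitrary choice)

n = 50        # sample size (illustrative)
reps = 2000   # number of simulated samples (illustrative)

# Assumed population: uniform on [0, 1], with mean 1/2 and sigma = sqrt(1/12).
mu = 0.5
sigma = math.sqrt(1 / 12)

# Draw many samples of size n and record the mean of each one.
sample_means = [
    statistics.fmean(random.random() for _ in range(n))
    for _ in range(reps)
]

observed_center = statistics.fmean(sample_means)
observed_spread = statistics.stdev(sample_means)
predicted_spread = sigma / math.sqrt(n)

print(f"mean of sample means:   {observed_center:.4f} (mu = {mu})")
print(f"spread of sample means: {observed_spread:.4f} "
      f"(sigma/sqrt(n) = {predicted_spread:.4f})")
```

The two printed pairs should be close to each other, even though the population itself is far from normal.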

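There are three types of hypothesis tests we may encounter, where \(\mu_0\) denotes the value that the mean is being compared against:

\[\begin{cases}H_0:&\mu = \mu_0\\ H_a:&\mu \neq \mu_0\end{cases}\qquad \begin{cases}H_0:&\mu \geq \mu_0\\ H_a:&\mu < \mu_0\end{cases}\qquad \begin{cases}H_0:&\mu \leq \mu_0\\ H_a:&\mu > \mu_0\end{cases}\]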

Depending on the wording of the question, you have to choose the appropriate test. The first one, with the so-called “two-sided alternative”, is the default choice.

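The logic behind the hypothesis tests goes like this: we assume that the null hypothesis is true and compute how likely it would be, under that assumption, to observe a sample mean at least as extreme (in the direction of the alternative) as the one we actually observed. This probability is called the \(P\)-value. If the \(P\)-value is very small, the observed data would be very unlikely under the null hypothesis, and we reject it in favor of the alternative.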

How we compute the \(P\)-value depends on which of the three types of hypothesis tests we are considering. We will phrase our answers in terms of \(z\).
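For reference, here is a sketch of the three standard computations. The notation is introduced here for illustration: \(Z\) is a standard normal variable, \(\mu_0\) is the value of \(\mu\) named in the null hypothesis, and \(z = \frac{\bar x - \mu_0}{\sigma/\sqrt{n}}\) is the standardized sample mean.

\[P\text{-value} = \begin{cases}2\,P(Z \geq |z|) & \text{if } H_a: \mu \neq \mu_0\\ P(Z \leq z) & \text{if } H_a: \mu < \mu_0\\ P(Z \geq z) & \text{if } H_a: \mu > \mu_0\end{cases}\]

In each case the \(P\)-value is the area under the standard normal curve in the direction, or directions, that the alternative hypothesis points to.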

An Example

The college claims that the average GPA of our students is at least \(3.1\). Suppose we know that the standard deviation of all Hanover GPAs is \(0.5\). We take a sample of \(50\) students and find the sample mean to be only \(2.9\). We would like to see if this is enough evidence to reject the college’s claim. We therefore set up a hypothesis test, one-sided since the college claimed “at least”:

\[\begin{cases}H_0:&\mu \geq 3.1\\ H_a:&\mu < 3.1\end{cases}\]

To carry out the test, we would compute

\[z = \frac{\bar x - \mu}{\sigma/\sqrt{n}} = \frac{2.9-3.1}{0.5/\sqrt{50}} = -2.83\]

We then compute the area below that value and find it to be \(0.0023\). Therefore there is less than a \(0.25\%\) chance that we would see such a low sample mean if the college’s claim were correct. As this is extremely unlikely, we can consider it very strong evidence against the college’s claim (the null hypothesis). We would therefore reject the null hypothesis, and instead suggest that the true mean is smaller. In fact, we may provide a confidence interval.
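The arithmetic in this example can be checked with a short Python sketch. It uses only the standard library, and the variable names are illustrative. It takes \(\sigma = 0.5\), the value that appears in the computation of \(z\) above.

```python
from math import sqrt
from statistics import NormalDist

x_bar = 2.9   # sample mean
mu_0 = 3.1    # boundary value of the null hypothesis (the college's claim)
sigma = 0.5   # population standard deviation, as used in the z computation
n = 50        # sample size

# Standardize the sample mean under the null hypothesis.
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Lower-tailed test: the P-value is the area below z under the standard normal.
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}")
print(f"P-value = {p_value:.4f}")
```

Running it gives \(z \approx -2.83\) and a \(P\)-value of about \(0.0023\), matching the values in the text.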

Significance Level

When setting up a hypothesis test, it is important to agree beforehand on how small a \(P\)-value we would consider strong enough evidence. This threshold is called the significance level and is denoted by \(\alpha\). For example, we might decide that we want \(\alpha = 0.05\), meaning that we will reject the null hypothesis whenever the \(P\)-value we compute is less than \(0.05\).

A significance level of \(5\%\) is fairly typical.
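To illustrate how the choice of \(\alpha\) affects the decision, the sketch below applies this rule to the \(P\)-value of \(0.0023\) from the GPA example. The helper name `decide` and the extra levels \(0.01\) and \(0.001\) are hypothetical choices made for illustration.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 exactly when the P-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


p_value = 0.0023  # P-value from the GPA example

for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: {decide(p_value, alpha)}")
```

With this \(P\)-value we reject the null hypothesis at the \(5\%\) and \(1\%\) levels, but not at the much stricter \(0.1\%\) level. This is why the level should be fixed before looking at the data.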