Type I and Type II Errors
In statistical hypothesis testing, we always have a null hypothesis $H_0$, which is the statement to be tested. A hypothesis test has two possible conclusions:
- accept the hypothesis, that is, conclude that $H_0$ is true,
- reject the hypothesis, that is, conclude that $H_0$ is false.
However, our conclusion may be wrong. Combining the two possible conclusions with the two possible ground truths gives four possible results.
| | $H_0$ is True (Ground Truth) | $H_0$ is False (Ground Truth) |
|---|---|---|
| Accept $H_0$ (after hypothesis testing) | Correct | Type II Error |
| Reject $H_0$ (after hypothesis testing) | Type I Error | Correct |
We can see that there are two types of errors:
- Type I: the hypothesis $H_0$ is correct but we rejected it,
- Type II: the hypothesis $H_0$ is wrong but we accepted it.
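The four outcomes can be written as a small lookup. This is only an illustrative sketch: the function name `classify_outcome` and its boolean arguments are assumptions made for this example, not part of any library.

```python
def classify_outcome(h0_is_true: bool, reject_h0: bool) -> str:
    """Map (ground truth, decision) to one of the four outcomes."""
    if h0_is_true and reject_h0:
        return "Type I Error"   # H0 is correct but we rejected it
    if not h0_is_true and not reject_h0:
        return "Type II Error"  # H0 is wrong but we accepted it
    return "Correct"


# Enumerate all four combinations of ground truth and decision.
for truth in (True, False):
    for reject in (False, True):
        outcome = classify_outcome(truth, reject)
        print(f"H0 true={truth}, reject H0={reject}: {outcome}")
```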
What Kind of Mistakes
We all make mistakes. The question is, what kind of mistakes.
If we forget about the name “Null Hypothesis” and consider just any hypothesis, the labels I and II would not matter. The distinction becomes meaningful only because we design the null hypothesis deliberately.
Why is it important that we design the null hypothesis carefully?
When we choose the p-value threshold for a hypothesis test, we are essentially managing the risks of the different types of errors.
Here I quote this very wise paragraph from Elements of Statistics II as shown in the reference.
A P value can be thought of as a descriptive statistic that measures how much support the data give to the null hypothesis: the smaller the P value, the less the support. But what level of support is considered so small that the null hypothesis should be rejected?
– 16.8 in the book Elements of Statistics II by Stephen Bernstein and Ruth Bernstein
We denote the threshold of the hypothesis test as $p_t$. If $p < p_t$, we reject $H_0$. The threshold $p_t$ sets our risk of type I errors: for a well-calibrated test, a true $H_0$ is rejected with probability about $p_t$. The larger the threshold we choose, the higher the risk of making type I errors.
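The effect of the threshold can be checked with a small simulation. The sketch below assumes a two-sided z-test of $H_0$: mean $= 0$ with known standard deviation 1. The sample size, the alternative mean of 0.5, and all function names are illustrative choices, not taken from the reference. It estimates how often a true $H_0$ is rejected (type I) and how often a false $H_0$ is accepted (type II) for several values of $p_t$.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value for H0: mean == mu0, assuming sigma is known."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / n ** 0.5)
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def simulate_p_values(true_mean, trials=5000, n=20, seed=0):
    """Run `trials` experiments, each testing H0: mean == 0 on n draws
    from a normal distribution with the given true mean and sigma 1."""
    rng = random.Random(seed)
    return [
        z_test_p_value([rng.gauss(true_mean, 1.0) for _ in range(n)])
        for _ in range(trials)
    ]


def rejection_rate(p_values, p_t):
    """Fraction of experiments with p < p_t, i.e. where H0 is rejected."""
    return sum(p < p_t for p in p_values) / len(p_values)


p_h0_true = simulate_p_values(true_mean=0.0, seed=1)   # H0 holds
p_h0_false = simulate_p_values(true_mean=0.5, seed=2)  # H0 is false

for p_t in (0.01, 0.05, 0.10):
    type_1 = rejection_rate(p_h0_true, p_t)         # rejected a true H0
    type_2 = 1.0 - rejection_rate(p_h0_false, p_t)  # accepted a false H0
    print(f"p_t={p_t:.2f}  type I rate={type_1:.3f}  type II rate={type_2:.3f}")
```

In this setup the estimated type I rate should track $p_t$, while the type II rate moves in the opposite direction as the threshold grows. That is the trade-off being managed when choosing $p_t$.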
In hypothesis testing, it is crucial to formulate the null hypothesis $H_0$ so that the type I error is the error we care about most, since that is the risk the threshold controls directly.