Types of Errors in Statistical Hypothesis Testing

Type I and Type II Errors

In statistical hypothesis testing, we always have a null hypothesis $H_0$, which is the statement to be tested. A hypothesis test has two possible conclusions:

to accept the hypothesis, that is, to conclude that $H_0$ is true,
to reject the hypothesis, that is, to conclude that $H_0$ is false.

However, it is possible that our conclusion is not correct. There are four possible results.

	$H_0$ is True (Ground Truth)	$H_0$ is False (Ground Truth)
Accept $H_0$ (after hypothesis testing)	Correct	Type II Error
Reject $H_0$ (after hypothesis testing)	Type I Error	Correct

We can see that there are two types of errors:

Type I: the hypothesis $H_0$ is correct but we rejected it,
Type II: the hypothesis $H_0$ is wrong but we accepted it.
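
The four outcomes in the table can be written as a small lookup. This is only a minimal sketch; the function name `classify_outcome` and its arguments are made up for illustration and do not come from any library.

```python
def classify_outcome(h0_is_true: bool, h0_rejected: bool) -> str:
    """Map the ground truth and the test decision to one of the four outcomes."""
    if h0_is_true and h0_rejected:
        # H0 is correct but we rejected it.
        return "Type I Error"
    if not h0_is_true and not h0_rejected:
        # H0 is wrong but we accepted it.
        return "Type II Error"
    return "Correct"


# Walk through all four cells of the table.
for h0_is_true in (True, False):
    for h0_rejected in (False, True):
        decision = "Reject H0" if h0_rejected else "Accept H0"
        truth = "H0 is True" if h0_is_true else "H0 is False"
        print(f"{truth:12} | {decision:9} | {classify_outcome(h0_is_true, h0_rejected)}")
```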

What Kind of Mistakes

We all make mistakes. The question is, what kind of mistakes.

If we forget about the name “Null Hypothesis” and consider just any hypothesis, the labels I and II won’t matter. So there is a reason to choose our null hypothesis carefully.

Why is it important that we design the null hypothesis carefully?

For cancer screening, we definitely don’t want to miss any real cancer samples. If we use “the sample is a cancer sample” as the hypothesis, a missed cancer means rejecting a true hypothesis, so we would like to reduce type I errors. However, if we use “the sample is not a cancer sample” as the hypothesis, a missed cancer means accepting a false hypothesis, so we would like to reduce type II errors. In fact, the null hypothesis should be the statement “the sample is not a cancer sample”.
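
To see how the same mistake changes its label when we swap the hypothesis, here is a small sketch of the cancer example. The names `error_type`, `null_says_cancer`, `is_cancer`, and `called_cancer` are hypothetical and only serve this illustration.

```python
def error_type(null_says_cancer: bool, is_cancer: bool, called_cancer: bool) -> str:
    """Label a screening result given which statement plays the role of H0.

    null_says_cancer=True  means H0 is "the sample is a cancer sample".
    null_says_cancer=False means H0 is "the sample is not a cancer sample".
    """
    h0_is_true = (is_cancer == null_says_cancer)
    h0_accepted = (called_cancer == null_says_cancer)
    if h0_is_true and not h0_accepted:
        return "Type I Error"
    if not h0_is_true and h0_accepted:
        return "Type II Error"
    return "Correct"


# A missed cancer: the sample is cancer but we called it healthy.
missed_with_cancer_null = error_type(True, is_cancer=True, called_cancer=False)
missed_with_healthy_null = error_type(False, is_cancer=True, called_cancer=False)
print("H0 = cancer     :", missed_with_cancer_null)
print("H0 = not cancer :", missed_with_healthy_null)
```

The mistake itself is identical in both calls; only the choice of the hypothesis decides whether we call it type I or type II.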

When we choose the threshold for the p-value in a hypothesis test, we are basically managing the risks of the different types of errors.

Here I quote this very wise paragraph from Elements of Statistics II, which is listed in the references.

A P value can be thought of as a descriptive statistic that measures how much support the data give to the null hypothesis: the smaller the P value, the less the support. But what level of support is considered so small that the null hypothesis should be rejected?
– 16.8 in the book Elements of Statistics II by Stephen Bernstein and Ruth Bernstein

We will denote the threshold of the hypothesis test as $p_t$. If $p < p_t$, then we reject our hypothesis. Here $p_t$ is linked to our risk of type I errors, since a true $H_0$ gets rejected whenever $p$ falls below $p_t$ by chance. The larger the threshold we choose, the higher the risk of making type I errors.
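
The link between $p_t$ and the type I error risk can be checked with a quick simulation. The sketch below is only an illustration under assumptions that are not from the text above: the data are drawn from a standard normal distribution so that $H_0$ (mean equal to zero) is true, the test is a two-sided z-test with known standard deviation, and the helper names `two_sided_p_value` and `type_i_error_rate` are made up.

```python
import math
import random


def two_sided_p_value(sample, mu0=0.0, sigma=1.0):
    """p-value of a two-sided z-test for the mean, assuming sigma is known."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    # P(|Z| >= |z|) for a standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2))


def type_i_error_rate(p_t, n_trials=10000, n=20, seed=0):
    """Fraction of trials in which a true H0 is rejected at threshold p_t."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        # H0 is true by construction: the mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p_value(sample) < p_t:
            rejections += 1
    return rejections / n_trials


for p_t in (0.01, 0.05, 0.10):
    print(f"p_t = {p_t:.2f}  simulated type I error rate = {type_i_error_rate(p_t):.3f}")
```

Under these assumptions the simulated rejection rate should come out close to $p_t$ itself, and it grows as the threshold grows.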

In hypothesis testing, it is crucial that we choose the null hypothesis $H_0$ we would like to test so that the type I error is the type of error we care about.

However, I believe that the theory doesn’t prevent us from using a non-null hypothesis if we insist. But the null hypothesis is the most important one when we are dealing with new findings. If you have a different opinion, I would appreciate it if you left a comment.

Planted: 2019-05-31 by L Ma;

References:

Elements of Statistics II by Stephen Bernstein and Ruth Bernstein

L Ma (2019). 'Types of Errors in Statistical Hypothesis Testing', Datumorphism, 05 April. Available at: https://datumorphism.leima.is/wiki/statistical-hypothesis-testing/type-1-error-and-type-2-error/.