Type II Error

A type II error (type 2 error) is one of two types of statistical errors that can result from a hypothesis test (the other being a type I error). Technically speaking, a type II error occurs when a test fails to reject a false null hypothesis (informally, the false null hypothesis is "accepted"). This outcome is also known as a false negative.

In a typical hypothesis test, the null hypothesis states that there is no real difference between the experimental group and the control group in the quantity being tested, so any difference observed is attributable to chance (sampling variation). The alternative hypothesis, by contrast, states that there is a real difference.

As a result of this setup, there are four possible outcomes from any hypothesis test:

  1. we reject a false null hypothesis,
  2. we reject a true null hypothesis,
  3. we fail to reject (informally, "accept") a true null hypothesis,
  4. or we fail to reject a false null hypothesis.

1 and 3 are correct inferences; 2 is a type I error (a false positive), and 4 is a type II error (a false negative).
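The four outcomes above can be written as a small lookup. This is only an illustrative sketch: the function name classify_outcome and its two boolean parameters are hypothetical, and in practice the truth of the null hypothesis is not known when the test is run.

```python
def classify_outcome(null_is_true: bool, null_rejected: bool) -> str:
    """Name the outcome of a hypothesis test.

    null_is_true  -- whether the null hypothesis actually holds (unknown in practice)
    null_rejected -- whether the test rejected the null hypothesis
    """
    if null_rejected:
        if null_is_true:
            return "type I error (false positive)"   # outcome 2
        return "correct rejection"                   # outcome 1
    if null_is_true:
        return "correct non-rejection"               # outcome 3
    return "type II error (false negative)"          # outcome 4


# Walk through all four combinations.
for null_is_true in (False, True):
    for null_rejected in (True, False):
        print(null_is_true, null_rejected, classify_outcome(null_is_true, null_rejected))
```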

When Type II Errors are Acceptable

For a fixed sample size, lowering the probability of one type of error generally raises the probability of the other, so it is not possible to entirely eliminate both type I and type II errors. Individuals performing experiments must therefore decide which type of error is more acceptable to them and structure their experiments to minimize the less acceptable one as much as possible.

As an example of when a type II error might be more acceptable than a type I error, let’s look at email spam checking. The alternative hypothesis is that the email is spam, and thus the null hypothesis is that the email is not spam. Committing a type I error means marking a legitimate email as spam, preventing its normal delivery. Committing a type II error means a spam email being marked as legitimate and sent to the user’s inbox.

A significant number of type II errors points to an ineffective spam filter, but a significant number of type I errors means the spam filter is overall doing more harm than good by preventing users from seeing legitimate communications. Therefore, the goal of email spam filtering systems should be to bring down the number of type II errors while keeping the number of type I errors at near-zero.
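The trade-off can be sketched with a toy threshold-based filter. Everything here is an assumption made for illustration: the scores and labels are invented, the function count_errors is hypothetical, and real spam filters are considerably more involved. An email is flagged as spam (the null hypothesis "not spam" is rejected) when its score reaches the threshold.

```python
def count_errors(scores, is_spam, threshold):
    """Return (type_i, type_ii) error counts for a given spam-score threshold."""
    # Type I: a legitimate email is flagged as spam.
    type_i = sum(1 for s, spam in zip(scores, is_spam) if s >= threshold and not spam)
    # Type II: a spam email is allowed through to the inbox.
    type_ii = sum(1 for s, spam in zip(scores, is_spam) if s < threshold and spam)
    return type_i, type_ii


# Invented example data: one score and one true label per email.
scores = [0.05, 0.20, 0.40, 0.55, 0.60, 0.85, 0.95]
is_spam = [False, False, False, True, False, True, True]

print(count_errors(scores, is_spam, 0.5))  # (1, 0): one legitimate email blocked
print(count_errors(scores, is_spam, 0.7))  # (0, 1): one spam email delivered
print(count_errors(scores, is_spam, 0.9))  # (0, 2): two spam emails delivered
```

Raising the threshold removes the type I error in this toy data at the price of letting more spam through, which matches the preference described above for a spam filter.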

By contrast, in a biometric security system, such as a fingerprint scanner on a mobile phone or facial recognition software on a personal computer, the alternative hypothesis is “the person being scanned is not on the list of authorized users” and thus the null hypothesis is “the person being scanned is on the list of authorized users”.

In this situation, a significant number of type II errors would mean an insecure device, because unauthorized people would be let in, whereas a significant number of type I errors would mean only the minor inconvenience of authorized users needing to demonstrate their authorization another way (such as with a password or PIN). Therefore, the system should be designed to bring down the number of type I errors while keeping the number of type II errors at near-zero.

How to Minimize Type II Errors

Because both error types arise from the design of the test, reducing either one requires altering the test. To reduce the number of type I errors, the simplest approach is to lower the significance level (the threshold the p-value must fall below for the null hypothesis to be rejected), which is equivalent to raising the confidence level. To reduce the number of type II errors instead, either increase the sample size (or run the experiment for a longer time, in some cases), or raise the significance level, which comes at the cost of more type I errors.
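To see how sample size and significance level affect the type II error rate, here is a minimal sketch under simplifying assumptions not taken from the text: a one-sided, one-sample z-test with a known standard deviation of 1, and a true effect expressed in standard deviation units. The function name type_ii_error_rate is hypothetical. Under those assumptions, the probability of a type II error is the standard normal CDF evaluated at the critical value minus the effect size times the square root of the sample size.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_rate(effect_size: float, n: int, alpha: float) -> float:
    """Probability of a type II error for a one-sided one-sample z-test.

    Assumes a known standard deviation of 1, so effect_size is the true
    mean shift in standard deviation units; alpha is the significance level.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_critical = std_normal.inv_cdf(1 - alpha)  # rejection cutoff for the test statistic
    # Under the alternative, the test statistic is centered at effect_size * sqrt(n).
    return std_normal.cdf(z_critical - effect_size * sqrt(n))


# Larger samples and larger significance levels both lower the type II error rate.
for n in (20, 40, 80):
    for alpha in (0.01, 0.05, 0.10):
        beta = type_ii_error_rate(0.5, n, alpha)
        print(f"n={n:3d} alpha={alpha:.2f} type II error rate={beta:.3f}")
```

Other tests (two-sided, unknown variance, two-sample) need different formulas, so the numbers from this sketch should be read only as an illustration of the direction of the effect.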