

False Negative

A false negative result indicates that “the change being tested has not improved the key metric significantly when in fact, the change generally has a positive impact on the underlying behavior.”

For online experiments, we test whether a change is improving a metric. In the context of A/B testing, a false negative declares that the treatment (i.e. the new way of doing things) did not lead to an outcome different from the control (the status quo), when it did. A false negative is also known as a type II error, or a mistaken acceptance of the null hypothesis.

Understanding False Negative

To understand why a false negative is the “mistaken acceptance of the null hypothesis,” it’s useful to remember that a statistical test starts with the assumption that there is no difference between the control and treatment (the “A” and “B” in an A/B test). The goal is then to disprove this “null hypothesis” by accumulating enough evidence to observe a difference greater than what random chance would introduce. However, even a properly run test carries a small chance of reaching the wrong conclusion (commonly set at 5 or 10 percent). If we don’t reject the null hypothesis in our findings when in fact there is a difference in the real world, we have mistakenly accepted the null hypothesis, committed a type II error, and found a false negative result.
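To make this concrete, here is a minimal simulation sketch in Python. The numbers are illustrative assumptions, not from this article: a control that converts at 10%, a treatment that truly converts at 11%, and only 500 users per arm, tested with statsmodels. It estimates how often a test of that size misses the real uplift:

```python
# A minimal sketch (illustrative numbers) of how often a real uplift
# goes undetected at a small sample size, i.e. the false negative rate
# of an underpowered A/B test.
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

rng = np.random.default_rng(0)

p_control, p_treatment = 0.10, 0.11  # the treatment truly improves the rate
n_per_arm = 500                      # deliberately small sample per arm
alpha = 0.05                         # conventional significance level

misses = 0
n_sims = 2000
for _ in range(n_sims):
    control = rng.binomial(n_per_arm, p_control)
    treatment = rng.binomial(n_per_arm, p_treatment)
    # Two-proportion z-test; the null hypothesis is "no difference".
    _, p_value = proportions_ztest([treatment, control], [n_per_arm, n_per_arm])
    if p_value >= alpha:
        misses += 1  # failed to detect a real effect: a false negative

print(f"False negative rate at n={n_per_arm}/arm: {misses / n_sims:.0%}")
```

With these numbers the test misses the genuine improvement most of the time: each miss is exactly the type II error described above.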

The best way to avoid a false negative result is to ensure that your experiment has sufficient power before you conduct it. A power analysis helps you determine how many observations (i.e., users passing through your test) are needed to reliably detect a given amount of difference. Power analysis lets you calculate the minimum likely detectable effect (MLDE) for a given sample size, or, conversely, the sample size needed to reliably measure a given MLDE.
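As an illustration, here is a minimal power-analysis sketch in Python using statsmodels. The baseline and target conversion rates are hypothetical; it solves for the sample size needed to detect a one-point lift at 80% power:

```python
# A minimal power-analysis sketch (illustrative numbers): how many
# users per arm are needed to reliably detect a lift from a 10% to an
# 11% conversion rate?
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline, target = 0.10, 0.11                          # assumed conversion rates
effect_size = proportion_effectsize(target, baseline)  # Cohen's h

analysis = NormalIndPower()
n_per_arm = analysis.solve_power(
    effect_size=effect_size,
    alpha=0.05,   # 5% false positive (type I) rate
    power=0.80,   # i.e., accept a 20% false negative (type II) rate
    ratio=1.0,    # equal traffic split between control and treatment
)
print(f"Required sample size: ~{n_per_arm:.0f} users per arm")
```

With these illustrative numbers the answer is on the order of several thousand users per arm, which is why small tests so often miss small lifts.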

Definition of False Negative

Errors are sometimes present in an experiment or A/B-testing process, so it is essential to prevent them or keep their effect negligible. Sometimes, no matter how much precaution is taken, errors still slip in. These are known as random errors, and they can generate false positive or false negative results, whether in an online experiment or a medical test.

In an online experiment, a false positive means that a team releases a change that isn’t effective; a false negative means they don’t release a change that is effective. A bad release (a false positive) can be corrected or improved later. Not releasing a change that is effective, however, can demoralize a team and discourage it from trying similar ideas.

Technical Difficulties May Cause False Negatives

Technical difficulties, such as an inappropriate setup, misconfiguration, or faulty software involved in a test, may cause false negatives. So can a lack of know-how about the test procedures.

False Negatives Are Misleading

Every test aims to produce an accurate result, because that result determines the action that follows. A false negative is misleading because the wrong line of action follows from it. For this reason, errors that might lead to this type of result must be reduced to a negligible minimum. In a medical setting, the authenticity of a result safeguards both the patient and the people in their environment.

Switch It On With Split

The Split Feature Data Platform™ gives you the confidence to move fast without breaking things. Set up feature flags and safely deploy to production, controlling who sees which features and when. Connect every flag to contextual data, so you can know whether your features are making things better or worse and act without hesitation. Effortlessly conduct feature experiments like A/B tests without slowing down. Whether you’re looking to increase your releases, decrease your MTTR, or ignite your dev team without burning them out, Split is both a feature management platform and a partner in revolutionizing the way work gets done. Schedule a demo or explore our feature flag solution at your own pace to learn more.


Want to Dive Deeper?

We have a lot to explore that can help you understand feature flags. Learn more about benefits, use cases, and real-world applications that you can try.

Create Impact With Everything You Build

We’re excited to accompany you on your journey as you build faster, release safer, and launch impactful products.

Want to see how Split can measure impact and reduce release risk? 

Book a demo