A/A Testing

A/A testing involves running an A/B testing process with two identical versions in order to ensure the testing process is in working order.

A/B Testing

A/B testing compares two versions of something. The versions can be very similar, differing only in a button color, or very different, with a total change in the way a feature behaves.

A/B/n Testing

A/B/n testing is the process of A/B testing with more than two different versions. The “n” is not a third test; it refers to any number of additional versions.



Canary Deployment

A canary deployment, or canary release, is a deployment pattern that allows you to roll out new code/features to a subset of users as an initial test.
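One common way to pick the canary subset is to hash each user ID into a stable bucket, so the same user stays in or out of the canary as the rollout percentage grows. The sketch below assumes that approach; the function names, the salt, and the user IDs are hypothetical, not part of any particular deployment tool.

```python
import hashlib


def rollout_bucket(user_id: str, salt: str = "canary-demo") -> int:
    """Map a user to a stable bucket in the range 0..9999."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) % 10_000


def in_canary(user_id: str, rollout_percent: float) -> bool:
    """True if the user falls inside the first `rollout_percent` of buckets."""
    return rollout_bucket(user_id) < rollout_percent * 100


# Hypothetical user population for illustration.
users = [f"user-{i}" for i in range(10_000)]
canary_users = [u for u in users if in_canary(u, 5)]
share = len(canary_users) / len(users)
```

Because the bucket depends only on the user ID and the salt, raising the percentage from 5 to 25 keeps every existing canary user in the canary and only adds new ones.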

Client-Side Testing

Client-side testing refers to any type of testing, commonly A/B testing, multivariate testing, or multi-armed bandit testing, that occurs in the user’s browser.


Dark Launch

Every product manager dark launches (or should). In a dark launch, features are released to a subset of your users to enable testing before a wider deployment.

Data Pipeline

A data pipeline automates the flow of data from one point to another. It defines how data is collected and in what schema it should be collected.


Event Stream

An event stream is a series of data points that flow into or out of a system continuously, rather than in batches.
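To make the contrast with batch processing concrete, here is a minimal sketch that consumes events one at a time and updates a running count after each one. The event shape (a dict with a "type" key) and the function names are assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator


def running_counts(events: Iterable[dict]) -> Iterator[Dict[str, int]]:
    """Yield an updated count per event type after every incoming event."""
    counts: Dict[str, int] = {}
    for event in events:
        kind = event["type"]
        counts[kind] = counts.get(kind, 0) + 1
        yield dict(counts)  # snapshot, so later updates do not mutate it


def click_stream() -> Iterator[dict]:
    """Hypothetical source that emits events one by one."""
    for kind in ["page_view", "click", "page_view"]:
        yield {"type": kind}


snapshots = list(running_counts(click_stream()))
```

Each snapshot is available as soon as its event arrives, without waiting for a batch to fill.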


False Negative

A false negative result would indicate that “the change being tested has not improved the key metric significantly when in fact, the change generally has a positive impact on the underlying behavior.”

Feature Branch

A feature branch is a copy of the main codebase, where an individual or team of software developers works on a new feature until it is complete.

Feature Flags

Feature flags are a software development tool used to safely activate or deactivate features for testing in production, experimentation, and operations.

Feature Toggles

Feature toggles let developers “toggle” features on and off without releasing new code. They’re an alternative to feature branches with many use cases.






Kill Switch

With a kill switch, when a feature breaks in production, you can turn it off immediately while your team analyzes the issue.
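As a rough illustration, a kill switch can be as simple as a flag check that defaults to off. The sketch below uses a hypothetical in-memory FlagStore and a made-up "new-checkout" flag; a real system would typically read flags from a shared service so the change takes effect without a deploy.

```python
from typing import Dict


class FlagStore:
    """Hypothetical in-memory flag store used only for this example."""

    def __init__(self) -> None:
        self._flags: Dict[str, bool] = {}

    def enable(self, name: str) -> None:
        self._flags[name] = True

    def kill(self, name: str) -> None:
        """Kill switch: turn the feature off immediately."""
        self._flags[name] = False

    def is_enabled(self, name: str) -> bool:
        # Unknown flags default to off, so a missing entry is the safe path.
        return self._flags.get(name, False)


def checkout(flags: FlagStore) -> str:
    """Route to the new code path only while its flag is on."""
    if flags.is_enabled("new-checkout"):
        return "new checkout flow"
    return "legacy checkout flow"


flags = FlagStore()
flags.enable("new-checkout")
before = checkout(flags)
flags.kill("new-checkout")  # the feature broke in production
after = checkout(flags)
```

The legacy path stays in the code until the new feature has proven itself, which is what makes the switch safe to flip.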


Multi-Armed Bandit

A multi-armed bandit is a problem in which limited resources need to be allocated among multiple options whose benefits are not yet fully known.
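One simple strategy for this problem is epsilon-greedy: most of the time choose the option with the best observed reward rate, and occasionally pick one at random to keep learning about the others. The sketch below is one such strategy among many, and the reward rates, step count, and seed are made up for illustration.

```python
import random
from typing import List


def epsilon_greedy(true_rates: List[float], steps: int,
                   epsilon: float = 0.1, seed: int = 0) -> List[int]:
    """Simulate epsilon-greedy and return how often each arm was pulled."""
    rng = random.Random(seed)
    pulls = [0] * len(true_rates)
    rewards = [0.0] * len(true_rates)

    def estimate(arm: int) -> float:
        # Observed reward rate so far; arms never pulled count as 0.
        return rewards[arm] / pulls[arm] if pulls[arm] else 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(true_rates))  # explore
        else:
            arm = max(range(len(true_rates)), key=estimate)  # exploit
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        pulls[arm] += 1
        rewards[arm] += reward
    return pulls


# Hypothetical conversion rates for three variants.
pulls = epsilon_greedy([0.02, 0.05, 0.20], steps=10_000)
best_arm = pulls.index(max(pulls))
```

Unlike a fixed A/B split, traffic shifts toward the better-performing option while the experiment is still running.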



Observability

With the unknowns of our software’s failure modes, we want to be able to figure out what’s going on just by looking at the outputs: we want observability.


Power Analysis

Power analysis is the process of estimating how many users you will need in order to detect an effect of a given size, or how small an effect you can detect.
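For a rough sense of the numbers involved, the sketch below uses the standard normal-approximation formula for comparing two conversion rates with a two-sided test. The baseline rate, the lifts, and the defaults (5% significance, 80% power) are assumptions chosen for the example, and other formulas give slightly different answers.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_control: float, p_variant: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per group to detect p_control vs p_variant."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = p_variant - p_control
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical scenarios: a 10% baseline with a small and a larger lift.
n_small_lift = sample_size_per_group(0.10, 0.12)
n_large_lift = sample_size_per_group(0.10, 0.15)
```

Smaller effects require far more users, which is why the minimum effect you care about should be decided before the test starts.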



Simpson’s Paradox

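Simpson’s Paradox occurs when a trend that appears in several groups of data reverses or disappears once the groups are combined. When finding out if an admissions program is biased, a surgery is more successful, or an A/B test variant is superior, Simpson’s Paradox may come into play.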
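A small worked example shows how this can happen. The counts below are illustrative only: surgery A has the higher success rate among both mild and severe cases, yet surgery B looks better overall because it was used mostly on mild cases.

```python
# (successes, patients) per surgery and case severity; illustrative counts.
outcomes = {
    "A": {"mild": (81, 87), "severe": (192, 263)},
    "B": {"mild": (234, 270), "severe": (55, 80)},
}


def group_rate(surgery: str, group: str) -> float:
    """Success rate of one surgery within one severity group."""
    successes, patients = outcomes[surgery][group]
    return successes / patients


def overall_rate(surgery: str) -> float:
    """Success rate of one surgery with all groups pooled together."""
    successes = sum(s for s, _ in outcomes[surgery].values())
    patients = sum(p for _, p in outcomes[surgery].values())
    return successes / patients


a_wins_every_group = all(
    group_rate("A", g) > group_rate("B", g) for g in ("mild", "severe")
)
b_wins_overall = overall_rate("B") > overall_rate("A")
```

Both flags come out true at once, which is the paradox: the answer depends on whether you compare within groups or on the pooled totals.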



T-Test

A t-test is a type of hypothesis test which assumes the test statistic follows the t-distribution. It determines whether there is a statistically significant difference between two groups.
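As a sketch of the calculation behind a t-test, the code below computes the t statistic and degrees of freedom for Welch's variant, which does not assume equal variances in the two groups. The sample values are made up. Turning the statistic into a p-value needs the t-distribution, which the Python standard library does not provide, so that step is left out here.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, degrees of freedom) for Welch's t-test."""
    var_a = variance(a) / len(a)  # squared standard error of mean(a)
    var_b = variance(b) / len(b)
    t_stat = (mean(a) - mean(b)) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return t_stat, df


# Hypothetical measurements from two groups.
t_stat, df = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
```

The larger the absolute value of the t statistic relative to the t-distribution with that many degrees of freedom, the stronger the evidence of a difference between the groups.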

Type I Error

A type I error is a statistical error in which the test gives a false positive result: it rejects a true null hypothesis, when a perfect test would report a negative.
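A simulated A/A test is one way to see type I errors directly: both groups draw from the same conversion rate, so every "significant" result is a false positive. The sketch below assumes a two-proportion z-test and made-up parameters (500 runs, 300 users per group, a 10% rate); the observed share of false positives should land near the chosen significance level.

```python
import math
import random
from statistics import NormalDist


def two_proportion_p_value(x_a: int, n_a: int, x_b: int, n_b: int) -> float:
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (x_a + x_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (x_a / n_a - x_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def aa_false_positive_rate(runs: int = 500, users: int = 300,
                           rate: float = 0.10, alpha: float = 0.05,
                           seed: int = 1) -> float:
    """Share of simulated A/A tests that wrongly report a difference."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(runs):
        # Both groups use the same true rate, so the null hypothesis holds.
        x_a = sum(rng.random() < rate for _ in range(users))
        x_b = sum(rng.random() < rate for _ in range(users))
        if two_proportion_p_value(x_a, users, x_b, users) < alpha:
            false_positives += 1
    return false_positives / runs


observed = aa_false_positive_rate()
```

A false positive share far above the significance level in a real A/A test would suggest a problem in the testing process rather than in the product.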

Type II Error

A type II error is one of two statistical errors that can result from a hypothesis test. A type II error occurs when a false null hypothesis is not rejected, producing a false negative result.