A/A Testing
A/A testing involves running an A/B testing process with two identical versions to verify that the testing setup itself is in working order.
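As a quick sanity check, here is a minimal sketch (synthetic data, not any particular platform's implementation) of what a healthy A/A setup should show: with identical variants, roughly 5% of runs come out "significant" at alpha = 0.05 purely by chance.

```python
# Simulated A/A runs: both "variants" share the same 10% conversion rate,
# so any "significant" result is a false positive by construction.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, runs, n = 0.05, 1000, 500
false_positives = 0
for _ in range(runs):
    a = rng.binomial(1, 0.10, n)  # variant A conversions
    b = rng.binomial(1, 0.10, n)  # identical variant B conversions
    _, p = stats.ttest_ind(a, b)
    false_positives += p < alpha

print(f"significant A/A results: {false_positives / runs:.1%} (expect ~5%)")
```

A rate far from 5% suggests the pipeline, not the product, is broken.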
A/B test variants can be very similar, differing only in a button color, or very different, changing the way a feature behaves entirely.
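A minimal sketch of how an A/B result might be evaluated, using invented conversion counts and a two-proportion z-test from statsmodels:

```python
# Hypothetical A/B outcome: did the new button color lift conversion?
from statsmodels.stats.proportion import proportions_ztest

conversions = [480, 552]   # successes in variants A and B (made-up data)
visitors = [10000, 10000]  # users exposed to each variant
z_stat, p_value = proportions_ztest(conversions, visitors)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")  # p < 0.05 suggests a real lift
```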
A/B/n testing is the process of A/B testing with more than two versions. The “n” does not refer to a third test; it stands for any number of additional variants.
Blue/green deployment is a process that reduces downtime and risk by maintaining two clones of production, called blue and green.
A canary deployment, or canary release, is a deployment pattern that allows you to roll out new code/features to a subset of users as an initial test.
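One common implementation detail is deterministic bucketing: hash each user ID so the same user always lands in the same bucket, then route a fixed percentage to the canary. A sketch, with illustrative names and numbers:

```python
import hashlib

def in_canary(user_id: str, percent: int) -> bool:
    """Deterministically place ~percent% of users in the canary group."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return int(digest, 16) % 100 < percent  # stable bucket in [0, 100)

for uid in ["alice", "bob", "carol"]:
    print(uid, "-> canary" if in_canary(uid, 10) else "-> stable")
```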
A change advisory board is a group of representatives from different departments within a company who run formal change management processes.
Chaos engineering, also known as chaos testing, provides a method and toolset for deliberately introducing failures and outages into a system.
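As a toy, in-process illustration of fault injection (real chaos tools such as Chaos Monkey work at the infrastructure level, and the decorator here is purely hypothetical):

```python
import functools
import random

def chaos(failure_rate=0.1):
    """Randomly fail a fraction of calls to surface weak error handling."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            if random.random() < failure_rate:
                raise RuntimeError(f"chaos: injected failure in {fn.__name__}")
            return fn(*args, **kwargs)
        return inner
    return wrap

@chaos(failure_rate=0.2)
def fetch_profile(user_id):
    return {"id": user_id}
```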
CI/CD is the acronym in software development for the combination of continuous integration (CI) and continuous delivery (CD).
Client-side testing refers to any type of testing, commonly A/B, multivariate, or multi-armed bandit testing, that occurs in the user’s browser.
One way to avoid the issues caused by configuration drift is to use feature flags to test your code safely in production.
Continuous delivery is a software delivery process that allows developers to release software updates to the production environment and to end users at any time.
When implementing continuous delivery, shifting toward purpose-built continuous delivery tools is critical. Here, we discuss the most popular CD tools.
Continuous delivery and continuous deployment are two main terms for how features are deployed to production for testing, experimentation, and release.
Continuous deployment is the practice of automatically promoting code changes to production after they pass all automated tests in a continuous delivery pipeline.
Continuous integration happens when you automate the process of integrating your code into the master branch to ensure there are no conflicts.
A controlled rollout releases new features gradually, ensuring a good user experience before exposing them to larger groups.
Today, we’ll discuss some of the most commonly-used customer experience metrics and explain the benefits and drawbacks of each.
Every product manager dark launches (or should). In a dark launch, features are released to a subset of your users to enable testing before a wider deploy.
A data pipeline automates the flow of data from one point to another, defining how data is collected and in what schema it should be stored.
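A toy pipeline sketch (stage names and the target schema are invented for illustration): each stage defines how events move and what shape they take for the next stage:

```python
raw_events = [{"user": "alice", "event": "click", "ts": "2024-01-01T00:00:00Z"}]

def extract(events):
    yield from events  # in practice: read from an API, queue, or log

def transform(events):
    for e in events:
        yield {"user_id": e["user"], "action": e["event"]}  # normalized schema

def load(events, sink):
    sink.extend(events)  # in practice: write to a warehouse or data lake

warehouse = []
load(transform(extract(raw_events)), warehouse)
print(warehouse)
```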
Do-no-harm metrics are metrics teams monitor to ensure that a feature rollout is not harming key business or performance metrics.
An event stream is a continuous series of data points flowing into or out of a system, rather than arriving in batches, which makes event streams a natural source of telemetry.
False discovery rate (FDR) is a measure of accuracy when multiple hypotheses are tested at once, for example, when multiple metrics are measured in the same experiment.
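One standard way to control FDR across several metrics is the Benjamini-Hochberg procedure; a sketch with invented p-values, using statsmodels:

```python
from statsmodels.stats.multitest import multipletests

p_values = [0.001, 0.008, 0.039, 0.041, 0.20]  # one per metric (made up)
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
for p, p_adj, sig in zip(p_values, p_adjusted, reject):
    print(f"p={p:.3f}  adjusted={p_adj:.3f}  significant={sig}")
```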
A false negative result would indicate that “the change being tested has not improved the key metric significantly when in fact, the change generally has a positive impact on the underlying behavior.”
What is false positive rate, and how is it calculated? How does it compare to other measures of test accuracy, like sensitivity and specificity?
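For reference, the standard definitions, computed here from a hypothetical confusion matrix: false positive rate is FP / (FP + TN), sensitivity is TP / (TP + FN), and specificity is TN / (TN + FP), i.e., 1 - FPR.

```python
tp, fp, tn, fn = 80, 30, 870, 20  # invented counts

fpr = fp / (fp + tn)          # how often true negatives are wrongly flagged
sensitivity = tp / (tp + fn)  # how often true positives are caught
specificity = tn / (tn + fp)  # equals 1 - fpr
print(f"FPR={fpr:.1%}  sensitivity={sensitivity:.1%}  specificity={specificity:.1%}")
```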
A feature branch is a copy of the main codebase where an individual developer or a team works on a new feature until it is complete.
Ordinary software development is rife with guesswork. Introducing feature experimentation with Split’s feature-flag-based experimentation platform.
Feature flag management is the process of managing feature flags. Management systems can be either bought or built, but good ones have four key traits.
A feature flag is a software development tool used to safely activate or deactivate features for testing in production, experimentation, and operations.
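The canonical pattern is a conditional around the new code path. A minimal sketch against a hypothetical in-memory `flags` store, not any particular vendor's SDK:

```python
def new_checkout_flow(cart):
    return f"new flow for {len(cart)} items"

def legacy_checkout_flow(cart):
    return f"legacy flow for {len(cart)} items"

flags = {"new-checkout": True}  # in practice served by a flag management service

def checkout(cart):
    # Toggling the flag switches code paths at runtime, with no redeploy.
    if flags.get("new-checkout", False):
        return new_checkout_flow(cart)
    return legacy_checkout_flow(cart)

print(checkout(["book", "pen"]))
```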
A feature flag framework is a software development tool that allows individual features of a product to be enabled or disabled independently.
Feature flags have been instrumental in progressing the DevOps philosophy to achieve continuous delivery of new, high-quality features.
A feature rollout plan introduces a set of new features to your user base gradually, often starting with a limited group of users.
Feature toggles let developers “toggle” features on and off without releasing new code. They’re an alternative to feature branches with many use cases.
Guardrail metrics are business metrics designed to indirectly measure business value and provide alerts when a change is causing harm.
Hypothesis driven development is the scientific method applied to software development. It’s the same process we use to figure out if it’s raining.
Using Jira can enhance the feature release process even further, providing real-time feature status from ideation to code deployment.
With a kill switch, when a feature breaks in production, you can turn it off immediately while your team analyzes the issue.
A/B testing for mobile apps is about as similar to standard A/B testing as mobile app development is to standard software development.
A multi-armed bandit is a problem in which limited resources must be allocated among multiple options whose benefits are not yet fully known.
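Epsilon-greedy is one simple bandit strategy: mostly exploit the best-known arm, occasionally explore the others. The payout rates below are invented and hidden from the algorithm:

```python
import random

true_rates = [0.05, 0.11, 0.08]  # unknown-to-the-algorithm conversion rates
pulls, wins = [0] * 3, [0] * 3
epsilon = 0.1  # fraction of traffic spent exploring

for _ in range(10_000):
    if random.random() < epsilon:
        arm = random.randrange(3)  # explore a random arm
    else:  # exploit the best current estimate
        arm = max(range(3), key=lambda a: wins[a] / pulls[a] if pulls[a] else 0)
    pulls[arm] += 1
    wins[arm] += random.random() < true_rates[arm]

print([f"{w}/{p}" for w, p in zip(wins, pulls)])  # arm 1 should get most pulls
```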
Multivariate testing is a method of experimenting with variations of multiple elements in a feature to discover which combination of variations best drives user behavior.
With the unknowns of our software’s failure modes, we want to be able to figure out what’s going on just by looking at the outputs: we want observability.
Power analysis is the process of estimating how many users you will need in order to detect an effect of a given size, or how small an effect you can detect.
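A minimal sketch with statsmodels, using an arbitrary effect size and the conventional alpha and power targets:

```python
from statsmodels.stats.power import TTestIndPower

# How many users per group to detect a small effect (Cohen's d = 0.2)
# at alpha = 0.05 with 80% power?
n = TTestIndPower().solve_power(effect_size=0.2, alpha=0.05, power=0.8)
print(f"~{n:.0f} users per group")  # roughly 394 for these inputs
```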
Progressive delivery uses feature flags to increase speed and decrease deployment risk, and uses a gradual process for rollout and ownership.
Server-side testing refers to any type of testing, commonly A/B, multivariate, or multi-armed bandit testing, that occurs on the web server.
When finding out if an admissions program is biased, a surgery is more successful, or an A/B test variant is superior, Simpson’s Paradox may come into play.
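A worked example with invented counts: variant B wins within every segment yet loses in the pooled totals, because the two variants have very different segment mixes.

```python
segments = {
    # segment: ((A conversions, A users), (B conversions, B users))
    "mobile":  ((10, 100), (40, 300)),
    "desktop": ((120, 300), (45, 100)),
}

totals = {"A": [0, 0], "B": [0, 0]}
for segment, (a, b) in segments.items():
    print(f"{segment}: A={a[0] / a[1]:.1%}  B={b[0] / b[1]:.1%}")  # B wins both
    for name, (conv, users) in zip(("A", "B"), (a, b)):
        totals[name][0] += conv
        totals[name][1] += users

for name, (conv, users) in totals.items():
    print(f"overall {name}: {conv / users:.1%}")  # yet A wins when pooled
```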
Smoke testing is a type of regression test that ensures your most important, critical functional flows work as intended.
Getting a statistically significant result means the result likely reflects a legitimate, useful trend rather than random noise in the data.
A t-test is a type of hypothesis test that assumes the test statistic follows the t-distribution. It determines whether there is a statistically significant difference between two groups.
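A minimal sketch with scipy and synthetic session-length data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(5.0, 1.5, 200)  # e.g., minutes on page, variant A
group_b = rng.normal(5.3, 1.5, 200)  # variant B (slightly higher mean)
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```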
To test in production means continuing to test features after they have been deployed to production, rather than relying on staging environments alone.
Trunk-based development (TBD) is a branching model for software development where developers merge every new feature, bug fix, or other code change into one central branch, the trunk.
A type I error is a type of statistical error where the test gives a false positive result, when a perfect test would report a negative.
A type II error is one of two statistical errors that can result from a hypothesis test; it occurs when a test fails to reject a null hypothesis that is actually false.
A key part of the software development process, usability testing provides invaluable feedback on the user experience of a product.