Product experimentation is the most effective way to determine what works best for end-users. You can measure engagement with a feature and hypothesize ways to improve that engagement, which will ultimately impact one or more critical business metrics. If you’re setting up an experiment, it’s mission-critical to implement A/B testing correctly to ensure your results are accurate. We know that experimentation powers innovation and can drive profound business impact. So, are you A/B testing correctly?
What is A/B Testing?
A/B testing, otherwise known as split testing, is the process of testing two different versions of a web page or product feature against the original, existing version. As products evolve, new improvements and new features are needed to keep up with growing user demands. A/B tests allow companies and product teams to figure out precisely what will drive up their metrics, and ultimately their success.
Get Started A/B Testing
The first thing you need to do when starting an A/B test is to gather relevant data on your current features. What do you already know about their influence on your key business metrics? For example, you may have observed that users who share a report with other users are twice as likely to upgrade their plan over the next 90 days. You might also know that fewer than 20% of your users run a report in the first month of usage. Once you have this baseline data, you should be able to create a hypothesis about how you could improve outcomes by changing how customers interact with one of those features. For example, is there some way you can lead more users to run reports?
The existing functionality of your feature is called your control. The change you believe will improve user experience is called your treatment.
It’s essential to establish your baseline before you start your experiment because you need to know what to measure and to confirm your ability to measure it during your experiment. You also need to make sure that the two variants (the control and treatment) run simultaneously and are randomly distributed across your targeted population, to cancel out external influences on the outcome.
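In practice, a feature flag platform like Split does this random assignment for you, but the core idea is worth seeing. Below is a minimal sketch (the function name and experiment key are illustrative, not part of any SDK): hashing the user ID together with the experiment name gives each user a stable, effectively random bucket, so the same user always sees the same variant and the split is independent of any other experiment.

```python
import hashlib

def assign_variant(user_id: str, experiment: str,
                   variants=("control", "treatment")) -> str:
    """Deterministically assign a user to a variant.

    Hashing user_id together with the experiment name yields a stable,
    effectively random bucket: the same user always gets the same
    variant, and buckets are uncorrelated across experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# The same user always lands in the same bucket across sessions:
assert assign_variant("user-42", "report-cta") == assign_variant("user-42", "report-cta")
```

Because assignment is a pure function of the user and experiment, there is no per-user state to store, and running the control and treatment simultaneously over a randomized population falls out naturally.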
A/B Testing with Feature Flags
Now the question is: how do you segment your user base into two parts and release the feature (the experiment) to only a subset of those users? The answer: feature flags. Feature flags allow you to control how your users interact with your product. With Split, you can segment your users in the UI and then choose the experience they get in the front end. The developers then link that feature flag to their code. Once the feature is on for the specified population, data will flow into your metrics dashboard, where you can measure the impact of your changes. The more traffic you have, the more quickly you will start to see statistically significant results. After you have your results, you can implement the new version for all users if there's a significant positive change.
As described above, it's important to integrate the team's analytics platform with the feature flag management system so you can correlate users' behavior with the version of the feature they used. Once the test has been completed, you can go into the metrics dashboard, compare the metrics for your control and treatment, and decide whether the treatment performed better or worse than the control. If it performed better, you can confidently roll the feature out to the rest of your users, knowing that it is improving your metrics. If the treatment did not outperform the control, you can just as confidently keep the control in place for your entire user base.
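Platforms like Split compute significance for you in the dashboard, but the decision they automate can be sketched with a standard two-proportion z-test. This is an illustrative stand-in, not Split's actual statistics engine: given conversion counts for control and treatment, it returns a two-sided p-value you can compare against your significance threshold.

```python
from math import erf, sqrt

def conversion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-proportion z-test.

    conv_a / n_a: conversions and sample size for the control.
    conv_b / n_b: conversions and sample size for the treatment.
    Returns the two-sided p-value for the difference in conversion rate.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis of no difference
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# 10% -> 15% conversion over 1,000 users per variant is significant:
p = conversion_z_test(100, 1000, 150, 1000)
assert p < 0.05
```

A small p-value (commonly below 0.05) means the observed lift is unlikely to be random noise, which is the evidence you need before rolling the treatment out to everyone.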
Can You Test More Than One Thing at a Time with A/B Testing?
Let's say, for example, that you're hoping to improve the conversion rate on your website and want to test how a CTA change will impact that rate. You have three variants you want to test: the current CTA and two new options, rather than the usual two variants. This is called an A/B/n test. An A/B/n test allows more than one treatment in an experiment. In this case, you would have one control, the existing CTA, and two treatments, the two alternates you are testing. Nothing different needs to be done for the experiment setup, but keep in mind that the more variants you have, the longer you may have to run the experiment to get enough traffic to each variant for statistically significant results.
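The bucketing idea extends directly to A/B/n: instead of a 50/50 split, you hash the user into one of 100 buckets and carve those buckets up by weight. The sketch below is illustrative (the function and variant names are made up), assuming weights that sum to 100.

```python
import hashlib

def assign_weighted(user_id: str, experiment: str, variants) -> str:
    """Assign a user to one of n variants.

    variants: list of (name, weight) pairs whose weights sum to 100.
    The user's hash picks a bucket from 0-99; cumulative weights map
    buckets to variants, e.g. 0-33 -> control, 34-66 -> treatment_a, ...
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    cumulative = 0
    for name, weight in variants:
        cumulative += weight
        if bucket < cumulative:
            return name
    return variants[-1][0]  # guard against rounding in the weights

# One control and two treatments for the CTA experiment:
cta_test = [("control", 34), ("treatment_a", 33), ("treatment_b", 33)]
variant = assign_weighted("user-42", "cta-test", cta_test)
assert variant in ("control", "treatment_a", "treatment_b")
```

Note how each variant now receives only about a third of the traffic, which is exactly why A/B/n tests need to run longer to reach significance.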
There is also the option of testing different variables together. This is called multivariate testing. Let's say that in addition to the CTA, you also want to test how the headline of the homepage impacts your conversion rate. You would then have four variants: control CTA + control headline, control CTA + experimental headline, experimental CTA + control headline, and experimental CTA + experimental headline. When you are not under a time constraint and want to test combinations of changes, you can run a multivariate test. If, however, you do not have enough time to bring enough traffic to each variant for statistical significance, stick with A/B testing.
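The cells of a multivariate test are just the cross product of each variable's options. A quick sketch (variant names are illustrative) makes the traffic trade-off concrete:

```python
from itertools import product

ctas = ["control_cta", "experimental_cta"]
headlines = ["control_headline", "experimental_headline"]

# Every CTA/headline combination becomes one cell of the test
cells = [f"{cta} + {headline}" for cta, headline in product(ctas, headlines)]

# 2 CTAs x 2 headlines -> 4 cells, so each cell gets ~1/4 of the traffic
assert len(cells) == len(ctas) * len(headlines)
```

Adding a third variable with two options would double the cell count to eight, halving the traffic per cell again, which is why multivariate tests demand either high traffic or long run times.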
Run A/B Tests to Drive Business Impact
Running a successful A/B test can not only give your users the best experience, but it can also increase your revenue. When you import your company’s KPIs to your feature flag management system, you can easily correlate the data coming in from your experiments to the different KPIs and make business decisions that will better your company and increase your revenue. Feature flags make the process of A/B testing easier by giving you an intuitive UI to segment your user base and monitor the progress of the experiment as time goes on.
Learn More About A/B Testing and Feature Flags
When done correctly, A/B tests can greatly impact your business, increase your revenue, and provide a better user experience. To find out more about A/B testing and feature flags, check out the following posts.
- The Difference Between A/B Testing and Multivariate Testing
- Why You Need Feature Flags as a Service
- Know Your Why: Experimentation and Progressive Delivery at Walmart Grocery