Feature Flags Java Testing

Since I joined Split, it has been clear to me that feature testing is the fundamental building block of our software development process. We deploy unit tests to validate discrete parts of business logic, end-to-end tests to monitor our APIs, automated usability testing to maintain our user experience, and an entire network of bots to ensure the quality of service of our SDKs.

Our tests drive development, allowing engineers to think through all possible use cases and edge conditions. Tests provide documentation, showing new members of the team how a block of code is expected to work in a variety of situations. And tests drive stability, constantly seeking out regressions as more and more feature flags are added.

Feature testing, with control

In the modern software development lifecycle, high-quality feature testing is as important as ever, but it has become increasingly challenging to ensure your coverage is complete. While feature flag technology has allowed releases to be more stable than ever, poor implementations of that technology have brought about cultures that test purely in production, reliant on the ease of ramping and rollback to protect users from widespread catastrophe. Using a controlled rollout platform that is both feature rich and highly testable is a challenge, but here at Split we believe successful releases depend on having the utmost confidence in the features released to clients.

When most organizations start working with controlled rollout, they default all feature flag checks to the “off” or “control” state and then rely on unit testing the methods specific to the new feature to ensure it behaves as expected. However, as more and more features are enabled in production, there can be significant drift between the behavior you test and the behavior customers actually see.

To prevent that drift, both during testing and for engineers running their products locally, we have integrated the recently announced Localhost mode into the Split Software Development Kit. It uses a configuration file to set the state of each of your feature flags. Our customers often maintain such a configuration, mirroring the state of an environment, to run their tests against. It is even possible to set up your deployment architecture to run multivariate testing with multiple Off-The-Grid configurations, allowing a variety of states to be tested before deploying to users.
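
As a rough illustration of the idea of a file-driven flag state (not the SDK's actual file format or API), a flag-to-treatment file can be read with the standard `java.util.Properties` class. The `LocalFlags` class, its method names, and the `flag=treatment` layout are all hypothetical, chosen only for this sketch. The fallback to "control" for flags missing from the file mirrors the default treatment described in this post.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of the Split SDK.
final class LocalFlags {
    private final Properties treatments = new Properties();

    // Parses "flag=treatment" lines (an assumed layout, not Split's file format).
    static LocalFlags parse(String config) {
        LocalFlags flags = new LocalFlags();
        try {
            flags.treatments.load(new StringReader(config));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read flag configuration", e);
        }
        return flags;
    }

    // Flags that are missing from the configuration fall back to "control".
    String getTreatment(String feature) {
        return treatments.getProperty(feature, "control");
    }
}
```

A test environment could then keep one such file per scenario it wants to mirror, for example one matching production and one with every new feature enabled.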

Introducing Split’s new programmatic Java testing mode

Often, though, we will want to test the multiple states of a feature flag directly inside the unit test suite. Rather than pass in multiple configuration files, we prefer a method that allows a single file to test a variety of feature options for consistency, and also to directly test for the expected changes of behavior when a feature is enabled. To that end, we have released a new feature testing mode for our Java client. This testing mode allows you to programmatically change the treatments returned by the Split Client, so you can start a test by setting the specific features you wish to test regardless of their state in the Split console.

Examples: how Split testing is done

Java testing is best described in the context of a recent feature release here at Split. We know our success depends on ensuring our product is built for teams, and to that end we are releasing functionality that allows users to construct their own groups of users and use those teams as a shorthand throughout the product. Part of this project involves transitioning our existing model of handling Split Administrators to use this same groups infrastructure. However, a change to such a critical part of our architecture needs to be made with robust multivariate testing and a deliberate rollout.

A best practice we have established for these types of transitions is to use the techniques of dual writes and dark reads. Effectively, you start your rollout by writing the relevant data to both the original collection and the new data structure, while continuing to read from the original collection. Then, to validate your new logic, you start reading from both the original source and the new structure and compare the results. You still return the data coming from the original source (this second read happens “in the dark”) but are able to log any occasions on which the results differ. Once you confirm that the data is always consistent, you can turn the reads from the new structure “on”, and only once those reads are fully ramped do you transition your writes from “dual” to “off” and deprecate the old code. This process ensures a steady, consistent, and controlled rollout with minimal risk of exposure to customers.
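
The dual-write and dark-read steps above can be sketched with two in-memory maps standing in for the original collection and the new structure. This is a minimal sketch under stated assumptions, not our production code: `DualWriteStore` and its methods are hypothetical, treatments are passed in as plain strings rather than fetched from a Split client, and the mapping of "on" to "new structure only" and "off" to "original only" is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical store for illustration only.
final class DualWriteStore {
    private final Map<String, String> original = new HashMap<>();
    private final Map<String, String> replacement = new HashMap<>();
    private int mismatches = 0;

    // writes: "dual" = both stores, "on" = new structure only (assumed),
    // anything else (including "off" and "control") = original only.
    void put(String writes, String id, String value) {
        if (!"on".equals(writes)) {
            original.put(id, value);
        }
        if ("dual".equals(writes) || "on".equals(writes)) {
            replacement.put(id, value);
        }
    }

    // reads: "on" = new structure, "dark" = compare both but return the
    // original, anything else = original only.
    String get(String reads, String id) {
        if ("on".equals(reads)) {
            return replacement.get(id);
        }
        String fromOriginal = original.get(id);
        if ("dark".equals(reads) && !Objects.equals(fromOriginal, replacement.get(id))) {
            mismatches++; // a real system would log the discrepancy here
        }
        return fromOriginal;
    }

    // Number of dark reads whose results differed between the two stores.
    int mismatches() {
        return mismatches;
    }
}
```

Treating unknown treatments as "original only" keeps the old code path as the safe default, so a flag that returns "control" never routes traffic to the new structure.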

Testing this logic in the old world would be very risky, typically requiring careful monitoring of logs and dashboards, ramping and de-ramping of the relevant features, and additional deployments to fix bugs found along the way. Using the new Java testing mode, we can add extensive unit tests to confirm all permutations of these features behave as expected.

    public class ControllerTest {
        private final SplitClientForTest splitClient = new SplitClientForTest();

        @Before
        public void setup() {
            // Turn on dual writes for every test in this class
            splitClient.setTreatment("feature_writes", "dual");
        }

        @Test
        public void testOriginalDatabase() {
            Controller controller = new Controller(splitClient);
            // Create in both databases via dual writes
            Object expected = controller.create();
            // Read returns original logic
            Object result = controller.get(expected.id());
            assert result.equals(expected);
        }

        @Test
        public void testDarkReads() {
            // Turn on dark reads
            splitClient.setTreatment("feature_reads", "dark");
            Controller controller = new Controller(splitClient);
            // Create in both databases via dual writes
            Object expected = controller.create();
            // Read hits both, but returns original logic
            Object result = controller.get(expected.id());
            assert result.equals(expected);
        }

        @Test
        public void testNewDatabase() {
            // Turn on reads from the new structure
            splitClient.setTreatment("feature_reads", "on");
            Controller controller = new Controller(splitClient);
            // Create in both databases via dual writes
            Object expected = controller.create();
            // Read returns from new logic
            Object result = controller.get(expected.id());
            assert result.equals(expected);
        }
    }

This system allows you to validate multiple feature release states in a single unit test file, setting treatments both at the class level in the setup function and at the test level. However, this model does result in a lot of repeated code. Certainly you could move the business logic of the test into a separate function, but here at Split we have a better way.

Alongside the SplitClientForTest, we are also today releasing a JUnit Test Runner and a suite of Java Annotations to streamline testing of your application across a wide variety of configurations. The SplitTestRunner is built on the standard JUnit4 test runner and preserves all of the @Before, @After, and @Rule functionality you are likely using today. By using this runner in your test files and tagging your SplitClient with @SplitTestClient, you can annotate your tests with the features and treatments you wish to run them against, and Split’s Java testing will do the rest.

    @RunWith(SplitTestRunner.class)
    public class ControllerTest {
        @SplitTestClient
        private final SplitClientForTest splitClient = new SplitClientForTest();

        @Before
        public void setup() {
            // Turn on dual writes for every test in this class
            splitClient.setTreatment("feature_writes", "dual");
        }

        @Test
        @SplitTest(feature="feature_reads", treatments={"on", "dark", "off"})
        public void testReading() {
            Controller controller = new Controller(splitClient);
            Object expected = controller.create();
            Object result = controller.get(expected.id());
            assert result.equals(expected);
        }
    }

The SplitTestRunner executes each @SplitTest once for each treatment value, updating the SplitClient with the correct treatments before each run. Any unset features will return the “control” treatment, and any treatments manually set elsewhere will be preserved unless specifically overwritten by the SplitTest annotation.

In most cases, you are not testing a single feature’s rollout in isolation; ideally you want to test how a behavior interacts with a variety of other feature states. Therefore we have built the SplitScenario, which allows the developer to define a variety of feature states so that the test runs across all possible permutations. In the case at hand, that means testing both the read and write treatments simultaneously!

    @RunWith(SplitTestRunner.class)
    public class ControllerTest {
        @SplitTestClient
        private final SplitClientForTest splitClient = new SplitClientForTest();

        @Test
        @SplitScenario(features={
            @SplitTest(feature="feature_writes", treatments={"on", "dual", "off"}),
            @SplitTest(feature="feature_reads", treatments={"on", "dark", "off"})
        })
        public void testReading() {
            Controller controller = new Controller(splitClient);
            Object expected = controller.create();
            Object result = controller.get(expected.id());
            assert result.equals(expected);
        }
    }

Now, with less code than before, we are actually running the test nine times! Our goal is to make it easier to test a wide variety of complex interactions than it would be to simply test two different feature flag states under other frameworks.
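
To make the count concrete, the permutations of a scenario are the Cartesian product of each feature's treatments. The sketch below is a hypothetical helper (`TreatmentPermutations` is not part of the Split SDK, and this is not necessarily how SplitTestRunner is implemented) that enumerates every combination of one treatment per feature.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only.
final class TreatmentPermutations {
    // Returns every combination of one treatment per feature (a Cartesian product).
    static List<Map<String, String>> of(Map<String, List<String>> features) {
        List<Map<String, String>> combos = new ArrayList<>();
        combos.add(new LinkedHashMap<>()); // start from the single empty assignment
        for (Map.Entry<String, List<String>> feature : features.entrySet()) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> partial : combos) {
                for (String treatment : feature.getValue()) {
                    // Extend each partial assignment with one more feature
                    Map<String, String> extended = new LinkedHashMap<>(partial);
                    extended.put(feature.getKey(), treatment);
                    next.add(extended);
                }
            }
            combos = next;
        }
        return combos;
    }
}
```

With three treatments for each of two features, this produces 3 × 3 = 9 combinations. Each additional three-treatment feature triples the total, which is worth keeping in mind when deciding which scenarios to run locally.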

When running this test for the first time, you may see certain scenarios fail. With writes to the new structure disabled, reads served entirely from that structure would likely start failing, and so would that run of the test. The best practice I recommend is to identify these failure scenarios and code defensively against them, falling back to a sane default. These permutations are an excellent way to identify such cases. However, there could be scenarios where a bad combination of feature flags is inevitable, and we do support limiting your tests to only supported feature combinations with SplitSuites.

    @RunWith(SplitTestRunner.class)
    public class ControllerTest {
        @SplitTestClient
        private final SplitClientForTest splitClient = new SplitClientForTest();

        // Supported combinations: the read should return what was created
        @Test
        @SplitSuite(scenarios={
            @SplitScenario(features={
                @SplitTest(feature="feature_writes", treatments={"on"}),
                @SplitTest(feature="feature_reads", treatments={"on", "dark"})
            }),
            @SplitScenario(features={
                @SplitTest(feature="feature_writes", treatments={"dual"}),
                @SplitTest(feature="feature_reads", treatments={"on", "dark", "off"})
            }),
            @SplitScenario(features={
                @SplitTest(feature="feature_writes", treatments={"off"}),
                @SplitTest(feature="feature_reads", treatments={"dark", "off"})
            })
        })
        public void testReading() {
            Controller controller = new Controller(splitClient);
            // Create under the current write treatment
            Object expected = controller.create();
            // Read under the current read treatment
            Object result = controller.get(expected.id());
            assert result.equals(expected);
        }

        // Unsupported combinations: the read is expected to throw NotFound
        @Test(expected=NotFound.class)
        @SplitSuite(scenarios={
            @SplitScenario(features={
                @SplitTest(feature="feature_writes", treatments={"on"}),
                @SplitTest(feature="feature_reads", treatments={"off"})
            }),
            @SplitScenario(features={
                @SplitTest(feature="feature_writes", treatments={"off"}),
                @SplitTest(feature="feature_reads", treatments={"on"})
            })
        })
        public void testReadingUnsupportedCombinations() {
            Controller controller = new Controller(splitClient);
            Object expected = controller.create();
            Object result = controller.get(expected.id());
            assert result.equals(expected);
        }
    }

In this way you can specifically test for different behavior with different combinations of feature flags. Finally, the SplitTestClient annotation also works as a global level SplitSuite, letting you apply the same set of feature permutations across all tests run in the class (unless specifically overridden at the test level). This configuration even carries over through inheritance, allowing you to define a BaseTest class with a full suite of default feature treatments, and then have your Test files inherit from that BaseTest and further customize accordingly. We do exactly that here at Split, with a BaseTest which both defines our default feature configuration and sets up our Guice injection. That allows our test class in this case to focus specifically on the logic that we care to test:

    @RunWith(SplitTestRunner.class)
    public class ControllerTest extends BaseTest {
        @SplitTestClient(scenarios={
            @SplitScenario(features={
                @SplitTest(feature="feature_writes", treatments={"on", "dual", "off"}),
                @SplitTest(feature="feature_reads", treatments={"on", "dark", "off"})
            })
        })
        private SplitClientForTest splitClient;

        @Before
        public void setup() {
            super.setup();
            // The injector is set up by BaseTest
            splitClient = injector.getInstance(SplitClientForTest.class);
        }

        @Test
        public void testReading() {
            Controller controller = new Controller(splitClient);
            Object expected = controller.create();
            Object result = controller.get(expected.id());
            assert result.equals(expected);
        }

        @Test
        public void testDeleting() {
            Controller controller = new Controller(splitClient);
            Object expected = controller.create();
            boolean success = controller.delete(expected.id());
            assert success;
        }
    }

Summary

Tests with high numbers of permutations can be run as part of integration or release testing to reduce the runtime of local tests. The Java testing client and annotations suite are available now, and further documentation can be found in the sample tests in that repository. In coming releases we will be rolling out similar feature testing clients for all of our SDKs, and will be implementing similar annotations and decorators based on language support.