Kicking off an internally developed feature flagging system is standard practice for many companies today, as feature flags help engineering teams release faster with lower risk. The initial use case might be to support a move to continuous integration, which provides rapid feedback on how well new code integrates with the main branch (trunk). Engineers could implement a feature flag by reading from a config file, for example, to control which code path gets exposed to a subset of internal users. This simple on/off switch probably works fine with a small number of developers and proves out the concept of feature flagging without much difficulty.
There’s no doubt that feature flags grow in importance once engineering and product teams begin to recognize the benefits. By separating code deployment from feature release, feature flags transform from a niche project to business-critical functionality.
Dev-driven use cases for feature flags focus on releasing faster with lower risk, including continuous integration, controlled rollouts, and testing in production. Product-driven use cases for feature flags focus on managing and understanding the user experience, including establishing beta programs and paywalls and conducting A/B/n testing.
With a feature flagging system in place, use cases often evolve to increase business value through controlled experiments, where teams measure the impact of new functionality against key business metrics such as product conversions, engagement, growth, subscription rates, or revenue.
As use cases move from dev-driven to business-driven, the challenges of developing an in-house feature flagging system expand as fast as, if not faster than, the list of requirements. The scope of the in-house application grows, and with it the effort needed to build and maintain it. Can an in-house feature flag solution really scale to meet the needs of the business? Below are the top 10 challenges we see customers run up against when developing an in-house feature flagging system:
Challenge #1: Manual config changes
In-house feature flag solutions typically rely on a file or database for turning features on or off, so an engineer must make a config change for every feature release. When the flagging implementation is database-backed and a feature is ready for rollout, a column holding a Boolean on/off value is added to the database. Any change of this kind is inherently risky and comes with the potential for error.
Also, config changes often go through the same deploy process as standard code, which may take hours or even days to reach production. That means any config change to flags will take just as long. Finally, manual config changes mean product managers are unable to make their own changes and are therefore dependent on engineers for every single change.
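To make the config-file approach concrete, here is a minimal sketch of the kind of on/off switch described above. The file name `flags.json`, the flag name `new_checkout`, and the helper functions are all hypothetical, chosen only for illustration; a real in-house system may look quite different.

```python
import json
import tempfile
from pathlib import Path


def load_flags(path: Path) -> dict[str, bool]:
    """Read on/off flags from a JSON config file (hypothetical format)."""
    with path.open(encoding="utf-8") as fh:
        raw = json.load(fh)
    # Coerce to bool so a stray 0/1 in the file still behaves as a switch.
    return {name: bool(value) for name, value in raw.items()}


def is_enabled(flags: dict[str, bool], name: str) -> bool:
    """Unknown flags default to off, the safer choice for unreleased code."""
    return flags.get(name, False)


def checkout_page(flags: dict[str, bool]) -> str:
    # The flag selects which code path runs; both paths ship in the same deploy.
    if is_enabled(flags, "new_checkout"):
        return "new checkout"
    return "old checkout"


# Demo: write a config file to a temp directory, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "flags.json"
    config_path.write_text(json.dumps({"new_checkout": False}), encoding="utf-8")
    before = checkout_page(load_flags(config_path))

    # "Releasing" the feature means editing the file by hand and reloading it.
    config_path.write_text(json.dumps({"new_checkout": True}), encoding="utf-8")
    after = checkout_page(load_flags(config_path))

print(before, "->", after)
```

Note that the release step is an edit to a file. If that file travels through the normal deploy pipeline, the flag flips only as fast as the pipeline runs.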
Challenge #2: Manual compilation of target segments
With this basic implementation, the feature will be either on or off for all customers. When conducting a controlled rollout, compiling the list of customer IDs for each feature exposure will be a manual process, which requires an engineer to go to an admin page and turn the feature on for each user.
Expanding the customer segment for a phased rollout will require another manual compilation of customer IDs, and so on until 100% of the customer base is reached. Keeping track of the treatment served to each user can quickly get out of hand. Also, a customer base is not going to be static: customers will be continually added and removed over time. Keeping up with who is entering and who is leaving the customer base will also have to be done by hand. Somehow.
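The bookkeeping described above can be sketched as a hand-maintained allowlist. The customer IDs, the flag name, and the helper functions below are hypothetical; the point is only to show how much of the work falls on whoever maintains the lists.

```python
# Hypothetical hand-maintained allowlist: flag name -> customer IDs.
allowlists: dict[str, set[str]] = {
    "new_checkout": {"cust-001", "cust-002"},
}


def is_enabled_for(flag: str, customer_id: str) -> bool:
    """A customer sees the feature only if someone added their ID by hand."""
    return customer_id in allowlists.get(flag, set())


def expand_rollout(flag: str, new_ids: set[str]) -> None:
    """Each rollout phase means compiling and merging another list of IDs."""
    allowlists.setdefault(flag, set()).update(new_ids)


def remove_churned(flag: str, active_ids: set[str]) -> set[str]:
    """Drop IDs no longer in the customer base; return what was removed."""
    current = allowlists.get(flag, set())
    stale = current - active_ids
    allowlists[flag] = current & active_ids
    return stale


# Phase 2 of the rollout: someone has to compile the next batch of IDs.
expand_rollout("new_checkout", {"cust-003"})

# The customer base changed: cust-001 left and cust-004 joined.
removed = remove_churned("new_checkout", {"cust-002", "cust-003", "cust-004"})

print(sorted(allowlists["new_checkout"]), sorted(removed))
```

Nothing in this sketch adds the new customer `cust-004` to the rollout automatically; that is one more manual step for every phase and every change in the customer base.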
Challenge #3: Problematic customer support
Finding out which customer has which feature turned on or off becomes a real issue when using feature flags. At best the data is stored in a plain text file with UUIDs, but these aren’t readable by product management or the customer support team. Worse, the data may be hard-coded as environment variables that support staff cannot access at all. Not knowing what experience a customer has received makes customer support more troublesome. Anything that causes a problem for customers or increases support calls should be immediately turned off, but this will take precious time without rapid access to the user’s treatment.
Challenge #4: Technical debt
Many of the companies we’ve worked with come to us with disjointed solutions across their dev teams and no centralized source of truth for “what is flagged”. Imagine each microservice dev team creating its own unique feature flagging system. As a result, monitoring and cleanup of old feature flags become increasingly difficult as teams generate more and more flags. Old flags can reduce code readability or, worse, result in accidental misconfiguration. Even with a sourced feature flagging system in place, technical debt can be an issue (we recommend several best practices for managing feature flag debt).
Challenge #5: Lack of documentation
Identifying the owner of a specific feature flag and documenting information such as what the flag does and why it was created can be difficult if an in-house feature flagging solution doesn’t track this data. Employee turnover or simply the passage of time can leave teams having to re-establish the original purpose of a flag to determine whether it’s still needed, or else risk keeping it in the code and forgetting it altogether.
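As a rough illustration of the data worth tracking, the sketch below attaches an owner, a description, and a creation date to each flag and lists flags old enough to deserve a review. The `FlagRecord` fields, the example owners, and the 90-day threshold are assumptions made for this example, not a recommended standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class FlagRecord:
    """Hypothetical metadata kept alongside each flag."""
    name: str
    owner: str
    description: str
    created: date


def flags_due_for_review(
    records: list[FlagRecord], today: date, max_age_days: int = 90
) -> list[str]:
    """Return names of flags created more than max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    return [r.name for r in records if r.created < cutoff]


records = [
    FlagRecord("new_checkout", "payments-team", "Gate the redesigned checkout", date(2024, 1, 10)),
    FlagRecord("beta_search", "search-team", "Beta program for new search", date(2024, 5, 1)),
]

due = flags_due_for_review(records, today=date(2024, 6, 1))
print(due)
```

With an owner and a description on record, the question "is this flag still needed?" can go to a named team instead of requiring someone to reverse-engineer the code.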
Challenge #6: Incomplete open source options
Challenge #7: Lack of a UI
The lack of a UI ties the product manager to an engineer for every unique feature request. Product managers will need to work directly with a developer on every feature rollout, which causes significant coordination overhead. Also, product managers have limited visibility into what treatments are served and to whom. Keeping track of which treatment a customer receives is critical during a rollout, especially for those customers who may be considered high risk or high touch.
Everyone on the delivery team needs access to metrics when measuring the impact of the release: “Should we keep releasing or stop because there is a problem?” “Are users responding well to the new functionality?” To make informed decisions while using feature flags, all stakeholders need a centralized dashboard to view the feature-level impact on application performance and business metrics.
Challenge #8: Projects get orphaned
An in-house feature flagging system is not the core expertise of the company, and the work in progress can easily be deprioritized in favor of jobs that more clearly reflect business-critical requirements. Or the one engineer who spent time on the project leaves or is reassigned. Typically, with little to no documentation (see Challenge #5), there isn’t the institutional knowledge needed to keep the project going.
Challenge #9: Application performance takes a hit
Without careful design of the framework, an in-house feature flagging system may cause a reduction in application performance. If there is a remote API call for every flag computation or decision, this could have a significant impact on application performance as the number of feature flags and treatments grows.
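One common way to avoid a remote call per evaluation is to fetch all flags at once and answer from memory until the copy expires. The sketch below is an assumption about how that might look, not a description of any particular product; `CachedFlagClient`, `fake_remote_fetch`, and the 30-second TTL are hypothetical, and the remote call is simulated with a counter.

```python
import time
from typing import Callable


class CachedFlagClient:
    """Evaluate flags from a local copy, refreshed at most once per TTL."""

    def __init__(
        self,
        fetch_all: Callable[[], dict[str, bool]],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_all = fetch_all
        self._ttl = ttl_seconds
        self._clock = clock
        self._flags: dict[str, bool] = {}
        self._expires_at = float("-inf")  # forces a fetch on first use

    def is_enabled(self, name: str) -> bool:
        now = self._clock()
        if now >= self._expires_at:
            # Only this branch pays the cost of the remote round trip.
            self._flags = self._fetch_all()
            self._expires_at = now + self._ttl
        return self._flags.get(name, False)


remote_calls = {"count": 0}


def fake_remote_fetch() -> dict[str, bool]:
    """Stand-in for a network call to a flag service."""
    remote_calls["count"] += 1
    return {"new_checkout": True}


client = CachedFlagClient(fake_remote_fetch, ttl_seconds=30.0)
results = [client.is_enabled("new_checkout") for _ in range(1000)]
print(remote_calls["count"], "remote call(s) for", len(results), "evaluations")
```

The trade-off is staleness: a flag change can take up to one TTL to reach the application, so the TTL has to be weighed against how quickly a bad release must be switched off.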
Challenge #10: No controlled access or audit history
What was initially a simple on/off switch has grown into a significant application. This can become a burden on the organization’s internal resources, which now have to monitor and maintain the app over time. For organizations to manage the feature flagging solution, there should be change and access control so dev teams don’t have to worry about someone making a change they shouldn’t. You’ll also need audit logging and SAML sign-on.
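A minimal sketch of what change control and audit logging involve, assuming a simple set of permitted editors; `FlagStore`, `AuditEntry`, and the user names are hypothetical, and a production system would delegate identity to its sign-on provider instead of trusting a user name passed in as a string.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One recorded flag change: who changed what, and when."""
    timestamp: datetime
    user: str
    flag: str
    old_value: bool
    new_value: bool


@dataclass
class FlagStore:
    editors: set[str]  # users allowed to change flags
    flags: dict[str, bool] = field(default_factory=dict)
    audit_log: list[AuditEntry] = field(default_factory=list)

    def set_flag(self, user: str, flag: str, value: bool) -> None:
        # Access control: reject the change before touching any state.
        if user not in self.editors:
            raise PermissionError(f"{user} may not change {flag}")
        old = self.flags.get(flag, False)
        self.flags[flag] = value
        # Audit history: every accepted change leaves a record.
        self.audit_log.append(
            AuditEntry(datetime.now(timezone.utc), user, flag, old, value)
        )


store = FlagStore(editors={"alice"})
store.set_flag("alice", "new_checkout", True)

try:
    store.set_flag("bob", "new_checkout", False)
    denied = False
except PermissionError:
    denied = True

print(store.flags, len(store.audit_log), denied)
```

Even this small version has to be maintained, secured, and extended as roles and environments multiply, which is the burden this challenge describes.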