
Graceful Degradation: Building Planned Failure Into Your App

I find that the best writing on software architecture often comes from people who are not software architects and who may never have written a single line of code. That is because the goal of good software is to fit the customer's needs so well that the software becomes invisible (a quote repurposed from Donald Norman). We software engineers share that goal with designers and marketers, whose perspective is just as relevant to software design.

Seth Godin’s article on Graceful Degradation is a great example of such writing. He says:

Most failures aren’t shocking surprises. The law of large numbers is too strong for that. Instead, they are predictable events that smart designers plan for, instead of wishing them away as rare unpredictable accidents.

Failure is not the exception in software; it is the rule. That is why ‘Graceful Degradation’ is such a key concept in software architecture.

Graceful degradation in cloud software is a wide-ranging topic encompassing people, code, and infrastructure. I will highlight four ideas that are critical to it:

  • Service-Oriented Architecture (or its modern variation, microservices): Independent services let software architects localize a failure to a single service, so that a failure in non-critical functionality does not disrupt critical functionality.
  • Elastic Hardware: Sudden spikes in traffic are a significant source of failures in cloud systems. Hardware that can be spun up on demand goes a long way toward handling this failure scenario.
  • Fault-Tolerant Communication: If a service stops meeting its SLA, the services that call it should taper their calls and fall back to a backup behavior. This prevents failures from cascading. A good example is Netflix’s Hystrix.
  • Controlled Rollouts: Every new feature should be rolled out to a subset of traffic and gradually ramped up to all customers. Throughout the rollout, its impact on key performance and business metrics should be measured. If a metric degrades, the feature should be ramped down. That way, problems can be ironed out without putting the experience of every customer at risk. A good example of controlled rollouts is Split.
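
To make the last two ideas concrete, here is a minimal sketch in Python. It is not Hystrix's or Split's actual API; the names `CircuitBreaker` and `in_rollout`, and the parameters `max_failures`, `reset_after`, and `percent`, are hypothetical and chosen only for illustration. The circuit breaker stops calling a dependency after a few consecutive failures and serves a fallback until a cool-down has passed. The rollout helper hashes a user into a stable bucket so that a feature can be ramped up or down by changing one percentage.

```python
import hashlib
import time


class CircuitBreaker:
    """Illustrative only: skips a failing dependency and serves a fallback."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, primary, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()         # open: do not touch the dependency
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        self.failures = 0
        self.opened_at = None             # a success closes the circuit
        return result


def in_rollout(user_id, feature, percent):
    """Deterministically decide whether a user is inside a percentage rollout."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100        # stable bucket in the range 0..99
    return bucket < percent
```

A caller might wrap a non-critical request as `breaker.call(fetch_recommendations, lambda: [])`, where `fetch_recommendations` is a placeholder for any remote call, so that an outage degrades to an empty list instead of an error. Because each user's bucket is fixed, raising `percent` from 10 to 20 keeps the first 10% of users in the feature, and lowering it ramps the feature down without redeploying code.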

How do you ensure Graceful Degradation in your systems? Have you come across non-engineers whose writings have influenced your thinking about software? Drop us a line!
