I find that the best writing on software architecture is often by people who are not software architects. In fact, they may never have written a single line of code. This is because the goal of good software is to fit the needs of the customer so well that the software is invisible (quote repurposed from Donald Norman). And that is a goal we, software engineers, share with designers and marketers whose perspective is also relevant to the design of software.
Seth Godin’s article on Graceful Degradation is a great example of such writing. He says:
Most failures aren’t shocking surprises. The law of large numbers is too strong for that. Instead, they are predictable events that smart designers plan for, instead of wishing them away as rare unpredictable accidents.
Failure is not an exception in software, it is the rule. That is why ‘Graceful Degradation’ is such a key concept in software architecture.
Graceful degradation in cloud software is a wide ranging topic encompassing people, code, and infrastructure. I will highlight four ideas that are critical to graceful degradation:
- Service Oriented Architecture: or its modern variation of microservices. Independent services allow software architects to localize failure to a single service, thus, preventing failure in non-critical functionality from disrupting critical functionality.
- Elastic Hardware: Sudden spikes in traffic are a significant source of failures in cloud systems. Hardware that can spin up on demand goes a long way in solving this failure scenario.
- Fault Tolerant Communication: If a service stops meeting its SLA, calling services should taper calls to it and resort to a backup behavior. This prevents failures from cascading. A good example is Netflix’s Hystrix
- Controlled Rollouts: Every new feature should be rolled out to a subset of traffic and gradually ramped to all customers. Through this rollout, its impact should be measured on key performance and business metrics. If a metric degrades, the feature should be ramped down. Thus, problems can be ironed out without risking global customer experience. A good example of controlled rollouts is Split.
How do you ensure Graceful Degradation in your systems? Have you come across non-engineers whose writings have influenced your thinking about software? Drop us a line!
Stay up to date
Don’t miss out! Subscribe to our digest to get the latest about feature flags, continuous delivery, experimentation, and more.
Testing in production is becoming more and more common across tech. The most significant benefit is knowing that your features work in production before your users have access. With feature flags, you can safely deploy your code to production targeting only internal teammates, test the functionality, validate design and performance,…
The same tools and practices that make modern software delivery sustainable and efficient are what our organizations need to achieve greater adaptability.
Today we remember the release night. We gather (virtually) today not to mourn the death of the release night, but to celebrate the inception of its replacement. Release nights were once a standard for software development teams to have a set date to release their software to their customers. In…