The traditional IT reliability strategy is robustness: optimise for MTBF (mean time between failures) by maintaining a failure-free production environment. This assumes failures are preventable, and depends on risk management theatre such as end-to-end testing. It leaves organisations trapped in discontinuous delivery, and in such circumstances a continuous delivery programme focused on throughput is unlikely to succeed.
A more effective reliability strategy in most scenarios is resilience: optimise for MTTR (mean time to repair) by rapidly responding to production failures. This assumes failures are inevitable, and depends on operability practices that increase the adaptive capacity of teams and services.
Resilience is also a great enabler of continuous delivery. A continuous delivery programme focused on resilience will increase stakeholder confidence, and lay the groundwork for challenging the risk management theatre of robustness in order to increase throughput.
During this talk, Steve Smith will explain why discontinuous delivery is part of the tradition of optimising for MTBF, and how optimising for MTTR can power continuous delivery adoption. This is an overview of a new approach to continuous delivery, backed by examples from private and public sector organisations.
The key learnings for participants are:
1. optimising for MTBF is an antiquated, flawed approach to IT reliability that results in long-term discontinuous delivery
2. if an organisation has optimised for MTBF, a continuous delivery programme focused on throughput is likely to fail
3. optimising for MTTR is a superior reliability strategy that advocates graceful extensibility to limit the impact of failures
4. "resilience as a continuous delivery enabler" is a heuristic that makes resilience, rather than throughput, the focus of a continuous delivery programme
5. improving the resilience of services by an order of magnitude makes it easier to offer practical alternatives to robustness risk management theatre
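
To make the MTBF versus MTTR trade-off concrete, here is a minimal sketch using the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The formula is general reliability arithmetic rather than something taken from the talk, and every figure below is a hypothetical assumption chosen only for illustration.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time a service is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical baseline: one failure every 500 hours, 5 hours to repair.
baseline = availability(mtbf_hours=500.0, mttr_hours=5.0)

# Robustness: double the time between failures, repair time unchanged.
longer_mtbf = availability(mtbf_hours=1000.0, mttr_hours=5.0)

# Resilience: same failure rate, repair an order of magnitude faster.
faster_repair = availability(mtbf_hours=500.0, mttr_hours=0.5)

for label, value in [
    ("baseline", baseline),
    ("longer MTBF", longer_mtbf),
    ("faster repair", faster_repair),
]:
    print(f"{label}: {value:.4%}")
```

Under these assumed numbers, cutting repair time by an order of magnitude yields higher availability than doubling the time between failures, which is the intuition behind the fifth takeaway.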