it becomes sensible to force failure and observe how our system responds to it as a planned exercise. It’s better to have a system fail for the first time while an entire team is watching and ready to take action, than at 3 a.m. when a system alert wakes up an on-call engineer.