Video will be available on the SREcon program page: https://www.usenix.org/conference/srecon15/program/presentation/barth
Surviving a large-scale outage requires more than just standing up a few extra servers. Validation and capacity planning can mean the difference between proper mitigation and a bunch of wasted effort. This talk will explore how to ensure disaster recovery (DR) success, drawing on lessons gleaned from PagerDuty's production systems.