the daily start-of-day mood and cortisol awakening response.
Extended work availability and its relation with start-of-day mood and cortisol. J Occup Health Psychol. 2016 Jan;21(1):105-18. doi: 10.1037/a0039602. Epub 2015 Aug 3.
bring the incident response team together.
5:30pm 5:50pm 6:15pm
The on-call engineer adds me as a responder, so I get paged.
6:00pm
The alerting team's on-call engineer receives an alert.
the problem. After a lot of debugging, we found a bug. We sent the fix, but there were still inconsistencies.
6:40pm 8:00pm 9:00pm
We ran the data sync job and brought the app back to a healthy state.
2:00am
three-step escalation paths for different priorities
Arrange development duties during on call based on your pager load
Assign on-call temporarily to the engineer making the deployment
Heroism is not sustainable
Even Iron Man needs backup
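As a rough illustration of the multi-step escalation paths per priority mentioned above, here is a minimal Python sketch. The priority labels, responder roles, and wait times are all hypothetical and not taken from any particular paging tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationStep:
    responder: str      # who gets paged at this step (hypothetical role name)
    wait_minutes: int   # minutes the alert stays unacknowledged before this step pages


# Hypothetical three-step paths: higher priorities escalate faster.
ESCALATION_PATHS = {
    "P1": [
        EscalationStep("primary on-call", 0),
        EscalationStep("secondary on-call", 5),
        EscalationStep("engineering manager", 15),
    ],
    "P2": [
        EscalationStep("primary on-call", 0),
        EscalationStep("secondary on-call", 15),
        EscalationStep("engineering manager", 45),
    ],
    "P3": [
        EscalationStep("primary on-call", 0),
        EscalationStep("secondary on-call", 60),
        EscalationStep("engineering manager", 240),
    ],
}


def who_is_paged(priority: str, minutes_unacknowledged: int) -> list[str]:
    """Return every responder paged so far for an unacknowledged alert."""
    return [
        step.responder
        for step in ESCALATION_PATHS[priority]
        if minutes_unacknowledged >= step.wait_minutes
    ]
```

For example, `who_is_paged("P1", 6)` includes the secondary on-call, while `who_is_paged("P3", 6)` still pages only the primary. The point is that nobody is the last line of defense: every path has a backup.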
on-call engineers
In 2010, a Massachusetts hospital patient died after alarms signaling a critical event went unnoticed by 10 nurses. Patient safety officials reported many deaths linked to alarms that malfunctioned or were turned off, ignored, or unheard.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4926996/
better than a lot of alerts.
Make the difference clear: tickets vs alerts
Add context to alerts: runbooks, details, logs, one-click remediation actions
Identify and review repeating alerts
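One simple way to identify repeating alerts is to count how often each alert fired over a review period. The sketch below assumes you can export alert names as a plain list; the function name, the threshold, and the sample alert names are hypothetical.

```python
from collections import Counter


def repeating_alerts(alert_names, threshold=3):
    """Return (alert name, count) pairs for alerts fired at least `threshold` times.

    Results are ordered from most to least frequent.
    """
    counts = Counter(alert_names)
    return [(name, n) for name, n in counts.most_common() if n >= threshold]


# Hypothetical export of one week of pages.
week_of_pages = [
    "disk_full",
    "high_latency",
    "disk_full",
    "disk_full",
    "cert_expiry",
    "high_latency",
]

for name, count in repeating_alerts(week_of_pages, threshold=2):
    print(f"{name}: fired {count} times, review or tune this alert")
```

Alerts that show up in this list week after week are candidates for a permanent fix, a tuned threshold, or a downgrade to a ticket.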
up
Make information accessible to everyone
Create open on-call policies
Are engineers expected to be on call at night?
If they are on call at night, can they work from home the next day or start later than usual?
Are engineers expected to do development work while on call?
What is the maximum number of times per month an engineer would be on call?
escalation paths
Be open and share knowledge
Create a blameless culture, not just postmortems
Embrace effective alerting practices
Practice incident response
Compensate on call
KEY TAKEAWAYS