Warning: This Talk Contains Content Known to the State of California to Reduce Alert Fatigue

Aditya Mukerjee
Observability Engineer at Stripe
@chimeracoder

Why we can learn from clinical healthcare
• Direct personal contact
• Visibly high-stakes
• Systems which are difficult to control

Alert Fatigue and Decision Fatigue

Alert Fatigue
When the frequency or severity of alerts causes the responder either to ignore important alerts or make mistakes more frequently.

Decision Fatigue
When the frequency or complexity of decision points causes a person to avoid decisions or make mistakes more frequently.

Alert Fatigue deals with the observability of systems
Decision Fatigue deals with the controllability of systems

72-99% of clinical alarms are false positives
…but certain patterns of alerts and decisions contribute disproportionately to fatigue!

Four Steps to Reducing Alert Fatigue: STAT
(Supported, Trustworthy, Actionable, Triaged)
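As a rough illustration of using STAT as a quick check, the sketch below records a yes/no answer for each of the four properties and lists the ones an alert fails. The `StatCheck` class and its field names are hypothetical, made up for this example; they do not come from the talk.

```python
from dataclasses import dataclass, fields


@dataclass
class StatCheck:
    """Hypothetical record of a STAT review for one alert (names are illustrative)."""
    supported: bool    # someone owns the monitor and can change it
    trustworthy: bool  # fires on real problems, stays silent otherwise
    actionable: bool   # at most one decision is needed to respond
    triaged: bool      # alert type reflects current urgency

    def failures(self) -> list[str]:
        """Return the names of the STAT properties this alert does not meet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


review = StatCheck(supported=True, trustworthy=False, actionable=True, triaged=False)
print(review.failures())  # ['trustworthy', 'triaged']
```

An alert with an empty failure list passes the quick check; anything else points at the property to work on first.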

Supported
• Who owns this monitor?
• Who has the right or authority to change it?

An alerting system includes the people who participate in responding to alerts, not just the software that generates alerts

The person responding to an alert always has the right to change it, whether we realize it or not

Responders must feel ownership over the end result

Trustworthy
• Do I trust this alert to notify me when a problem happens?
• Do I trust this alert to stay silent when all is well?
• Do I trust this alert to give me sufficient information to diagnose problems?
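One possible way to put numbers on the first two questions, assuming you keep a labeled history of whether each alert firing matched a real problem (an assumption; the talk does not prescribe a method), is to compute recall and precision. Recall speaks to "does it notify me when a problem happens?" and precision to "does it stay silent when all is well?". The function name and the sample history are hypothetical.

```python
def alert_trust_metrics(history):
    """Compute (precision, recall) from (alert_fired, real_problem) pairs."""
    true_pos = sum(1 for fired, problem in history if fired and problem)
    false_pos = sum(1 for fired, problem in history if fired and not problem)
    false_neg = sum(1 for fired, problem in history if not fired and problem)
    fired_total = true_pos + false_pos
    problem_total = true_pos + false_neg
    # None means "not enough data" rather than a misleading 0 or 1.
    precision = true_pos / fired_total if fired_total else None
    recall = true_pos / problem_total if problem_total else None
    return precision, recall


# Made-up history: each tuple is (alert_fired, real_problem).
history = [
    (True, True), (True, False), (True, False),
    (True, False), (False, True), (False, False),
]
precision, recall = alert_trust_metrics(history)
print(precision, recall)  # 0.25 0.5
```

Neither number covers the third question, whether the alert carries enough information to diagnose the problem; that still needs a human review.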

Anomaly detection and opaque algorithms
If you don’t understand why an alert is firing, you don’t understand whether it’s real or not

When to use modeling for monitors
• Does the model represent the interconnectedness of your systems?
• Can the thresholds be adjusted?
• Are the model parameters and outputs human-interpretable?

Actionable
• At most one decision required to respond
• Alerts that are difficult to action become alerts that are ignored

Making alerts more actionable
“investigate”, “something”, “somewhere”, “someone”
Decision trees, interactive tooling, making the alerts specific
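A small sketch of what "making the alerts specific" could look like in tooling: a check that flags alert messages containing the four quoted words. Reading those words as examples of vague wording is an interpretation of the slide, and the function name and sample messages are hypothetical.

```python
import re

# Words quoted on the slide, read here as examples of vague alert wording.
VAGUE_WORDS = ("investigate", "something", "somewhere", "someone")


def vague_words_in(message: str) -> list[str]:
    """Return the vague words that appear as whole words in an alert message."""
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    return [word for word in VAGUE_WORDS if word in tokens]


print(vague_words_in("Something is wrong somewhere, please investigate"))
# ['investigate', 'something', 'somewhere']
print(vague_words_in("Disk usage on db-1 above 90%: expand volume"))
# []
```

A check like this only catches wording; whether the alert names a concrete action and a concrete owner still has to be judged by the people who respond to it.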

If it’s unclear who should be taking action, the alert is not actionable

Triaged
• Meticulously triage alerts
• Alert type should reflect urgency
• Urgency of alerts can change

Steps for triaging
• Commonly-understood tiers
• Regular, periodic re-evaluation process
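The two steps above could be sketched as a shared set of tiers plus a check for alerts that are overdue for re-evaluation. The tier names, the 90-day review window, and the alert names are all assumptions chosen for illustration, not values from the talk.

```python
from datetime import date, timedelta
from enum import Enum


class Tier(Enum):
    """Hypothetical shared urgency tiers; use names your team already understands."""
    PAGE = "page"
    TICKET = "ticket"
    LOG = "log"


def overdue_for_review(alerts, today, max_age=timedelta(days=90)):
    """Return names of alerts whose last triage review is older than max_age."""
    return [name for name, _tier, last_review in alerts if today - last_review > max_age]


# Each entry is (alert name, current tier, date of last triage review).
alerts = [
    ("api-latency", Tier.PAGE, date(2024, 1, 10)),
    ("disk-usage", Tier.TICKET, date(2024, 5, 1)),
]
print(overdue_for_review(alerts, today=date(2024, 6, 1)))  # ['api-latency']
```

Because the urgency of an alert can change, the review is also the moment to move an alert to a different tier, not only to confirm the existing one.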

What’s wrong with Prop 65 warnings?

STAT is just the beginning

Takeaways
• Alert fatigue and decision fatigue deplete executive function
• Tackle alert fatigue and decision fatigue in tandem
• Use STAT as a quick check to evaluate alerting systems
• Regularly re-evaluate your alerts and alerting systems

Thank you! Aditya Mukerjee @chimeracoder
