Big Red Button (Chatbots SG Nov 2019)

Amy Nguyen
November 07, 2019

When an incident starts, ten different things need to happen at once. You need to find an incident commander, get all the right people in the room, mitigate the incident, and stay organized. At Stripe, we've built a tool that automates as many of the routine tasks as possible so responders can focus on what humans do best. In this talk, I'll show you the Big Red Button, a web form that sends emails, creates JIRA tickets, opens Slack channels, sends pages, and more. We'll talk about the unique constraints of this tool (such as how much incident metadata to ask for up front) and how our incident management philosophy influenced our design.

Transcript

  1. Big Red Button: How Stripe automates incident management

    Amy Nguyen (She/Her) • @amyngyn • Chatbots SG
  2. What's Stripe?

    • Stripe builds economic infrastructure for the Internet
    • Security and reliability are the most important values we can provide to our users
    • We started an engineering hub in Singapore last year!

    Who are you?

    • I'm a software engineer living here in Singapore developing payment APIs
    • I was on the Reliability Tooling team in San Francisco
    • Find me online at amy.dev and on Twitter @amyngyn
  3. What do you do when something's broken?

  4. What do you do when something's broken?

    • Page someone
    • Find the largest conference room and take it over
    • Scream
    • Fix the problem
  5. What about...

    • Announce where incident firefighting is happening
    • Update your company's status page
    • Update stakeholders and leadership
    • Create a ticket to record the incident
    • Document the incident timeline
    • Write a public-facing retrospective
    • Track remediation items
    • Send messages to users
    • Lock deploys
  7. Introducing Big Red Button

  32-35. Diagram label: web UI

  36. Diagram labels: incident event, web UI

  37. Diagram labels: incident event, web UI, event, event, event

  40. Diagram label: abstract service

  41. Diagram labels: slack service, email service, abstract service, […] service

  42. Diagram labels: slack service, email service, abstract service, […] service, incident close event handler, recovery steps, incident [...] event handler
  44. Thanks!
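
The diagram labels on slides 36 to 42 (a web UI emitting incident events, an abstract service with Slack and email variants, and per-event handlers such as an incident close handler) suggest an event-driven design. The transcript does not include the actual code, so the following is only a minimal Python sketch of one way such a design might look. Every name in it (`IncidentEvent`, `AbstractService`, `on_open`, `on_close`, `dispatch`), the two event kinds, and the action strings are assumptions made for illustration, and the services return descriptions of actions instead of calling any real Slack or email API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class IncidentEvent:
    """Event emitted by the web UI. Field names are assumptions."""
    kind: str            # "open" or "close" in this sketch
    incident_id: str
    summary: str = ""


class AbstractService(ABC):
    """One integration. Subclasses say what to do for each event kind."""

    @abstractmethod
    def on_open(self, event: IncidentEvent) -> List[str]:
        """Return the actions to take when an incident opens."""

    def on_close(self, event: IncidentEvent) -> List[str]:
        """Recovery steps on close; by default there are none."""
        return []

    def handle(self, event: IncidentEvent) -> List[str]:
        handlers = {"open": self.on_open, "close": self.on_close}
        handler = handlers.get(event.kind)
        return handler(event) if handler else []  # unknown kinds are ignored


class SlackService(AbstractService):
    def on_open(self, event):
        return [f"slack: open #incident-{event.incident_id}"]

    def on_close(self, event):
        return [f"slack: archive #incident-{event.incident_id}"]


class EmailService(AbstractService):
    def on_open(self, event):
        return [f"email: announce incident {event.incident_id}: {event.summary}"]


def dispatch(event: IncidentEvent, services: List[AbstractService]) -> List[str]:
    """Fan one event out to every service, isolating failures."""
    actions: List[str] = []
    for service in services:
        try:
            actions.extend(service.handle(event))
        except Exception as exc:  # a broken integration should not block the rest
            actions.append(f"{type(service).__name__} failed: {exc}")
    return actions


if __name__ == "__main__":
    services = [SlackService(), EmailService()]
    for action in dispatch(IncidentEvent("open", "42", "API errors"), services):
        print(action)
```

In this sketch, adding a new integration means writing one more subclass and appending it to the list passed to `dispatch`. Catching exceptions per service is a design choice of the sketch, not something the slides state: during an incident it seems preferable for a failing email integration not to prevent the Slack channel from being opened.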