Slide 1

Slide 1 text

Alert Handling with Datadog Incident Management
Takeshi Kondo / @chaspy
2020/08/25 JDDUG meetup#1

Slide 2

Slide 2 text

Datadog Incident Management

Slide 3

Slide 3 text

Datadog Incident Management https://www.datadoghq.com/blog/incident-response-with-datadog/

Slide 4

Slide 4 text

Datadog Incident Management https://docs.datadoghq.com/monitors/incident_management/

Slide 5

Slide 5 text

Datadog Incident Management https://docs.datadoghq.com/monitors/incident_management/ Cool

Slide 6

Slide 6 text

Who am I
Takeshi Kondo (chaspy / chaspy_)
Lead Software Engineer, Site Reliability Engineering at Quipper

Slide 7

Slide 7 text

Agenda
• Introduction of Datadog Incident Management
• Alert Handling in Quipper

Slide 8

Slide 8 text

Agenda
• Introduction of Datadog Incident Management
• Alert Handling in Quipper

Slide 9

Slide 9 text

Incident Response 6-Step Plan
1. Preparation
2. Identification
3. Containment
4. Eradication
5. Recovery
6. Review lessons learned
https://www.varonis.com/blog/incident-response-plan/

Slide 10

Slide 10 text

Incident Response 6-Step Plan
1. Preparation
2. Identification
3. Containment
4. Eradication
5. Recovery
6. Review lessons learned -> Postmortem
https://www.varonis.com/blog/incident-response-plan/
Incident Management
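The six steps on this slide form a fixed, ordered sequence. As a minimal sketch, assuming Python, that order can be modeled as an enumeration. The names `IncidentStage` and `next_stage` are hypothetical and are not part of Datadog or of the referenced plan.

```python
from enum import IntEnum


class IncidentStage(IntEnum):
    """The six steps from the slide, numbered in the order they are listed."""
    PREPARATION = 1
    IDENTIFICATION = 2
    CONTAINMENT = 3
    ERADICATION = 4
    RECOVERY = 5
    LESSONS_LEARNED = 6  # "Review lessons learned"; the slide maps this to a postmortem


def next_stage(stage):
    """Return the stage that follows `stage`, or None after the final stage."""
    if stage is IncidentStage.LESSONS_LEARNED:
        return None
    return IncidentStage(stage + 1)


if __name__ == "__main__":
    # Walk the plan from the first stage to the last.
    stage = IncidentStage.PREPARATION
    while stage is not None:
        print(stage.value, stage.name)
        stage = next_stage(stage)
```

Using an integer-backed enumeration keeps the ordering explicit, so a stage can be compared or advanced without a separate lookup table.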

Slide 11

Slide 11 text

Datadog Incident Management
• Overview
• Timeline
• Remediation

Slide 12

Slide 12 text

Datadog Incident Management
• Overview
• Timeline
• Remediation

Slide 13

Slide 13 text

Datadog Incident Management: Overview

Slide 14

Slide 14 text

Severity Levels: Smart default and configurable

Slide 15

Slide 15 text

Status Levels and Properties Fields

Slide 16

Slide 16 text

Datadog Incident Management: Overview

Slide 17

Slide 17 text

Datadog Incident Management
• Overview
• Timeline
• Remediation

Slide 18

Slide 18 text

Datadog Incident Management: Timeline

Slide 19

Slide 19 text

Datadog Incident Management
• Overview
• Timeline
• Remediation

Slide 20

Slide 20 text

Datadog Incident Management: Remediation

Slide 21

Slide 21 text

Agenda
• Introduction of Datadog Incident Management
• Alert Handling in Quipper

Slide 22

Slide 22 text

See “Alerting Strategy for Self-Contained Team” https://speakerdeck.com/chaspy/alerting-strategy-for-self-contained-team

Slide 23

Slide 23 text

Review alerts Daily

Slide 24

Slide 24 text

Review alerts at Daily Standup

Slide 25

Slide 25 text

Review alerts at Daily Standup
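The slides do not show how the daily review is prepared. As one possible sketch, assuming Python and assuming alerts are already available as plain records with a monitor name and a firing time, the alerts from the last day could be counted per monitor before the standup. The function `alerts_for_standup` and the record keys `monitor` and `fired_at` are hypothetical and do not come from the Datadog API.

```python
from collections import Counter
from datetime import datetime, timedelta


def alerts_for_standup(alerts, now, window=timedelta(days=1)):
    """Count alerts per monitor that fired within the review window.

    `alerts` is an iterable of dicts with the hypothetical keys
    "monitor" (str) and "fired_at" (datetime). The result is a list of
    (monitor, count) pairs, noisiest monitor first.
    """
    cutoff = now - window
    counts = Counter(
        alert["monitor"]
        for alert in alerts
        if cutoff <= alert["fired_at"] <= now
    )
    return counts.most_common()


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    now = datetime(2020, 8, 25, 10, 0)
    sample = [
        {"monitor": "api latency", "fired_at": now - timedelta(hours=2)},
        {"monitor": "disk usage", "fired_at": now - timedelta(hours=5)},
        {"monitor": "api latency", "fired_at": now - timedelta(hours=20)},
        {"monitor": "api latency", "fired_at": now - timedelta(days=3)},
    ]
    for monitor, count in alerts_for_standup(sample, now):
        print(monitor, count)
```

Sorting by count puts the noisiest monitors at the top of the list, which is where a short daily review is likely to spend its time.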

Slide 26

Slide 26 text

Thank You!
Takeshi Kondo (chaspy / chaspy_)
Lead Software Engineer at Quipper