Alert Handling with Datadog Incident Management
Takeshi Kondo
August 25, 2020
JDDUG meetup#1
https://datadog-jp.connpass.com/event/185920/
Transcript
Alert Handling with Datadog Incident Management Takeshi Kondo / @chaspy
2020/08/25 JDDUG meetup#1
Datadog Incident Management
Datadog Incident Management https://www.datadoghq.com/blog/incident-response-with-datadog/
Datadog Incident Management https://docs.datadoghq.com/monitors/incident_management/ Cool
Who am I
Takeshi Kondo (chaspy / chaspy_)
Lead Software Engineer, Site Reliability Engineering at Quipper
Agenda
• Introduction of Datadog Incident Management
• Alert Handling in Quipper
Incident Response 6-Step Plan
1. Preparation
2. Identification
3. Containment
4. Eradication
5. Recovery
6. Review lessons learned -> Postmortem
https://www.varonis.com/blog/incident-response-plan/
Incident Management
Datadog Incident Management
• Overview
• Timeline
• Remediation
Datadog Incident Management: Overview
Severity Levels: Smart default and configurable
Status Levels and Properties Fields
Datadog Incident Management: Timeline
Datadog Incident Management: Remediation
See “Alerting Strategy for Self-Contained Team” https://speakerdeck.com/chaspy/alerting-strategy-for-self-contained-team
Review alerts Daily
Review alerts at Daily Standup
Thank You!
Takeshi Kondo (chaspy / chaspy_)
Lead Software Engineer at Quipper