Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Alert Handling with Datadog Incident Management
Search
Takeshi Kondo
August 25, 2020
Technology
0
1.4k
Alert Handling with Datadog Incident Management
JDDUG meetup#1
https://datadog-jp.connpass.com/event/185920/
Takeshi Kondo
August 25, 2020
Tweet
Share
More Decks by Takeshi Kondo
See All by Takeshi Kondo
SRE の考えをマネジメントに活かす / applying SRE ideas to management
chaspy
7
3.5k
RAGの簡易評価によるフィードバックサイクル実践 / Feedback cycle practice through simplified assessment of RAGs
chaspy
2
2.9k
定量データと定性評価を用いた技術戦略の組織的実践 / Systematic implementation of technology strategies using quantitative data and qualitative evaluation
chaspy
9
1.3k
エンジニアブランディングチームの KPI / KPI's of engineer branding team
chaspy
2
1.4k
「SLO Review」今やるならこうする / If I had to do the "SLO Review" again
chaspy
3
1.2k
開発者とともに作る Site Reliability Engineering / SREing with Developers
chaspy
10
7.1k
自己診断能力の獲得を目指して / Toward the acquisition of self-diagnostic skills
chaspy
1
3.8k
『スタディサプリ 中学講座』における E2E Test の運用と計測による改善 / Improved E2E testing through measurement
chaspy
0
3.6k
『スタディサプリ』における SLI/SLO の継続的改善 / Continuous improvement of SLI/SLO at StudySapuri
chaspy
1
2.6k
Other Decks in Technology
See All in Technology
The XZ Backdoor Story
fr0gger
0
2.8k
Autonomous Database Cloud 技術詳細 / adb-s_technical_detail_jp
oracle4engineer
PRO
15
40k
スタッフエンジニアの道: The Staff Engineer’s Path
snoozer05
PRO
30
11k
LandingZoneAccelerator と学ぶ 「スケーラブルで安全なマルチアカウントAWS環境」と 私たちにもできるベストプラクティス
maimyyym
1
120
AI でアップデートする既存テクノロジーと、クラウドエンジニアの生きる道
soracom
PRO
2
380
プロダクトエンジニアを支えるための開発生産性向上施策
tsukakei
0
130
AWS SAW を広めたい @四国クラウドお遍路
kazzpapa3
0
210
効果的なオンコール対応と障害対応
ryuichi1208
5
2.4k
デジタル化・DX推進あるある
y150saya
0
240
サーバレスでモバイルアプリ開発! NTTコム「ビジネスdアプリ」のアーキテクチャ / The architecture of business d app
nttcom
12
200
四国のあのイベントの〇〇システムを45日間で構築した話 / cloudohenro2024_tachibana
biatunky
0
300
Eventual Detection Engineering
ken5scal
0
1.3k
Featured
See All Featured
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
157
15k
Testing 201, or: Great Expectations
jmmastey
36
7k
A Philosophy of Restraint
colly
202
16k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
88
16k
Imperfection Machines: The Place of Print at Facebook
scottboms
263
13k
Speed Design
sergeychernyshev
21
420
Designing Experiences People Love
moore
138
23k
How to name files
jennybc
75
98k
Designing for Performance
lara
604
68k
Product Roadmaps are Hard
iamctodd
PRO
48
10k
Facilitating Awesome Meetings
lara
49
5.9k
Atom: Resistance is Futile
akmur
261
25k
Transcript
Alert Handling with Datadog Incident Management Takeshi Kondo / @chaspy
2020/08/25 JDDUG meetup#1
Datadog Incident Management
Datadog Incident Management https://www.datadoghq.com/blog/incident-response-with-datadog/
Datadog Incident Management https://docs.datadoghq.com/monitors/incident_management/
Datadog Incident Management https://docs.datadoghq.com/monitors/incident_management/ Cool
Who am I chaspy chaspy_ Lead Software Engineer Site Reliability
Engineering at Quipper Takeshi Kondo
Agenda • Introduction of Datadog Incident Management • Alert Handling
in Quipper
Agenda • Introduction of Datadog Incident Management • Alert Handling
in Quipper
Incident Response 6-Step Plan 1. Preparation 2. Identification 3. Containment
4. Eradication 5. Recovery 6. Review lessons learned https://www.varonis.com/blog/incident-response-plan/
Incident Response 6-Step Plan 1. Preparation 2. Identification 3. Containment
4. Eradication 5. Recovery 6. Review lessons learned -> Postmortem https://www.varonis.com/blog/incident-response-plan/ Incident Management
Datadog Incident Management • Overview • Timeline • Remediation
Datadog Incident Management • Overview • Timeline • Remediation
Datadog Incident Management: Overview
Severity Levels: Smart default and configurable
Status Levels and Properties Fields
Datadog Incident Management: Overview
Datadog Incident Management • Overview • Timeline • Remediation
Datadog Incident Management: Timeline
Datadog Incident Management • Overview • Timeline • Remediation
Datadog Incident Management: Remediation
Agenda • Introduction of Datadog Incident Management • Alert Handling
in Quipper
See “Alerting Strategy for Self-Contained Team” https://speakerdeck.com/chaspy/alerting-strategy-for-self-contained-team
Review alerts Daily
Review alerts at Daily Standup
Review alerts at Daily Standup
Thank You! chaspy chaspy_ Lead Software Engineer at Quipper Takeshi
Kondo