Alert Handling with Datadog Incident Management
Takeshi Kondo
August 25, 2020
JDDUG meetup#1
https://datadog-jp.connpass.com/event/185920/
Transcript
Alert Handling with Datadog Incident Management
Takeshi Kondo / @chaspy
2020/08/25 JDDUG meetup#1
Datadog Incident Management
• Blog: https://www.datadoghq.com/blog/incident-response-with-datadog/
• Docs: https://docs.datadoghq.com/monitors/incident_management/
Cool
Who am I
Takeshi Kondo (chaspy / chaspy_)
Lead Software Engineer, Site Reliability Engineering at Quipper
Agenda
• Introduction of Datadog Incident Management
• Alert Handling in Quipper
Incident Response 6-Step Plan
1. Preparation
2. Identification
3. Containment
4. Eradication
5. Recovery
6. Review lessons learned -> Postmortem
https://www.varonis.com/blog/incident-response-plan/
Incident Management
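The six steps in the plan form an ordered sequence. As a minimal sketch, they can be written as an enumeration with a helper that returns the following step. The names `ResponseStep` and `next_step` are hypothetical, and the strictly linear progression is an assumption made for illustration; real incidents may loop back to earlier steps.

```python
from enum import IntEnum
from typing import Optional


class ResponseStep(IntEnum):
    """The six steps of the plan, numbered as on the slide."""
    PREPARATION = 1
    IDENTIFICATION = 2
    CONTAINMENT = 3
    ERADICATION = 4
    RECOVERY = 5
    REVIEW_LESSONS_LEARNED = 6  # the postmortem


def next_step(step: ResponseStep) -> Optional[ResponseStep]:
    """Return the step that follows `step`, or None after the final review."""
    if step is ResponseStep.REVIEW_LESSONS_LEARNED:
        return None
    return ResponseStep(step + 1)
```

For example, `next_step(ResponseStep.RECOVERY)` yields `ResponseStep.REVIEW_LESSONS_LEARNED`, the point at which the postmortem is written.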
Datadog Incident Management
• Overview
• Timeline
• Remediation
Datadog Incident Management: Overview
Severity levels: smart defaults, configurable
Status levels and property fields
Datadog Incident Management: Timeline
Datadog Incident Management: Remediation
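The three parts of an incident (overview, timeline, remediation) can be pictured as one record. The following is a rough sketch of such a record, not Datadog's API or data model: the class `Incident`, its fields, and the severity and status labels are all assumptions chosen for illustration, since the slides only say that these levels have defaults and are configurable.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Tuple

# Assumed labels for illustration only; the slides say these are configurable.
SEVERITIES = ("SEV-1", "SEV-2", "SEV-3", "SEV-4", "SEV-5")
STATUSES = ("active", "stable", "resolved")


@dataclass
class Incident:
    """Hypothetical incident record: overview fields, timeline, remediation."""
    title: str
    severity: str = "SEV-3"
    status: str = "active"
    properties: Dict[str, str] = field(default_factory=dict)  # free-form fields
    timeline: List[Tuple[datetime, str]] = field(default_factory=list)
    remediation_tasks: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")
        self.log(f"declared as {self.severity}")

    def log(self, message: str) -> None:
        """Append a timestamped entry to the timeline."""
        self.timeline.append((datetime.now(timezone.utc), message))

    def set_status(self, status: str) -> None:
        """Change the status and record the change on the timeline."""
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status}")
        self.status = status
        self.log(f"status changed to {status}")
```

Recording every status change on the timeline keeps the sequence of events available for the postmortem without anyone having to reconstruct it afterwards.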
Alert Handling in Quipper
See “Alerting Strategy for Self-Contained Team” https://speakerdeck.com/chaspy/alerting-strategy-for-self-contained-team
Review alerts daily
Review alerts at the daily standup
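One way to prepare for such a daily review is to summarize which alerts fired in the past 24 hours. The sketch below is an assumption about how that could be done, not the process described in the talk: the function `alerts_to_review` and the `(fired_at, monitor_name)` tuple format are hypothetical, and fetching the alerts from a monitoring system is left out.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def alerts_to_review(
    alerts: Iterable[Tuple[datetime, str]], now: datetime
) -> List[Tuple[str, int]]:
    """Count alerts per monitor over the 24 hours before `now`.

    `alerts` holds hypothetical (fired_at, monitor_name) pairs.
    The result is ordered from the most to the least frequent monitor.
    """
    since = now - timedelta(days=1)
    counts = Counter(
        name for fired_at, name in alerts if since <= fired_at <= now
    )
    return counts.most_common()
```

Sorting by frequency puts the noisiest monitors at the top of the list, which is where a short standup discussion is likely to be most useful.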
Thank You!
Takeshi Kondo (chaspy / chaspy_)
Lead Software Engineer at Quipper