Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Alert Handling with Datadog Incident Management
Search
Takeshi Kondo
August 25, 2020
Technology
0
1.6k
Alert Handling with Datadog Incident Management
JDDUG meetup#1
https://datadog-jp.connpass.com/event/185920/
Takeshi Kondo
August 25, 2020
Tweet
Share
More Decks by Takeshi Kondo
See All by Takeshi Kondo
SRE NEXT CfP チームが語る 聞きたくなるプロポーザルとは / Proposals by the SRE NEXT CfP Team that are sure to be accepted
chaspy
2
1.4k
Slack Platform(Deno) での RAG 実装 - LangChain(js) を使ってみた / rag-implementation-on-slack-platform-deno-experimenting-with-langchain-js
chaspy
0
250
SRE の考えをマネジメントに活かす / applying SRE ideas to management
chaspy
7
7.8k
RAGの簡易評価によるフィードバックサイクル実践 / Feedback cycle practice through simplified assessment of RAGs
chaspy
2
5.6k
定量データと定性評価を用いた技術戦略の組織的実践 / Systematic implementation of technology strategies using quantitative data and qualitative evaluation
chaspy
9
2k
エンジニアブランディングチームの KPI / KPI's of engineer branding team
chaspy
2
2.3k
「SLO Review」今やるならこうする / If I had to do the "SLO Review" again
chaspy
3
2.1k
開発者とともに作る Site Reliability Engineering / SREing with Developers
chaspy
10
8.5k
自己診断能力の獲得を目指して / Toward the acquisition of self-diagnostic skills
chaspy
1
5.3k
Other Decks in Technology
See All in Technology
様々なファイルシステム
sat
PRO
0
270
DMMの検索システムをSolrからElasticCloudに移行した話
hmaa_ryo
0
290
serverless team topology
_kensh
3
250
触れるけど壊れないWordPressの作り方
masakawai
0
140
[Journal club] Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces
keio_smilab
PRO
0
100
Azure Well-Architected Framework入門
tomokusaba
1
150
ストレージエンジニアの仕事と、近年の計算機について / 第58回 情報科学若手の会
pfn
PRO
4
920
オブザーバビリティと育てた ID管理・認証認可基盤の歩み / The Journey of an ID Management, Authentication, and Authorization Platform Nurtured with Observability
kaminashi
2
1.5k
OpenCensusと歩んだ7年間
bgpat
0
260
境界線が消える世界におけるQAエンジニアのキャリアの可能性を考える / Considering the Career Possibilities for QA Engineers
mii3king
2
100
OPENLOGI Company Profile for engineer
hr01
1
46k
猫でもわかるAmazon Q Developer CLI 解体新書
kentapapa
1
180
Featured
See All Featured
Building Applications with DynamoDB
mza
96
6.7k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
9
1k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
127
54k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.7k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
52
5.7k
Build The Right Thing And Hit Your Dates
maggiecrowley
38
2.9k
Docker and Python
trallard
46
3.6k
How GitHub (no longer) Works
holman
315
140k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
26
3.1k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
31
2.7k
Side Projects
sachag
455
43k
Building Flexible Design Systems
yeseniaperezcruz
329
39k
Transcript
Alert Handling with Datadog Incident Management Takeshi Kondo / @chaspy
2020/08/25 JDDUG meetup#1
Datadog Incident Management
Datadog Incident Management https://www.datadoghq.com/blog/incident-response-with-datadog/
Datadog Incident Management https://docs.datadoghq.com/monitors/incident_management/
Datadog Incident Management https://docs.datadoghq.com/monitors/incident_management/ Cool
Who am I chaspy chaspy_ Lead Software Engineer Site Reliability
Engineering at Quipper Takeshi Kondo
Agenda • Introduction of Datadog Incident Management • Alert Handling
in Quipper
Agenda • Introduction of Datadog Incident Management • Alert Handling
in Quipper
Incident Response 6-Step Plan 1. Preparation 2. Identification 3. Containment
4. Eradication 5. Recovery 6. Review lessons learned https://www.varonis.com/blog/incident-response-plan/
Incident Response 6-Step Plan 1. Preparation 2. Identification 3. Containment
4. Eradication 5. Recovery 6. Review lessons learned -> Postmortem https://www.varonis.com/blog/incident-response-plan/ Incident Management
Datadog Incident Management • Overview • Timeline • Remediation
Datadog Incident Management • Overview • Timeline • Remediation
Datadog Incident Management: Overview
Severity Levels: Smart default and configurable
Status Levels and Properties Fields
Datadog Incident Management: Overview
Datadog Incident Management • Overview • Timeline • Remediation
Datadog Incident Management: Timeline
Datadog Incident Management • Overview • Timeline • Remediation
Datadog Incident Management: Remediation
Agenda • Introduction of Datadog Incident Management • Alert Handling
in Quipper
See “Alerting Strategy for Self-Contained Team” https://speakerdeck.com/chaspy/alerting-strategy-for-self-contained-team
Review alerts Daily
Review alerts at Daily Standup
Review alerts at Daily Standup
Thank You! chaspy chaspy_ Lead Software Engineer at Quipper Takeshi
Kondo