Upgrade to PRO for Only $50/Year—Limited-Time Offer! 🔥
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Monitoring in Motion: Challenges of Monitoring ...
Search
Ilan Rabinovitch
February 26, 2016
Technology
0
100
Monitoring in Motion: Challenges of Monitoring Containers and Kuberntes
Ilan Rabinovitch
February 26, 2016
Tweet
Share
More Decks by Ilan Rabinovitch
See All by Ilan Rabinovitch
Monitoring in Motion - ContainerCon 2016
irabinovitch
0
100
Data Driven Post Mortems at Datadog - LinuxCon 2016
irabinovitch
1
210
Introduction to Docker Monitoring
irabinovitch
0
150
OSCON 2016 - Monitoring in Motion
irabinovitch
2
180
Monitoring OpenStack at Lithium (OpenStack Summit Austin 2016)
irabinovitch
0
69
LinuxFest Northwest 2016 - Monitoring 101
irabinovitch
0
44
Monitoring ECS and Dynamic Infrastructure
irabinovitch
0
110
Doing DevOps Right with Datadog + Pagerduty
irabinovitch
0
130
Docker Usage Patterns - Docker Meetup Palo Alto - Nov 2015
irabinovitch
0
68
Other Decks in Technology
See All in Technology
小さな判断で育つ、大きな意思決定力 / 20251204 Takahiro Kinjo
shift_evolve
PRO
1
570
寫了幾年 Code,然後呢?軟體工程師必須重新認識的 DevOps
cheng_wei_chen
1
530
re:Invent2025 コンテナ系アップデート振り返り(+CloudWatchログのアップデート紹介)
masukawa
0
300
Uncertainty in the LLM era - Science, more than scale
gaelvaroquaux
0
800
GitHub Copilotを使いこなす 実例に学ぶAIコーディング活用術
74th
3
1.2k
コミューンのデータ分析AIエージェント「Community Sage」の紹介
fufufukakaka
0
430
Playwright x GitHub Actionsで実現する「レビューしやすい」E2Eテストレポート
kinosuke01
0
320
ログ管理の新たな可能性?CloudWatchの新機能をご紹介
ikumi_ono
0
460
21st ACRi Webinar - Univ of Tokyo Presentation Slide (Ayumi Ohno)
nao_sumikawa
0
120
21st ACRi Webinar - Univ of Tokyo Presentation Slide (Shinya Takamaeda)
nao_sumikawa
0
120
“決まらない”NSM設計への処方箋 〜ビットキーにおける現実的な指標デザイン事例〜 / A Prescription for "Stuck" NSM Design: Bitkey’s Practical Case Study
bitkey
PRO
1
580
Oracle Technology Night #95 GoldenGate 26ai の実装に迫る1
oracle4engineer
PRO
0
150
Featured
See All Featured
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
52
5.7k
GraphQLの誤解/rethinking-graphql
sonatard
73
11k
The Hidden Cost of Media on the Web [PixelPalooza 2025]
tammyeverts
1
92
Code Reviewing Like a Champion
maltzj
527
40k
Become a Pro
speakerdeck
PRO
31
5.7k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
12
970
Being A Developer After 40
akosma
91
590k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
31
9.8k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
13k
The Power of CSS Pseudo Elements
geoffreycrofte
80
6.1k
YesSQL, Process and Tooling at Scale
rocio
174
15k
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.5k
Transcript
Monitoring In Motion Challenges in Monitoring Kubernetes & Containers Cloud
Native SF Meetup Feb 25, 2016 Ilan Rabinovitch Director, Community Datadog
About Me • Long time Datadog user. • Prior to
Datadog built automation and monitoring tooling at Ooyala and Edmunds.com • SCALE and TXLF Co-Founder Ilan Rabinovitch Datadog
[email protected]
@irabinovitch
Agenda • Monitoring 101 - Crash Course • Challenges in
Monitoring Dynamic Infrastructure • Demo Time • Questions?
Monitoring Everything
None
@honest_update on Twitter
Quick Overview of Datadog • Monitoring for modern applications. •
Time series storage of metrics and events. • Trending, alerting and anomaly detection. • Hundreds of integrations out of the box.
Monitoring 101: Categorization More at: http://goo.gl/t1Rgcg
None
Monitoring 101: Focus on symptoms More at: http://goo.gl/t1Rgcg
Recurse until you find root cause. More at: http://goo.gl/t1Rgcg
Container Monitoring Challenges
https://www.datadoghq.com/docker-adoption/
None
None
Operational Complexity •Average containers per host: N (N=4, 10/2015) •N-times
as many “hosts” to manage •Affects everything
Operational Complexity: Scale 100 instances 400 containers
Operational Complexity: Scale 160 metrics per host 640 metrics per
host
Operational Complexity: Scale 100 instances 64,000 metrics
None
Host Centric vs Service Centric
Host Centric vs Service Centric
Query Based Monitoring … … …
•Use tags, labels, etc on your hosts and metrics. •Pull
in existing labels from your infrastructure (Region, Docker Images, K8S Tags..) Query Based Monitoring By using tags, auto-adapt!
Where is my application running ? What’s the total throughput
of App X ? What’s its response time per tag ? (pod, version, DC) What’s the distribution of 5xx from Nginx per pod ?
Auto Discovery
Docker API Kubelet API Monitoring Agent Container A O A
O A O Application Container Off-The-Shelf Application (Redis, PostgreSQL, …) Containers List Metadata Additional Metadata (Pod names, RC, …) Config Backend Integration Configurations Host Level Metrics
Some Pictures Dashboards and Metrics Alerts Sharing
Demo time