Monitoring in Motion: Challenges of Monitoring Containers and Kubernetes
Ilan Rabinovitch
February 26, 2016
Transcript
Monitoring in Motion: Challenges in Monitoring Kubernetes & Containers
Cloud Native SF Meetup, Feb 25, 2016
Ilan Rabinovitch, Director, Community, Datadog
About Me
• Long time Datadog user.
• Prior to Datadog built automation and monitoring tooling at Ooyala and Edmunds.com
• SCALE and TXLF Co-Founder
Ilan Rabinovitch, Datadog
[email protected]
@irabinovitch
Agenda
• Monitoring 101 - Crash Course
• Challenges in Monitoring Dynamic Infrastructure
• Demo Time
• Questions?
Monitoring Everything
@honest_update on Twitter
Quick Overview of Datadog
• Monitoring for modern applications.
• Time series storage of metrics and events.
• Trending, alerting and anomaly detection.
• Hundreds of integrations out of the box.
Monitoring 101: Categorization More at: http://goo.gl/t1Rgcg
Monitoring 101: Focus on symptoms More at: http://goo.gl/t1Rgcg
Recurse until you find root cause. More at: http://goo.gl/t1Rgcg
Container Monitoring Challenges
https://www.datadoghq.com/docker-adoption/
Operational Complexity
• Average containers per host: N (N=4, 10/2015)
• N-times as many “hosts” to manage
• Affects everything
Operational Complexity: Scale
• 100 instances / 400 containers
• 160 metrics per host / 640 metrics per host
• 100 instances / 64,000 metrics
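One way to read the scale figures is as simple multiplication by the average of 4 containers per host quoted earlier. The sketch below reproduces the slide numbers under that assumption; treating each container as carrying the same 160 metrics as a host is an interpretation, not something the slides state explicitly.

```python
# Figures taken from the slides.
instances = 100
containers_per_host = 4          # "N=4, 10/2015"
metrics_per_unit = 160           # "160 metrics per host"

# Assumption: each container behaves like one more "host" to monitor,
# so per-host metric volume grows by a factor of N.
total_containers = instances * containers_per_host
metrics_per_host_with_containers = metrics_per_unit * containers_per_host
total_metrics = instances * metrics_per_host_with_containers

print(total_containers, metrics_per_host_with_containers, total_metrics)
```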
Host Centric vs Service Centric
Query Based Monitoring … … …
Query Based Monitoring
• Use tags, labels, etc. on your hosts and metrics.
• Pull in existing labels from your infrastructure (Region, Docker Images, K8S Tags, ...)
By using tags, auto-adapt!
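A minimal sketch of the idea, assuming metrics are stored as points that carry a set of `key:value` tags. The `Point` type, the `select` helper, and all metric and tag names are hypothetical and chosen for illustration; this is not Datadog's API. The query names a tag rather than a host, so a newly started pod is included without changing the query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    metric: str
    value: float
    tags: frozenset


def select(points, metric, required_tags):
    """Return points for `metric` that carry every tag in `required_tags`."""
    required = frozenset(required_tags)
    return [p for p in points if p.metric == metric and required <= p.tags]


points = [
    Point("nginx.requests", 120.0, frozenset({"app:web", "pod:web-1"})),
    Point("nginx.requests", 80.0, frozenset({"app:web", "pod:web-2"})),
    Point("nginx.requests", 50.0, frozenset({"app:api", "pod:api-1"})),
]

# Query by tag, not by host name.
web_total = sum(p.value for p in select(points, "nginx.requests", ["app:web"]))

# A new pod appears; the same query picks it up automatically.
points.append(Point("nginx.requests", 30.0, frozenset({"app:web", "pod:web-3"})))
web_total_after = sum(p.value for p in select(points, "nginx.requests", ["app:web"]))

print(web_total, web_total_after)
```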
• Where is my application running?
• What’s the total throughput of App X?
• What’s its response time per tag (pod, version, DC)?
• What’s the distribution of 5xx from Nginx per pod?
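Questions like these reduce to a filter plus a group-by over tags. The following sketch answers the last one (5xx responses from Nginx per pod) over a small in-memory sample; the sample data, the `nginx.responses` metric name, and the `group_sum` helper are all hypothetical.

```python
from collections import defaultdict

samples = [
    {"metric": "nginx.responses", "tags": {"pod": "web-1", "status": "200"}, "value": 950},
    {"metric": "nginx.responses", "tags": {"pod": "web-1", "status": "502"}, "value": 12},
    {"metric": "nginx.responses", "tags": {"pod": "web-2", "status": "200"}, "value": 900},
    {"metric": "nginx.responses", "tags": {"pod": "web-2", "status": "500"}, "value": 3},
    {"metric": "nginx.responses", "tags": {"pod": "web-2", "status": "503"}, "value": 4},
]


def group_sum(samples, metric, by, where=None):
    """Sum values of `metric` grouped by the tag `by`, keeping samples whose tags satisfy `where`."""
    keep = where or (lambda tags: True)
    totals = defaultdict(int)
    for s in samples:
        if s["metric"] == metric and keep(s["tags"]):
            totals[s["tags"].get(by, "untagged")] += s["value"]
    return dict(totals)


errors_per_pod = group_sum(
    samples,
    "nginx.responses",
    by="pod",
    where=lambda tags: tags.get("status", "").startswith("5"),
)
print(errors_per_pod)
```

Swapping `by="pod"` for another tag key such as a version or datacenter tag would answer the "per tag" variants of the same question.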
Auto Discovery
[Diagram labels: Monitoring Agent; Docker API (containers list, metadata); Kubelet API (additional metadata: pod names, RC, …); Config Backend (integration configurations); Host Level Metrics; containers marked A (application container) or O (off-the-shelf application: Redis, PostgreSQL, …)]
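The auto discovery flow can be sketched roughly as follows: take a list of running containers with their metadata, look up a configuration template by image name, and emit one check configuration per match. This is an illustration only. The container list is hard-coded where a real agent would query the Docker and Kubelet APIs, `TEMPLATES` stands in for the config backend, and every name and field here is hypothetical rather than the agent's actual implementation.

```python
# Hypothetical integration templates, as a config backend might store them.
TEMPLATES = {
    "redis": {"check": "redis", "port": 6379},
    "postgres": {"check": "postgres", "port": 5432},
}


def image_name(image):
    """Strip registry prefix and tag: 'registry:5000/redis:3.0' -> 'redis'."""
    return image.rsplit("/", 1)[-1].split(":", 1)[0]


def discover(containers, templates):
    """Build one check config per container whose image matches a template."""
    configs = []
    for c in containers:
        template = templates.get(image_name(c["image"]))
        if template is None:
            continue  # no template: likely an application container
        configs.append({
            "check": template["check"],
            "host": c["ip"],
            "port": template["port"],
            "tags": ["pod:" + c["pod"]],
        })
    return configs


# Stand-in for container list plus pod metadata.
containers = [
    {"image": "redis:3.0", "ip": "10.0.0.5", "pod": "cache-1"},
    {"image": "registry:5000/postgres:9.4", "ip": "10.0.0.6", "pod": "db-1"},
    {"image": "mycorp/webapp:42", "ip": "10.0.0.7", "pod": "web-1"},
]

configs = discover(containers, TEMPLATES)
print(configs)
```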
Some Pictures
• Dashboards and Metrics
• Alerts
• Sharing
Demo time