Ilan Rabinovitch
February 26, 2016
Monitoring in Motion: Challenges of Monitoring Containers and Kubernetes
Transcript
Monitoring In Motion: Challenges in Monitoring Kubernetes & Containers
Cloud Native SF Meetup, Feb 25, 2016
Ilan Rabinovitch, Director, Community, Datadog
About Me
• Long time Datadog user.
• Prior to Datadog built automation and monitoring tooling at Ooyala and Edmunds.com
• SCALE and TXLF Co-Founder
Ilan Rabinovitch, Datadog
[email protected]
@irabinovitch
Agenda
• Monitoring 101 - Crash Course
• Challenges in Monitoring Dynamic Infrastructure
• Demo Time
• Questions?
Monitoring Everything
@honest_update on Twitter
Quick Overview of Datadog
• Monitoring for modern applications.
• Time series storage of metrics and events.
• Trending, alerting and anomaly detection.
• Hundreds of integrations out of the box.
Monitoring 101: Categorization More at: http://goo.gl/t1Rgcg
Monitoring 101: Focus on symptoms More at: http://goo.gl/t1Rgcg
Recurse until you find root cause. More at: http://goo.gl/t1Rgcg
Container Monitoring Challenges
https://www.datadoghq.com/docker-adoption/
Operational Complexity
• Average containers per host: N (N=4, 10/2015)
• N-times as many “hosts” to manage
• Affects everything
Operational Complexity: Scale
• 100 instances, 400 containers
• 160 metrics per host, 640 metrics per host
• 100 instances, 64,000 metrics
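The scale figures above can be reproduced with a few lines of arithmetic. This is only a sketch: the slides give the numbers but not the formula, so reading 640 as 160 metrics multiplied by N=4 containers per host is an assumption, and the function name `fleet_totals` is made up for illustration.

```python
def fleet_totals(instances, containers_per_host, base_metrics_per_host):
    """Return (containers, metrics per host, total metrics) for a fleet.

    Assumption: per-host metric volume scales linearly with the number
    of containers on the host, which matches the slide figures.
    """
    containers = instances * containers_per_host
    metrics_per_host = base_metrics_per_host * containers_per_host
    total_metrics = instances * metrics_per_host
    return containers, metrics_per_host, total_metrics


# Figures from the slides: 100 instances, N=4, 160 metrics per host.
containers, per_host, total = fleet_totals(100, 4, 160)
print(containers, per_host, total)
```

With the slide values this yields 400 containers, 640 metrics per host and 64,000 metrics in total, which is why "affects everything" is not an exaggeration.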
Host Centric vs Service Centric
Query Based Monitoring … … …
Query Based Monitoring
• Use tags, labels, etc. on your hosts and metrics.
• Pull in existing labels from your infrastructure (Region, Docker Images, K8S Tags..)
By using tags, auto-adapt!
Where is my application running?
What’s the total throughput of App X?
What’s its response time per tag (pod, version, DC)?
What’s the distribution of 5xx from Nginx per pod?
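The questions above all reduce to "filter by some tags, then group by another tag". The sketch below shows that idea on an in-memory list. It is not Datadog's query language or API: the metric names, tag keys, sample values and the `query` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample points: (metric name, tags, value).
POINTS = [
    ("app.requests", {"app": "x", "pod": "web-1", "dc": "us-east"}, 120.0),
    ("app.requests", {"app": "x", "pod": "web-2", "dc": "us-east"}, 80.0),
    ("app.requests", {"app": "y", "pod": "api-1", "dc": "us-west"}, 50.0),
    ("nginx.5xx", {"app": "x", "pod": "web-1", "dc": "us-east"}, 3.0),
    ("nginx.5xx", {"app": "x", "pod": "web-2", "dc": "us-east"}, 1.0),
]


def query(points, metric, filters=None, group_by=None):
    """Sum `metric` over points matching every filter tag.

    Results are keyed by the value of the `group_by` tag, or by "*"
    when no grouping is requested.
    """
    filters = filters or {}
    totals = defaultdict(float)
    for name, tags, value in points:
        if name != metric:
            continue
        if any(tags.get(key) != wanted for key, wanted in filters.items()):
            continue
        group = tags.get(group_by, "untagged") if group_by else "*"
        totals[group] += value
    return dict(totals)


# Total throughput of App X.
throughput_x = query(POINTS, "app.requests", filters={"app": "x"})
# Where is App X running? Group the same series by datacenter.
where_x = query(POINTS, "app.requests", filters={"app": "x"}, group_by="dc")
# Distribution of 5xx from Nginx per pod.
errors_per_pod = query(POINTS, "nginx.5xx", group_by="pod")
```

Because the query names tags rather than hosts, a container that starts on a new host with the same tags is included without changing the query.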
Auto Discovery
[Diagram: a Monitoring Agent runs alongside containers marked A (Application Container) and O (Off-The-Shelf Application: Redis, PostgreSQL, …). Labeled inputs to the agent: Docker API (Containers List, Metadata), Kubelet API (Additional Metadata: Pod names, RC, …), Config Backend (Integration Configurations), Host Level Metrics.]
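One way to read the auto discovery flow is: list the running containers, look up a configuration template by image name, and fill in each container's address. The sketch below uses plain dictionaries as stand-ins for the Docker API, Kubelet API and config backend responses. Every name, field and placeholder in it is hypothetical and it is not the agent's actual implementation.

```python
# Hypothetical stand-in for the container list and metadata the agent
# would read from the Docker API and the Kubelet API.
CONTAINERS = [
    {"image": "redis:3.0", "ip": "10.0.0.5", "port": 6379, "pod": "cache-1"},
    {"image": "myorg/app:1.4", "ip": "10.0.0.6", "port": 8080, "pod": "web-1"},
    {"image": "postgres:9.4", "ip": "10.0.0.7", "port": 5432, "pod": "db-1"},
]

# Hypothetical stand-in for the config backend: image name -> template.
TEMPLATES = {
    "redis": {"check": "redis", "instance": {"host": "{ip}", "port": "{port}"}},
    "postgres": {"check": "postgres", "instance": {"host": "{ip}", "port": "{port}"}},
}


def image_name(image):
    """Drop any registry/org prefix and the tag: 'myorg/app:1.4' -> 'app'."""
    return image.rsplit("/", 1)[-1].split(":", 1)[0]


def discover(containers, templates):
    """Build one check configuration per container with a known template."""
    checks = []
    for container in containers:
        template = templates.get(image_name(container["image"]))
        if template is None:
            continue  # no off-the-shelf integration for this image
        instance = {
            key: value.format(ip=container["ip"], port=container["port"])
            for key, value in template["instance"].items()
        }
        tags = ["pod:" + container["pod"], "image:" + container["image"]]
        checks.append({"check": template["check"], "instance": instance, "tags": tags})
    return checks


checks = discover(CONTAINERS, TEMPLATES)
```

Here the Redis and PostgreSQL containers each get a check with their own address and pod tag, while the custom application container is skipped because no template matches its image.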
Some Pictures: Dashboards and Metrics, Alerts, Sharing
Demo time