Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Monitoring is dead
Search
Sponsored
·
SiteGround - Reliable hosting with speed, security, and support you can count on.
→
Sebastian Montini
September 21, 2018
Technology
0
240
Monitoring is dead
Sebastian Montini
September 21, 2018
Tweet
Share
More Decks by Sebastian Montini
See All by Sebastian Montini
AWS Community Day BA 2019
sebamontini
0
79
Giraffe: our journey to support 1 million metrics per second
sebamontini
0
180
Nomad-PyCon2017
sebamontini
0
87
Atlas, a PaaS with batteries included
sebamontini
0
78
Nomad: The sequel
sebamontini
1
160
Nomad, a love story
sebamontini
0
160
Aurora: 5 Tb later ...
sebamontini
0
85
Ansible 202 - Sysarmy Meetup
sebamontini
0
100
Cloud Computing: All that glitters is not AWS - Nerdear.la 2016
sebamontini
0
65
Other Decks in Technology
See All in Technology
仕様書駆動AI開発の実践: Issue→Skill→PRテンプレで 再現性を作る
knishioka
2
640
【Oracle Cloud ウェビナー】[Oracle AI Database + AWS] Oracle Database@AWSで広がるクラウドの新たな選択肢とAI時代のデータ戦略
oracle4engineer
PRO
2
140
Agile Leadership Summit Keynote 2026
m_seki
1
600
ブロックテーマ、WordPress でウェブサイトをつくるということ / 2026.02.07 Gifu WordPress Meetup
torounit
0
180
配列に見る bash と zsh の違い
kazzpapa3
1
140
~Everything as Codeを諦めない~ 後からCDK
mu7889yoon
3
330
OWASP Top 10:2025 リリースと 少しの日本語化にまつわる裏話
okdt
PRO
3
720
こんなところでも(地味に)活躍するImage Modeさんを知ってるかい?- Image Mode for OpenShift -
tsukaman
0
130
生成AI時代にこそ求められるSRE / SRE for Gen AI era
ymotongpoo
5
3.1k
Claude_CodeでSEOを最適化する_AI_Ops_Community_Vol.2__マーケティングx_AIはここまで進化した.pdf
riku_423
2
550
レガシー共有バッチ基盤への挑戦 - SREドリブンなリアーキテクチャリングの取り組み
tatsukoni
0
210
Sansan Engineering Unit 紹介資料
sansan33
PRO
1
3.8k
Featured
See All Featured
The World Runs on Bad Software
bkeepers
PRO
72
12k
SEO for Brand Visibility & Recognition
aleyda
0
4.2k
Optimising Largest Contentful Paint
csswizardry
37
3.6k
Public Speaking Without Barfing On Your Shoes - THAT 2023
reverentgeek
1
300
Darren the Foodie - Storyboard
khoart
PRO
2
2.4k
DBのスキルで生き残る技術 - AI時代におけるテーブル設計の勘所
soudai
PRO
62
49k
Designing for Performance
lara
610
70k
The Director’s Chair: Orchestrating AI for Truly Effective Learning
tmiket
1
96
技術選定の審美眼(2025年版) / Understanding the Spiral of Technologies 2025 edition
twada
PRO
117
110k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
26
3.3k
Leading Effective Engineering Teams in the AI Era
addyosmani
9
1.6k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
141
34k
Transcript
@sebamontini MEDALLIA Monitoring is Dead And why you’re (probably) doing
it wrong
@sebamontini MEDALLIA
@sebamontini MEDALLIA Why?
@sebamontini MEDALLIA The big 5
@sebamontini MEDALLIA The big 5 ✓ CPU → uptime |
mailx -s “cpu” root ✓ MEM → free | mailx -s “mem” root ✓ DISK → (df -h; du -sh /home/*) | mailx -s “disk” root ✓ PROC → (ps -ef | grep important) | mailx -s root ✓ SYS → ping -c 4 google.com | mailx -s root
@sebamontini MEDALLIA OK: x < something
@sebamontini MEDALLIA WARN: something < x < something
@sebamontini MEDALLIA CRITICAL: x > something
@sebamontini MEDALLIA
@sebamontini MEDALLIA Observability
@sebamontini MEDALLIA A system is observable if you can determine
the behavior of the system based on it’s outputs.
@sebamontini MEDALLIA A system is observable if you can determine
the behavior of the system based on it’s outputs.
@sebamontini MEDALLIA A system is a set of connected components.
@sebamontini MEDALLIA A system is observable if you can determine
the behavior of the system based on it’s outputs.
@sebamontini MEDALLIA The manner in which a system acts is
it’s behavior.
@sebamontini MEDALLIA A system is observable if you can determine
the behavior of the system based on it’s outputs.
@sebamontini MEDALLIA The outputs of a system are the concrete
results of it’s behaviors.
@sebamontini MEDALLIA Monitoring is the action of observing and checking
the behavior and outputs of a system and it’s components over time.
@sebamontini MEDALLIA The (real) big 5
@sebamontini MEDALLIA Instrumentation Collection Storage Alerting Visualization
@sebamontini MEDALLIA Instrumentation
@sebamontini MEDALLIA Gauges Counters Histogram Timers
@sebamontini MEDALLIA Gauges A gauge is an instantaneous measurement of
a value. For example, we may want to measure the number of pending jobs in a queue
@sebamontini MEDALLIA Counters A counter is just a gauge that
you can increment or decrement its value. For example, we may want a more efficient way of measuring the pending job in a queue
@sebamontini MEDALLIA Histogram A histogram measures the statistical distribution of
values in a stream of data like median or percentiles
@sebamontini MEDALLIA Timers A timer measures both the rate that
a particular piece of code is called and the distribution of its duration.
@sebamontini MEDALLIA Collection
@sebamontini MEDALLIA
@sebamontini MEDALLIA Storage
@sebamontini MEDALLIA Storage
@sebamontini MEDALLIA Alerting
@sebamontini MEDALLIA Thresholds Dead man Delta Anomaly detection
@sebamontini MEDALLIA Visualization
@sebamontini MEDALLIA
@sebamontini MEDALLIA The big 5 ✓ Instrumentation → gauges, histograms,
timers, counters ✓ Collection → pull vs push ✓ Storage → Time Series DB ✓ Alerting → threshold, flatline, delta, anomaly ✓ Visualization → dashboards
@sebamontini MEDALLIA
@sebamontini MEDALLIA The Four Golden Signals
@sebamontini MEDALLIA Latency Traffic Errors Saturation
@sebamontini MEDALLIA Thanks