Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Speaker Deck
PRO
Sign in
Sign up
for free
Real-time application monitoring
Igor Afonov
June 19, 2012
Programming
5
860
Real-time application monitoring
Real-time application monitoring. Coffee and code Donetsk June 2012.
Igor Afonov
June 19, 2012
Tweet
Share
More Decks by Igor Afonov
See All by Igor Afonov
iafonov
6
500
iafonov
8
870
iafonov
6
460
Other Decks in Programming
See All in Programming
hanhan1978
0
300
nrslib
20
13k
takahi5
0
230
shigeruoda
0
480
hr01
1
1.3k
adoranwodo
0
230
attsumi
1
460
yshrsmz
1
460
horie1024
1
390
wasabeef
1
570
fkubota
1
400
ken3ypa
0
160
Featured
See All Featured
denniskardys
220
120k
dotmariusz
94
5.1k
bryan
30
3.3k
lara
16
2.6k
hursman
106
9.2k
addyosmani
494
110k
caitiem20
308
17k
myddelton
109
11k
scottboms
251
11k
thoeni
4
550
tmm1
61
8.4k
addyosmani
310
21k
Transcript
Real-time application monitoring Igor Afonov @iafonov
Background • SaaS application • 5 production servers (2 auxiliary
servers) • Everything managed by chef • Apache/Passenger/Rails/MySQL/Postfix
Metrics that matter • Server state • Application state
How metrics data is stored • Round-robin database • RRDTool
(C), Whisper (Python) (Minor offtopic)
Server state • Munin - storage and graphs • Monit
- alerts, basic rescue actions Shard munin-server munin-node Shard munin-node
Munin • Server pulls data from nodes • Gathers basic
server health data • A lot of custom plugins
Munin + Chef = — # Client config template munin_servers
= search(:node, "role:monitoring") <% munin_servers.sort.each do |server| -%> allow ^<%= server[:ipaddress].to_s.gsub(/\./, '\.') %>$ <% end %> # Server config template munin_nodes = search(:node, "munin:[* TO *]") <% munin_nodes.each do |system| -%> [<%= system[:hostname] %>] address <%= system[:ipaddress] %> use_node_name yes <% end %>
Application state • Subscriptions • Logins • Orders • Business
metrics • ...
Our setup Shard Monitoring Server StatsD Graphite Whisper Carbon WebApp
Shard Shard
StatsD • Lightweight proxy to graphite • 300 lines of
Javascript • Uses UDP (small size, non-blocking) • 10+ implementations • Simple protocol
StatsD • Count: counter:1|c • Measure: metric:200|ms • Gauge: value:9000|g
• Supports sampling • A lot of client-side libraries
StatsD StatsD.server = '178.22.33.88:8125' # increment counter StatsD.increment("users.new") # measure
task StatsD.measure("cron.#{task}") do task.run end # meta-programming - track subscriptions Subscription.extend StatsD::Instrument Subscription.statsd_count :subscribe, 'subscriptions'
Graphite • Python everywhere • Whisper - RRD, stores data
• Carbon - backend • Graphite - draws graphs, works with data
Graphite [stats] priority = 100 pattern = ^stats\..* retentions =
10:2160,60:10080,600:262974 (Storage schemas)
Problems • Basic UI • Complex installation and setup
telemetry.io • Fun project • SaaS application • Custom front-end
for StatsD + Graphite • Optimized for big screens (TVs) • Free • Maybe open-source
telemetry.io Client telemetry.io StatsD Graphite Whisper Carbon WebApp Custom Front-end
Client Client Client
Workflow • Get access token • Use one of available
libs or create own • Prepend token to metric name • Integrate into application StatsD.increment("#{token}.subscribers")
None
None
None
None
Implementation • Ruby on Rails • CoffeeScript • Chef (full
node bootstrap in 10 minutes)
Links • http://telemetry.io • http://code.flickr.com/blog/2008/10/27/ counting-timing/ • http://codeascraft.etsy.com/2011/02/15/ measure-anything-measure-everything/
http://iafonov.github.com/ @iafonov