Real-time application monitoring

Real-time application monitoring. Coffee and code Donetsk June 2012.

Igor Afonov

June 19, 2012

Transcript

  1. Real-time application monitoring • Igor Afonov • @iafonov
  2. Background • SaaS application • 5 production servers (2 auxiliary servers) • Everything managed by Chef • Apache/Passenger/Rails/MySQL/Postfix
  3. Metrics that matter • Server state • Application state
  4. How metrics data is stored • Round-robin database • RRDTool (C), Whisper (Python) (Minor offtopic)
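
To make the round-robin idea on this slide concrete, here is a minimal sketch in Ruby. It is not how RRDTool or Whisper are implemented; the class name `RoundRobinStore` and its methods are made up for illustration. It only shows the core property: a fixed number of slots, with the oldest interval overwritten once the buffer wraps.

```ruby
# Minimal round-robin store: a fixed number of slots, each covering
# `step` seconds. Hypothetical names, not an RRDTool or Whisper API.
class RoundRobinStore
  def initialize(step:, slots:)
    @step = step
    @slots = slots
    @values = Array.new(slots)   # nil means "no data"
    @stamps = Array.new(slots)   # start time of the interval held in each slot
  end

  # Record a value for the interval containing `time` (integer epoch seconds).
  def update(time, value)
    interval = time - (time % @step)
    index = (interval / @step) % @slots
    @values[index] = value
    @stamps[index] = interval
  end

  # Return the value for the interval containing `time`, or nil if that
  # slot was never written or has since been overwritten.
  def fetch(time)
    interval = time - (time % @step)
    index = (interval / @step) % @slots
    @stamps[index] == interval ? @values[index] : nil
  end
end

store = RoundRobinStore.new(step: 10, slots: 3)
store.update(100, 1.0)
store.update(110, 2.0)
store.update(120, 3.0)
store.update(130, 4.0)   # wraps around and overwrites the slot used for t=100
```

The storage size never grows, which is the trade-off behind this kind of database: constant disk usage in exchange for losing the oldest data.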
  5. Server state • Munin - storage and graphs • Monit - alerts, basic rescue actions • (Diagram labels: one munin-server; two Shards, each with a munin-node)
  6. Munin • Server pulls data from nodes • Gathers basic server health data • A lot of custom plugins
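
As a rough illustration of what a custom plugin looks like, here is a sketch in Ruby. Munin plugins are plain executables: the node runs them with the argument `config` to get graph metadata and with no argument to get current values. The metric itself and the `count_active_sessions` helper are hypothetical stand-ins, not something from the talk.

```ruby
# Sketch of a custom Munin plugin. The "sessions" metric and the helper
# below are hypothetical; only the config/value output format is Munin's.

def count_active_sessions
  # Placeholder: a real plugin would query the application or database.
  42
end

def plugin_output(args)
  if args.first == "config"
    # Graph metadata, requested once by munin-node.
    [
      "graph_title Active sessions",
      "graph_vlabel sessions",
      "graph_category app",
      "sessions.label active sessions"
    ].join("\n")
  else
    # Current reading, requested on every poll.
    "sessions.value #{count_active_sessions}"
  end
end

puts plugin_output(ARGV) if __FILE__ == $PROGRAM_NAME
```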
  7. Munin + Chef
    # Client config template
    munin_servers = search(:node, "role:monitoring")
    <% munin_servers.sort.each do |server| -%>
    allow ^<%= server[:ipaddress].to_s.gsub(/\./, '\.') %>$
    <% end %>
    # Server config template
    munin_nodes = search(:node, "munin:[* TO *]")
    <% munin_nodes.each do |system| -%>
    [<%= system[:hostname] %>]
      address <%= system[:ipaddress] %>
      use_node_name yes
    <% end %>
  8. Application state • Subscriptions • Logins • Orders • Business metrics • ...
  9. Our setup • (Diagram labels: three Shards; Monitoring Server with StatsD and Graphite: Whisper, Carbon, WebApp)
  10. StatsD • Lightweight proxy to Graphite • 300 lines of JavaScript • Uses UDP (small size, non-blocking) • 10+ implementations • Simple protocol
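
The "non-blocking" point about UDP can be sketched with Ruby's standard `socket` library. The helper name `send_metric` is hypothetical, and the local listener below only stands in for a StatsD server so the example runs on its own.

```ruby
require "socket"

# Fire-and-forget UDP send: no connection setup, no waiting for a reply.
# Returns the number of bytes handed to the OS, or nil if the send failed,
# so a monitoring hiccup never raises into application code.
def send_metric(payload, host, port)
  socket = UDPSocket.new
  socket.send(payload, 0, host, port)
rescue SocketError, SystemCallError
  nil
ensure
  socket&.close
end

# Local demonstration: a throwaway listener stands in for StatsD.
listener = UDPSocket.new
listener.bind("127.0.0.1", 0)   # port 0 lets the OS pick a free port
port = listener.addr[1]
sent = send_metric("users.new:1|c", "127.0.0.1", port)
```

Because nothing is acknowledged, a lost datagram just means a lost data point, which is usually an acceptable price for keeping metrics out of the request path.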
  11. StatsD • Count: counter:1|c • Measure: metric:200|ms • Gauge: value:9000|g • Supports sampling • A lot of client-side libraries
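
The wire format on this slide is simple enough to build by hand. The sketch below formats the three metric types and shows one way client-side sampling can work: send only a fraction of events and append the rate (`|@0.1`) so the server can scale the count back up. The function names are hypothetical and not taken from any particular client library.

```ruby
# Build StatsD datagrams by hand. Types: "c" counter, "ms" timer, "g" gauge.
def statsd_line(name, value, type, sample_rate = 1.0)
  line = "#{name}:#{value}|#{type}"
  line += "|@#{sample_rate}" if sample_rate < 1.0
  line
end

# Client-side sampling: emit the counter only for a random fraction of calls.
# Returns the datagram to send, or nil when this event is skipped.
def sampled_counter(name, sample_rate, random: Random.new)
  return nil unless random.rand < sample_rate
  statsd_line(name, 1, "c", sample_rate)
end

puts statsd_line("counter", 1, "c")
puts statsd_line("metric", 200, "ms")
puts statsd_line("value", 9000, "g")
puts statsd_line("users.new", 1, "c", 0.1)
```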
  12. StatsD
    StatsD.server = '178.22.33.88:8125'
    # increment counter
    StatsD.increment("users.new")
    # measure task
    StatsD.measure("cron.#{task}") do
      task.run
    end
    # meta-programming - track subscriptions
    Subscription.extend StatsD::Instrument
    Subscription.statsd_count :subscribe, 'subscriptions'
  13. Graphite • Python everywhere • Whisper - RRD, stores data • Carbon - backend • Graphite - draws graphs, works with data
  14. Graphite (Storage schemas)
    [stats]
    priority = 100
    pattern = ^stats\..*
    retentions = 10:2160,60:10080,600:262974
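
Each entry in `retentions` reads as `seconds_per_point:points`, so the amount of history an archive keeps is the product of the two numbers. A small Ruby sketch (the helper name `parse_retentions` is made up) works this out for the schema on the slide: six hours at 10-second resolution, seven days at one-minute resolution, and roughly five years at ten-minute resolution.

```ruby
# Parse retentions in the "seconds_per_point:points" form and report
# how much history each archive keeps.
def parse_retentions(spec)
  spec.split(",").map do |archive|
    seconds_per_point, points = archive.split(":").map { |n| Integer(n) }
    { seconds_per_point: seconds_per_point,
      points: points,
      covers_seconds: seconds_per_point * points }
  end
end

archives = parse_retentions("10:2160,60:10080,600:262974")
archives.each do |a|
  days = a[:covers_seconds] / 86_400.0
  puts format("%ds resolution x %d points = %.2f days",
              a[:seconds_per_point], a[:points], days)
end
```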
  15. Problems • Basic UI • Complex installation and setup
  16. telemetry.io • Fun project • SaaS application • Custom front-end for StatsD + Graphite • Optimized for big screens (TVs) • Free • Maybe open-source
  17. telemetry.io • (Diagram labels: four Clients; telemetry.io with StatsD, Graphite (Whisper, Carbon, WebApp) and a Custom Front-end)
  18. Workflow • Get access token • Use one of the available libs or create your own • Prepend token to metric name • Integrate into application
    StatsD.increment("#{token}.subscribers")
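
One way to avoid repeating the token at every call site is a thin wrapper around whatever client is in use. This is only a sketch: `TokenScopedClient`, `RecordingClient` and the token value `abc123` are hypothetical, and the recording client exists purely so the example runs without a server.

```ruby
# Wraps any StatsD-like client and prefixes every metric name with a token.
class TokenScopedClient
  def initialize(token, client)
    raise ArgumentError, "token must not be empty" if token.to_s.empty?
    @token = token
    @client = client
  end

  def increment(name)
    @client.increment("#{@token}.#{name}")
  end
end

# Stand-in for a real client: remembers metric names instead of sending them.
class RecordingClient
  attr_reader :sent

  def initialize
    @sent = []
  end

  def increment(name)
    @sent << name
  end
end

recorder = RecordingClient.new
metrics = TokenScopedClient.new("abc123", recorder)
metrics.increment("subscribers")
```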
  23. Implementation • Ruby on Rails • CoffeeScript • Chef (full node bootstrap in 10 minutes)
  24. Links • http://telemetry.io • http://code.flickr.com/blog/2008/10/27/counting-timing/ • http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/
  25. http://iafonov.github.com/ @iafonov