Hynek Schlawack - Beyond grep: Practical Logging and Metrics

Your Python server applications are running but you’re wondering what they are doing? Your only clue about their current state is the server load? Let’s take a stroll through the landscape of logging and metrics so you’ll find the perfect fit for your use cases!

https://us.pycon.org/2015/schedule/presentation/353/

PyCon 2015

April 18, 2015

Transcript

  1. Vanilla

        from raven import Client

        client = Client("https://yoursentry")

        try:
            1 / 0
        except ZeroDivisionError:
            client.captureException()
  6. System Metrics vs App Metrics

    System metrics:
    • load
    • network traffic
    • I/O
    • …

    App metrics:
    • counters
    • timers
    • gauges
    • …
  7. Math

    • # reqs / s?
    • worst 0.01% ⟨req time⟩?
    • don’t try this alone!
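To make the two questions on the Math slide concrete, here is a minimal sketch. It assumes request start times and durations have already been collected as plain lists, which the talk does not specify. The helper names `request_rate` and `percentile` are hypothetical, and the percentile uses the simple nearest-rank method. "Worst 0.01%" is read here as the 99.99th percentile. In the spirit of "don't try this alone", this only illustrates what is being asked and is not a replacement for a metrics library.

```python
import math


def request_rate(timestamps):
    """Average requests per second over the observed window."""
    if len(timestamps) < 2:
        return 0.0
    window = max(timestamps) - min(timestamps)
    return len(timestamps) / window if window > 0 else 0.0


def percentile(durations, pct):
    """Nearest-rank percentile: the value at rank ceil(pct% of n), 1-based."""
    if not durations:
        raise ValueError("no samples")
    ordered = sorted(durations)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical samples: one request every 10 ms, with a few very slow outliers.
timestamps = [i * 0.01 for i in range(10_000)]
durations = [0.009] * 9_990 + [30.0] * 10

rate = request_rate(timestamps)
median = percentile(durations, 50)
p9999 = percentile(durations, 99.99)
```

The median hides the outliers entirely, while the high percentile surfaces them, which is why the slide asks about the worst fraction of requests rather than the average.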
  10. Approaches

    1. external aggregation: StatsD, Riemann
       + no state, simple
       – no direct introspection
    2. aggregate in-app, deliver to DB
       + in-app dashboard, useful in dev
       – state w/i app
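The first approach, external aggregation, can be sketched with nothing but a UDP socket: the app emits one small datagram per observation and keeps no state of its own. The sketch below uses the StatsD line format `<name>:<value>|<type>`. The metric names and the helpers `format_metric` and `send_metric` are made up for illustration, and the host and port are assumptions (8125 is the usual StatsD default).

```python
import socket


def format_metric(name, value, kind):
    """Render one metric in the StatsD line format: <name>:<value>|<type>."""
    return f"{name}:{value}|{kind}"


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget one datagram; the app itself keeps no aggregation state."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


counter = format_metric("app.reqs", 1, "c")           # counter increment
timer = format_metric("app.request_time", 12, "ms")   # timing in milliseconds
gauge = format_metric("app.queue_depth", 7, "g")      # current gauge value
```

Calling `send_metric(counter)` once per request would be enough for the aggregator to derive rates. The trade-off from the slide shows up directly: the app is simple, but it cannot answer questions about its own metrics.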
  11. Scales

        from greplin import scales
        from greplin.scales.meter import MeterStat

        STATS = scales.collection(
            "/Resource",
            MeterStat("reqs"),
            scales.PmfStat("request_time")
        )
  13. Dashboard Scales

        …
        "request_time": {
            "count": 567315293,
            "99percentile": 0.10978688716888428,
            "75percentile": 0.013181567192077637,
            "min": 0.0002448558807373047,
            "max": 30.134822130203247,
            "98percentile": 0.08934824466705339,
            "95percentile": 0.027234303951263434,
            "median": 0.009176492691040039,
            "999percentile": 0.14235656142234793,
            "stddev": 0.01676855570363413,
            "mean": 0.013247184020535955
        },
        …
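The dashboard output above is what in-app aggregation looks like from the outside: the process keeps the samples and summarizes them on demand. The following is a standard-library-only sketch of that idea and not the Scales API. The class `InAppStats` and its methods are hypothetical, and it reports only a subset of the fields shown on the slide.

```python
import json
import statistics
import time
from contextlib import contextmanager


class InAppStats:
    """Hypothetical in-process aggregator: the state lives inside the app."""

    def __init__(self):
        self.request_times = []

    @contextmanager
    def timed(self):
        """Record how long the wrapped block took, in seconds."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.request_times.append(time.perf_counter() - start)

    def snapshot(self):
        """Summarize the samples collected so far as a JSON-ready dict."""
        samples = self.request_times
        if not samples:
            return {"count": 0}
        return {
            "count": len(samples),
            "min": min(samples),
            "max": max(samples),
            "mean": statistics.mean(samples),
            "median": statistics.median(samples),
            "stddev": statistics.pstdev(samples),
        }


stats = InAppStats()
for _ in range(3):
    with stats.timed():
        sum(range(1000))  # stand-in for handling a request

dashboard = json.dumps({"request_time": stats.snapshot()}, indent=2)
```

Keeping every sample in a list is only workable for a demo; anything long-running would need to bound its memory, which is part of the "state within the app" cost.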
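  14. structlog

    [Diagram: a BoundLogger wraps the Original Logger and holds a Context.
    • bind values: log.bind(key=value) adds to the Context
    • log events: log.info(event, another_key=another_value) combines the event with the Context
    • the result passes through Processor 1 … Processor n, each Return Value feeding the next step, and the final Return Value goes to the Original Logger]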
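The bind-then-process flow named in the diagram labels can be imitated in a few lines of plain Python. This is a toy model and not structlog's implementation: the class below only borrows the name `BoundLogger`, the processor `add_level` is hypothetical, and real structlog processors receive additional arguments beyond the event dictionary.

```python
class BoundLogger:
    """Toy stand-in for the diagram's BoundLogger; not structlog's real class."""

    def __init__(self, sink, processors, context=None):
        self._sink = sink  # plays the role of the original logger
        self._processors = processors
        self._context = dict(context or {})

    def bind(self, **values):
        """Return a new logger whose context also carries the given values."""
        return BoundLogger(self._sink, self._processors, {**self._context, **values})

    def info(self, event, **values):
        """Merge context and event, run the processor chain, hand off the result."""
        event_dict = {**self._context, **values, "event": event}
        for processor in self._processors:
            event_dict = processor(event_dict)  # return value feeds the next one
        self._sink(event_dict)


def add_level(event_dict):
    """Hypothetical processor: annotate every event with its level."""
    return {**event_dict, "level": "info"}


captured = []
root = BoundLogger(captured.append, [add_level])
log = root.bind(user="guido")
log.info("user.login", ip="127.0.0.1")
```

Because `bind` returns a new logger instead of mutating the old one, context added for one request cannot leak into unrelated log calls made through `root`.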
  16. Capture

    • into files
    • to syslog / a queue
    • pipe into a logging agent
  18. [Pipeline diagram]

        log = log.bind(user="guido")
        log.info("user.login")

    → structlog → {"event": "user.login", "user": "guido"}
    → stdout logging → runit’s svlogd (adds TAI64 timestamp) → /var/log/app/current
    → logstash-forwarder → logstash → Elasticsearch
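The "stdout logging" step of this pipeline can be approximated with the standard library alone: one JSON object per line on stdout, leaving timestamps and file handling to whatever supervises the process. This is a sketch under assumptions, not the setup from the talk. `JSONFormatter`, `make_logger`, the logger name `"app"`, and the `context` key passed through `extra=` are all hypothetical names chosen for the example.

```python
import io
import json
import logging
import sys


class JSONFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        payload = {"event": record.getMessage()}
        # "context" is a hypothetical attribute supplied through `extra=`.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


def make_logger(stream=None):
    """Build a logger that writes JSON lines to the stream (stdout by default)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JSONFormatter())
    logger = logging.getLogger("app")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


buffer = io.StringIO()  # stands in for stdout so the output can be inspected
make_logger(buffer).info("user.login", extra={"context": {"user": "guido"}})
line = buffer.getvalue().strip()
```

Writing to stdout keeps the application unaware of log files and rotation; in the pipeline on the slide that job belongs to svlogd.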