Wider den Blindflug: Logging und Metriken in verteilten Anwendungen (Against Flying Blind: Logging and Metrics in Distributed Applications)

Talk held at JAX 2015 with Alex Heusingfeld

A software system can only be operated successfully if it is monitored at a fine-grained level. With the rise of microservice architectures, traditional monitoring tools are no longer sufficient, because the success of this architectural model depends largely on the metrics that are collected. The more information you can capture at runtime, the more precisely you can determine the health of the overall system. This calls for new logging and metrics concepts that allow runtime information to be aggregated, correlated, and visualized in one central place.

In this session we show which metrics should be collected and which information should be logged, how to aggregate and evaluate them centrally, and which free tools help to ensure smooth operations.

Tammo van Lessen

April 23, 2015

Transcript

  1. Wider den Blindflug: Logging und Metriken in verteilten Anwendungen. Alexander Heusingfeld & Tammo van Lessen
  4. ...so you start to disassemble your monoliths...

  5. ...and then you think about the infrastructure

  6. Architectural Decisions: Domain architecture; Macro architecture; Micro architecture
  7. Scenario: Big Shop

  8. That wouldn't have happened with proper logging! ... Would it?

  9. What makes good logging? What identifies a good log message? Which log level should I use when? Should I log into files? What format?
  10. Some recommendations: Log messages should have a uniform style. Log violations of assumptions. Use markers to make log streams filterable. Prefer machine-readable log formats over human-readable. Identify correlation tokens and attach them to the log event. Collect and store logs in a central repository.
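The recommendations on machine-readable formats, markers, and correlation tokens can be illustrated with a minimal sketch that uses only the Java standard library. The class `JsonLogLine`, the field names, and the sample values (`PAYMENT`, `requestId`) are hypothetical and chosen for illustration; a real setup would more likely rely on the JSON layout of a logging framework than on hand-written rendering.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: renders one log event as a single JSON line. */
final class JsonLogLine {

    /** Escapes characters that must not appear raw inside a JSON string. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    private static StringBuilder appendField(StringBuilder sb, String key, String value) {
        return sb.append('"').append(escape(key)).append("\":\"")
                 .append(escape(value)).append('"');
    }

    /**
     * Builds the JSON line. The marker makes the stream filterable, and the
     * context map carries the correlation tokens (assumed to be plain strings).
     */
    static String render(Instant timestamp, String level, String marker,
                         String message, Map<String, String> context) {
        StringBuilder sb = new StringBuilder("{");
        appendField(sb, "@timestamp", timestamp.toString()).append(',');
        appendField(sb, "level", level).append(',');
        appendField(sb, "marker", marker).append(',');
        appendField(sb, "message", message).append(',');
        sb.append("\"properties\":{");
        boolean first = true;
        for (Map.Entry<String, String> e : context.entrySet()) {
            if (!first) sb.append(',');
            appendField(sb, e.getKey(), e.getValue());
            first = false;
        }
        return sb.append("}}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> context = new LinkedHashMap<>();
        context.put("requestId", "r-42"); // correlation token, made-up value
        System.out.println(render(Instant.now(), "WARN", "PAYMENT",
                "Assumption violated: cart total was negative", context));
    }
}
```

Because every event is one self-describing line, a central log repository can index the fields directly instead of parsing free text with regular expressions.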
  11. Default Levels: Local files? -> WARN only. Central logfile repository? -> INFO. Magic bugs + advanced setup? -> DEBUG, or even TRACE.
  12. Async Appenders (LMAX, MemoryMappedFileAppender); Routing; Properties; Reconfiguration (Auto load, JMX,...); Audit logs; Markers / Log levels; ...
  13. Thread Context
    ThreadContext.put("loginId", login);
    logger.error("Something bad happened!");
    ThreadContext.clear();
    + Layout: %-5p: [%X{loginId}] %m%n
    Log: ERROR: [John Doe] Something bad happened!
  14. Thread Context (2)
    ThreadContext.put("loginId", login);
    logger.error("Something bad happened!");
    ThreadContext.clear();
    + JSON Layout
    Log:
    {
      "@version" => "1",
      "@timestamp" => "2014-04-29T14:21:14.988-07:00",
      "logger" => "com.example.LogStashExampleTest",
      "level" => "ERROR",
      "thread" => "Test worker",
      "message" => "Something bad happened!",
      "Properties" => { "loginId" => "John Doe" }
    }
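To show the idea behind the thread context without a logging dependency, here is a minimal sketch built on a plain `ThreadLocal`. It is a stand-in for illustration, not Log4j2's implementation; the class `ThreadContextSketch` and its methods are hypothetical, and the format string only imitates the `%-5p: [%X{loginId}] %m` part of the layout from the slides.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a per-thread diagnostic context. */
final class ThreadContextSketch {

    // Each thread sees its own map, so concurrent requests do not mix tokens.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    static void clear() {
        CONTEXT.get().clear();
    }

    /** Imitates the pattern "%-5p: [%X{loginId}] %m" with String.format. */
    static String format(String level, String message) {
        String loginId = CONTEXT.get().getOrDefault("loginId", "");
        return String.format("%-5s: [%s] %s", level, loginId, message);
    }

    /** Attaches the token for the duration of one unit of work. */
    static String handleRequest(String login) {
        put("loginId", login);
        try {
            return format("ERROR", "Something bad happened!");
        } finally {
            // Clear in finally: pooled threads are reused, and a stale
            // loginId would otherwise leak into the next request's logs.
            clear();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("John Doe"));
        // prints: ERROR: [John Doe] Something bad happened!
    }
}
```

The `try`/`finally` around the work is the main design point: the slide snippet clears the context on the happy path only, while an exception between `put` and `clear` would leave the token attached to the thread.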
  15. Log4j2 demo

  16. Requirements in a distributed environment: Aggregate logs in different formats from different systems. Search & correlate. Visualize. Alert on complex correlations.
  17. Tools of Trade

  18. Logstash Architecture

  19. Logstash – Hands on!

  20. A Logstash Cluster From the Logstash docs

  21. … and there are others, too! Apache Flume (ASL 2.0); Fluentd (ASL 2.0); Graylog2 (GPL); Loggly (commercial); Splunk (commercial)
  22. Logging is cool. And I can use it to collect metrics as well, right?
  23. Yes, you can! But you shouldn't!

  24. Metrics: Business Metrics; Application Metrics; System Metrics

  25. Continuous Delivery & Metrics?

  26. Continuous Delivery & Metrics?

  27. Gauges: An instrument that measures a value.

  28. Counters: A counter is a simple incrementing and decrementing integer.

  29. Meters: A meter measures the rate at which a set of events occurs.
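The difference between a counter and a meter can be sketched with the Java standard library alone. The classes `SimpleCounter` and `SimpleMeter` below are hypothetical and do not reproduce the API of any particular metrics library; the meter reports only a mean rate since creation, whereas real implementations typically also offer moving averages. The injected nanosecond clock is an assumption made to keep the sketch testable.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Counter: a value that goes up and down, e.g. currently open sessions. */
final class SimpleCounter {
    private final AtomicLong value = new AtomicLong();

    void inc() { value.incrementAndGet(); }
    void dec() { value.decrementAndGet(); }
    long getCount() { return value.get(); }
}

/** Meter: counts events and relates the count to elapsed time. */
final class SimpleMeter {
    private final AtomicLong count = new AtomicLong();
    private final LongSupplier nanoClock; // injected so tests can fake time
    private final long startNanos;

    SimpleMeter(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.startNanos = nanoClock.getAsLong();
    }

    void mark() { count.incrementAndGet(); }
    long getCount() { return count.get(); }

    /** Mean events per second since the meter was created. */
    double meanRatePerSecond() {
        long elapsedNanos = nanoClock.getAsLong() - startNanos;
        if (elapsedNanos <= 0) return 0.0;
        return count.get() / (elapsedNanos / 1_000_000_000.0);
    }
}

final class CounterVsMeterDemo {
    public static void main(String[] args) throws InterruptedException {
        SimpleCounter openSessions = new SimpleCounter();
        openSessions.inc();
        openSessions.inc();
        openSessions.dec(); // one session closed again
        System.out.println("open sessions: " + openSessions.getCount());

        SimpleMeter requests = new SimpleMeter(System::nanoTime);
        for (int i = 0; i < 5; i++) {
            requests.mark();
            Thread.sleep(10);
        }
        System.out.println("requests/s: " + requests.meanRatePerSecond());
    }
}
```

A counter answers "how many right now?", a meter answers "how fast?". Picking the wrong one is easy to do, which is presumably why the best-practices slide contrasts the two.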
  30. Histograms: A histogram measures the distribution of values.
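Why a distribution matters more than an average can be shown with a small sketch. The class `SimpleHistogram` is hypothetical and keeps every recorded value in memory, which is an assumption that only holds for small samples; production metrics libraries usually work with sampling reservoirs instead. Percentiles use the nearest-rank method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical histogram that stores all values (fine for a sketch only). */
final class SimpleHistogram {
    private final List<Long> values = new ArrayList<>();

    synchronized void update(long value) {
        values.add(value);
    }

    synchronized double mean() {
        if (values.isEmpty()) throw new IllegalStateException("no values recorded");
        long sum = 0;
        for (long v : values) sum += v;
        return sum / (double) values.size();
    }

    /** Nearest-rank percentile for a quantile q in (0, 1]. */
    synchronized long percentile(double q) {
        if (values.isEmpty()) throw new IllegalStateException("no values recorded");
        if (q <= 0.0 || q > 1.0) throw new IllegalArgumentException("q must be in (0, 1]");
        List<Long> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(q * sorted.size()); // 1-based rank
        return sorted.get(rank - 1);
    }
}

final class HistogramDemo {
    public static void main(String[] args) {
        SimpleHistogram responseMillis = new SimpleHistogram();
        // Nine fast responses and one slow outlier (made-up numbers).
        for (int i = 0; i < 9; i++) responseMillis.update(10);
        responseMillis.update(1000);

        System.out.println("mean   = " + responseMillis.mean());          // 109.0
        System.out.println("median = " + responseMillis.percentile(0.5)); // 10
        System.out.println("max    = " + responseMillis.percentile(1.0)); // 1000
    }
}
```

With these sample values the mean suggests that a typical request takes about 109 ms, although nine out of ten finish in 10 ms. The median and the upper percentiles describe the actual experience much better.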

  31. Timers

  41. Dashboards

  42. Cubism.js (Credits: Michael Bostock)

  44. Comparisons
    var cube = context.cube("http://..."),
        primary = cube.metric("sum(request)"),
        // same metric, shifted back by one week (7 days in milliseconds)
        secondary = primary.shift(-7 * 24 * 60 * 60 * 1000);
  45. ... Dashing

  46. Best practices: Measure everything! Counters vs. meters. Metrics are cheap, but not free. Retention policies. Get rid of silos. Correlate your data... to make better decisions.
  47. Prevent the apocalypse! Logging shows events. Metrics show state. Don't fly blind!
  48. Thanks for your attention! Alexander Heusingfeld | @goldstift; Tammo van Lessen | @taval; https://www.innoq.com/
  49. Credits: Buuz and Woody; Monolith by Ron Cogswell; Dave - Wrapping up monolith tins; Pleuntje - connected; CPU by mbostock; Mess by Rev Stan; Pay Here by Marc Falardeau; Cockpit by Ronnie Rams; Stream by Phil Whitehouse; Magnifier by John Lodder (Flickr); Flying Saucer, Cup, and Teapot! by Mr Thinktank; Ice berg by Derek Keats; Gas Meters by mxmstryo (Flickr); Gauge Stock by Andrew Taylor (Flickr); Counter by Marcin Wichary (Flickr); Histogram of legos by color frequency by Jeff Boulter (Flickr); pomodoro timers by Paul Downey (Flickr); Zombie Apocalypse by pasukaru76