Wider den Blindflug -- Logging und Metriken in verteilten Anwendungen (Against Flying Blind: Logging and Metrics in Distributed Applications)

Talk held at JAX 2015 with Alex Heusingfeld

A software system can only be operated successfully if it is monitored at a fine-grained level. With the rise of microservice architectures, traditional monitoring tools are no longer sufficient, because the success of this architectural model depends largely on the metrics that are collected. The more information you can capture at runtime, the more precisely you can determine the health of the overall system. This requires new approaches to logging and metrics that allow runtime information to be aggregated, correlated, and visualized in a central place.

In this session we show which metrics should be collected and which information should be logged, how to aggregate and analyze them centrally, and which free tools help to ensure smooth operations.

Tammo van Lessen

April 23, 2015

Transcript

  1. What makes good logging?

    - What identifies a good log message?
    - Which log level should I use when?
    - Should I log into files? What format?
  2. Some recommendations

    - Log messages should have a uniform style.
    - Log violations of assumptions.
    - Use markers to make log streams filterable.
    - Prefer machine-readable log formats over human-readable ones.
    - Identify correlation tokens and attach them to the log event.
    - Collect and store logs in a central repository.
  3. Default Levels

    - Local files? -> WARN only
    - Central logfile repository? -> INFO
    - Magic bugs + advanced setup? -> DEBUG, or even TRACE
  4. Thread Context (2)

        ThreadContext.put("loginId", login);
        logger.error("Something bad happened!");
        ThreadContext.clear();

    + JSON Layout: Log:

        {
          "@version" => "1",
          "@timestamp" => "2014-04-29T14:21:14.988-07:00",
          "logger" => "com.example.LogStashExampleTest",
          "level" => "ERROR",
          "thread" => "Test worker",
          "message" => "Something bad happened!",
          "Properties" => {
            "loginId" => "John Doe"
          }
        }
  5. Requirements in a distributed environment

    - Aggregate logs in different formats from different systems.
    - Search & correlate
    - Visualize
    - Alert on complex correlations.
  6. … and there are others, too!

    - Apache Flume (ASL 2.0)
    - FluentD (ASL 2.0)
    - Graylog 2 (GPL)
    - Loggly (commercial)
    - Splunk (commercial)
  7. Best practices

    - Measure everything!
    - Counters vs. meters
    - Metrics are cheap, but not free.
    - Retention policies
    - Get rid of silos
    - Correlate your data
    - ...to make better decisions
  8. Credits

    - Buuz and Woody
    - Monolith by Ron Cogswell
    - Dave - Wrapping up monolith tins
    - Pleuntje - connected
    - CPU by mbostock
    - Mess by Rev Stan
    - Pay Here by Marc Falardeau
    - Cockpit by Ronnie Rams
    - Stream by Phil Whitehouse
    - Magnifier by John Lodder (Flickr)
    - Flying Saucer, Cup, and Teapot! by Mr Thinktank
    - Ice berg by Derek Keats
    - Gas Meters by mxmstryo (Flickr)
    - Gauge Stock by Andrew Taylor (Flickr)
    - Counter by Marcin Wichary (Flickr)
    - Histogram of legos by color frequency by Jeff Boulter (Flickr)
    - pomodoro timers by Paul Downey (Flickr)
    - Zombie Apocalypse by pasukaru76
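
A note on slide 7: the transcript only names the distinction between counters and meters, so the following is a minimal sketch of what is usually meant by the two terms in metrics libraries. A counter tracks a value that goes up and down, while a meter counts events and relates them to elapsed time to yield a rate. The class and method names (`CounterVsMeter`, `Counter`, `Meter`, `meanRatePerSecond`) and the injectable clock are assumptions made for illustration. They are not taken from the talk or from any particular library, and real meters typically also report moving averages.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

public class CounterVsMeter {

    /** A counter only tracks a current value, e.g. the number of open sessions. */
    static final class Counter {
        private final LongAdder value = new LongAdder();

        void inc() { value.increment(); }
        void dec() { value.decrement(); }
        long count() { return value.sum(); }
    }

    /** A meter counts events and relates them to elapsed time, yielding a rate. */
    static final class Meter {
        private final LongAdder events = new LongAdder();
        private final LongSupplier nanoClock; // hypothetical hook: injectable time source
        private final long startNanos;

        Meter(LongSupplier nanoClock) {
            this.nanoClock = nanoClock;
            this.startNanos = nanoClock.getAsLong();
        }

        void mark() { events.increment(); }
        long count() { return events.sum(); }

        /** Mean rate in events per second since the meter was created. */
        double meanRatePerSecond() {
            long elapsedNanos = nanoClock.getAsLong() - startNanos;
            return elapsedNanos <= 0 ? 0.0 : events.sum() * 1_000_000_000.0 / elapsedNanos;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Counter: two sessions opened, one closed again.
        Counter openSessions = new Counter();
        openSessions.inc();
        openSessions.inc();
        openSessions.dec();

        // Meter: mark each handled request and read the rate afterwards.
        Meter requests = new Meter(System::nanoTime);
        for (int i = 0; i < 20; i++) {
            requests.mark();
            Thread.sleep(2);
        }

        System.out.println("open sessions: " + openSessions.count());
        System.out.printf("requests: %d, mean rate: %.1f/s%n",
                requests.count(), requests.meanRatePerSecond());
    }
}
```

Passing the clock in as a `LongSupplier` keeps the rate calculation deterministic under test. In production code `System::nanoTime` would be the natural choice.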