Slide 1

Slide 1 text

Distributed Metrics and Log Aggregation
Alexander Heusingfeld & Tammo van Lessen

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

...so you start to disassemble your monoliths...

Slide 5

Slide 5 text

...and then you think about the infrastructure

Slide 6

Slide 6 text

Architectural Decisions
- Domain architecture
- Macro architecture
- Micro architecture

Slide 7

Slide 7 text

Scenario: Big Shop

Slide 8

Slide 8 text

That wouldn't have happened with proper logging!

Slide 9

Slide 9 text

What makes good logging?
- What identifies a good log message?
- Which log level should I use when?
- Should I log into files? What format?

Slide 10

Slide 10 text

Some recommendations
- Log messages should have a uniform style.
- Log violations of assumptions.
- Use markers to make log streams filterable.
- Prefer machine-readable log formats over human-readable ones.
- Identify correlation tokens and attach them to the log event.
- Collect and store logs in a central repository.
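The recommendations on machine-readable formats and correlation tokens can be sketched in plain Java. This is only an illustration under stated assumptions: the class name JsonLogLine and the field names "level", "message" and "correlationId" are hypothetical and do not come from the slides or from any logging library. A real setup would normally rely on the logging framework's JSON layout instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of any logging library: renders one log
// event as a single JSON line so that a log shipper can parse it directly.
final class JsonLogLine {

    // Escapes the characters that would break a JSON string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // The field names used here are assumptions made for this sketch.
    static String format(String level, String message, String correlationId) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("level", level);
        fields.put("message", message);
        fields.put("correlationId", correlationId);

        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            json.add("\"" + escape(e.getKey()) + "\":\"" + escape(e.getValue()) + "\"");
        }
        return json.toString();
    }

    public static void main(String[] args) {
        // The same correlation token would be attached to every event of one request.
        System.out.println(format("ERROR", "Payment failed", "req-4711"));
    }
}
```

Because every event carries the same token for a given request, a central log store can retrieve all related events across services with a single field query.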

Slide 11

Slide 11 text

Default Levels
- Files? Warn only.
- Logstash & Co? Info.
- Magic bugs + advanced setup? Debug, or even trace.

Slide 12

Slide 12 text

- Async Appenders (LMAX, MemoryMappedFileAppender)
- Routing
- Properties
- Reconfiguration (Auto load, JMX,...)
- Audit logs
- Markers / Log levels
- ...

Slide 13

Slide 13 text

Thread Context
    ThreadContext.put("loginId", login);
    logger.error("Something bad happened!");
    ThreadContext.clear();
+ Layout:
    %-5p: [%X{loginId}] %m%n
Log:
    ERROR: [John Doe] Something bad happened!

Slide 14

Slide 14 text

Thread Context (2)
    ThreadContext.put("loginId", login);
    logger.error("Something bad happened!");
    ThreadContext.clear();
+ JSON Layout:
Log:
    {
      "@version" => "1",
      "@timestamp" => "2014-04-29T14:21:14.988-07:00",
      "logger" => "com.example.LogStashExampleTest",
      "level" => "ERROR",
      "thread" => "Test worker",
      "message" => "Something bad happened!",
      "Properties" => {
        "loginId" => "John Doe"
      }
    }
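The thread context idea from the two slides above can be illustrated without a logging framework. The sketch below is a stand-in built on a plain ThreadLocal; it is not Log4j's implementation, and the names DemoThreadContext, render and handleRequest are hypothetical. It also places the cleanup in a finally block, which is an assumption about good practice rather than something the slides state: on pooled threads, values that are not cleared would otherwise show up in log events of the next task.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a logging framework's per-thread context map. This is NOT
// Log4j's implementation, only an illustration of the idea.
final class DemoThreadContext {
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) { CONTEXT.get().put(key, value); }
    static String get(String key) { return CONTEXT.get().get(key); }
    static void clear() { CONTEXT.get().clear(); }

    // Mimics a "%-5p: [%X{loginId}] %m" style layout.
    static String render(String level, String message) {
        String loginId = get("loginId");
        return String.format("%-5s: [%s] %s", level, loginId == null ? "" : loginId, message);
    }

    // Handles one unit of work with the login attached and always cleans up.
    static String handleRequest(String login) {
        put("loginId", login);
        try {
            return render("ERROR", "Something bad happened!");
        } finally {
            clear(); // pooled threads are reused, so leave nothing behind
        }
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("John Doe"));
        System.out.println(render("INFO", "No user attached"));
    }
}
```

The caller never passes the login to the logging call itself; the layout picks it up from the context, which is what makes the token cheap to attach everywhere.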

Slide 15

Slide 15 text

Requirements in a distributed environment
- Aggregate logs in different formats from different systems.
- Search & Correlate
- Visualize
- Alert on complex correlations.

Slide 16

Slide 16 text

Tools of the Trade

Slide 17

Slide 17 text

Logstash Architecture

Slide 18

Slide 18 text

Logstash – Hands on!

Slide 19

Slide 19 text

A Logstash Cluster (from the Logstash docs)

Slide 20

Slide 20 text

… and there are others, too!
- Apache Flume (ASL 2.0)
- FluentD (ASL 2.0)
- Graylog 2 (GPL)
- Loggly (commercial)
- Splunk (commercial)

Slide 21

Slide 21 text

Logging is cool. And I can use it to collect metrics as well, right?

Slide 22

Slide 22 text

Yes, you can! But you shouldn't!

Slide 23

Slide 23 text

Metrics
- Business Metrics
- Application Metrics
- System Metrics

Slide 24

Slide 24 text

Continuous Delivery & Metrics?

Slide 25

Slide 25 text

Continuous Delivery & Metrics?

Slide 26

Slide 26 text

Gauges
An instrument that measures a value.
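A gauge can be sketched as a function that is read on demand. The example below is a minimal illustration in plain Java, not the API of a particular metrics library; the names GaugeSketch, sizeGauge and pendingJobs are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Supplier;

// Hypothetical sketch: a gauge reports the current value of something else.
final class GaugeSketch {

    // The gauge stores nothing itself; it reads the queue size when asked.
    static Supplier<Integer> sizeGauge(Queue<?> queue) {
        return queue::size;
    }

    public static void main(String[] args) {
        Queue<String> pendingJobs = new ArrayDeque<>();
        Supplier<Integer> gauge = sizeGauge(pendingJobs);

        pendingJobs.add("job-1");
        pendingJobs.add("job-2");

        // A reporter would poll the gauge periodically; here it is read once.
        System.out.println("pending jobs: " + gauge.get());
    }
}
```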

Slide 27

Slide 27 text

Counters
A counter is a simple incrementing and decrementing integer.
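As a minimal sketch of that definition, a counter can be built on java.util.concurrent.atomic.LongAdder so that many threads can update it cheaply. The class name CounterSketch and the "active sessions" scenario are assumptions made for illustration.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of a counter: a value that only moves up and down by steps.
final class CounterSketch {
    private final LongAdder value = new LongAdder();

    void inc() { value.increment(); }
    void dec() { value.decrement(); }
    long count() { return value.sum(); }

    public static void main(String[] args) {
        CounterSketch activeSessions = new CounterSketch();
        activeSessions.inc(); // a user logs in
        activeSessions.inc(); // another user logs in
        activeSessions.dec(); // one user logs out
        System.out.println("active sessions: " + activeSessions.count());
    }
}
```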

Slide 28

Slide 28 text

Meters
A meter measures the rate at which a set of events occur.
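The difference to a counter is that a meter relates the number of events to elapsed time. The sketch below computes only a mean rate since creation; metrics libraries commonly report moving averages as well, which are left out here. The class name MeterSketch and the injectable nanosecond clock are assumptions chosen so the behaviour can be checked deterministically.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

// Hypothetical sketch of a meter: events counted and related to elapsed time.
final class MeterSketch {
    private final LongAdder count = new LongAdder();
    private final LongSupplier nanoClock;
    private final long startNanos;

    // The clock is injected so that the rate can be verified with a fake clock.
    MeterSketch(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.startNanos = nanoClock.getAsLong();
    }

    void mark() { count.increment(); }
    long count() { return count.sum(); }

    // Mean rate in events per second since the meter was created.
    double meanRatePerSecond() {
        long elapsedNanos = nanoClock.getAsLong() - startNanos;
        if (elapsedNanos <= 0) {
            return 0.0;
        }
        return count.sum() * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) throws InterruptedException {
        MeterSketch requests = new MeterSketch(System::nanoTime);
        for (int i = 0; i < 5; i++) {
            requests.mark();
            Thread.sleep(10);
        }
        System.out.println("requests so far: " + requests.count());
        System.out.println("mean rate per second: " + requests.meanRatePerSecond());
    }
}
```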

Slide 29

Slide 29 text

Histograms
A Histogram measures the distribution of values.
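What "distribution" means in practice is usually a set of summary statistics such as mean and percentiles. The following sketch keeps every recorded value and uses the nearest-rank method for percentiles; this is a simplification chosen for clarity, since real metrics libraries typically sample values to bound memory. The class name HistogramSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a histogram: records values and summarizes them.
final class HistogramSketch {
    private final List<Long> values = new ArrayList<>();

    synchronized void update(long value) {
        values.add(value);
    }

    synchronized double mean() {
        if (values.isEmpty()) {
            throw new IllegalStateException("no values recorded");
        }
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return (double) sum / values.size();
    }

    // Nearest-rank percentile; p is expected to be in the range (0, 100].
    synchronized long percentile(double p) {
        if (values.isEmpty()) {
            throw new IllegalStateException("no values recorded");
        }
        List<Long> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.size() / 100.0);
        return sorted.get(Math.max(rank, 1) - 1);
    }

    public static void main(String[] args) {
        HistogramSketch responseSizes = new HistogramSketch();
        for (long size = 1; size <= 10; size++) {
            responseSizes.update(size);
        }
        System.out.println("mean: " + responseSizes.mean());
        System.out.println("median: " + responseSizes.percentile(50));
        System.out.println("99th percentile: " + responseSizes.percentile(99));
    }
}
```

Percentiles matter because a mean hides outliers: a handful of very slow or very large requests barely moves the mean but shows up clearly in the upper percentiles.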

Slide 30

Slide 30 text

Timers

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

No content

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

Dashboards

Slide 41

Slide 41 text

Graphite

Slide 42

Slide 42 text

Cubism.js (Credits: Michael Bostock)

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

Comparisons
    var cube = context.cube("http://..."),
        primary = cube.metric("sum(request)"),
        // same metric, shifted back by one week (offset in milliseconds)
        secondary = primary.shift(-7 * 24 * 60 * 60 * 1000);

Slide 45

Slide 45 text

... Dashing

Slide 46

Slide 46 text

Best practices
- Measure everything!
- Counters vs. Meters
- Metrics are cheap, but not free.
- Retention Policies
- Get rid of silos
- Correlate your data
- ...to make better decisions

Slide 47

Slide 47 text

Prevent the apocalypse! Logging shows events. Metrics show state. Don't fly blind!

Slide 48

Slide 48 text

Thanks for your attention!
Alexander Heusingfeld | @goldstift
Tammo van Lessen | @taval
https://www.innoq.com/

Slide 49

Slide 49 text

Credits
- Buuz and Woody
- Monolith by Ron Cogswell
- Dave - Wrapping up monolith tins
- Pleuntje - connected
- CPU by mbostock
- Mess by Rev Stan
- Pay Here by Marc Falardeau
- Cockpit by Ronnie Rams
- Stream by Phil Whitehouse
- Magnifier by John Lodder (Flickr)
- Flying Saucer, Cup, and Teapot! by Mr Thinktank
- Ice berg by Derek Keats
- Gas Meters by mxmstryo (Flickr)
- Gauge Stock by Andrew Taylor (Flickr)
- Counter by Marcin Wichary (Flickr)
- Histogram of legos by color frequency by Jeff Boulter (Flickr)
- pomodoro timers by Paul Downey (Flickr)
- Zombie Apocalypse by pasukaru76