Slide 1

Slide 1 text

Application Tracing
Velocity San Jose 2017

Slide 2

Slide 2 text

Your instructor
• Bryan Liles
• [email protected]
• @bryanl

Slide 3

Slide 3 text

WIFI: OReilly Conf

Slide 4

Slide 4 text

Agenda
• Workstation Setup
• Goals
• Tracing Introduction
• Exercise 1
• Break
• Exercise 2
• Exercise 3
• Beyond the basics
• Discussion
• Closing

Slide 5

Slide 5 text

Workshop Setup
• https://github.com/bryanl/velocity-sj-2017
• http://bit.ly/vsj2017-apptracing

Slide 6

Slide 6 text

• Jaeger
• Language Specific Exercises
• Jaeger Demo App
• Postgres

Slide 7

Slide 7 text

Today’s applications are complex

Slide 8

Slide 8 text

Shop

Slide 9

Slide 9 text

Shop Inventory Management Shipping

Slide 10

Slide 10 text

Shop Inventory Management Shipping Redis Postgres

Slide 11

Slide 11 text

How do we know what’s going on?

Slide 12

Slide 12 text

Shop

Slide 13

Slide 13 text

Shop
Metrics
For every resource, check utilization, saturation, and errors.
• Average busy time
• Amount of capacity available
• Error count
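The three checks on this slide can be sketched as a small aggregation over periodic samples. This is a minimal illustration, not code from the workshop: the `ResourceSample` fields and the `summarize` helper are hypothetical names, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ResourceSample:
    """One hypothetical measurement window for a single resource."""
    busy_seconds: float      # time the resource spent doing work
    interval_seconds: float  # length of the measurement window
    capacity: int            # total slots (connections, workers, ...)
    in_use: int              # slots occupied at the end of the window
    errors: int              # errors observed during the window


def summarize(samples: List[ResourceSample]) -> Dict[str, float]:
    """Aggregate samples into the three numbers named on the slide."""
    total_busy = sum(s.busy_seconds for s in samples)
    total_time = sum(s.interval_seconds for s in samples)
    latest = samples[-1]
    return {
        # Average busy time, as a fraction of the observed period.
        "utilization": total_busy / total_time,
        # Amount of capacity still available in the latest window.
        "capacity_available": latest.capacity - latest.in_use,
        # Error count across all windows.
        "error_count": sum(s.errors for s in samples),
    }


samples = [
    ResourceSample(busy_seconds=30, interval_seconds=60, capacity=10, in_use=7, errors=1),
    ResourceSample(busy_seconds=45, interval_seconds=60, capacity=10, in_use=9, errors=2),
]
summary = summarize(samples)
```

Numbers like these are aggregates: they describe the resource as a whole and say nothing about any single request.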

Slide 14

Slide 14 text

Shop
Logging
Capture events to help identify incidents, along with application-specific data.
• Auth success/failure
• Validation failures
• Many more

Slide 15

Slide 15 text

Shop
Tracing
Capture timing metrics from one or multiple resources participating in a transaction.
• Find latency issues and errors across resources

Slide 16

Slide 16 text

Shop Tracing

Slide 17

Slide 17 text

Diagram of three overlapping areas:
• Tracing: request scoped
• Metrics: aggregates
• Logging: events
Overlaps: request-scoped events, aggregate events, request-scoped metrics

Slide 18

Slide 18 text

Diagram of three overlapping areas, with an example tool for each:
• Tracing (request scoped): OpenTracing
• Metrics (aggregates): Prometheus
• Logging (events): ELK

Slide 19

Slide 19 text

Tracing

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

Terminology

Slide 22

Slide 22 text

Span: a period of time
A span contains the following:
* Operation name
* Start/Finish timestamps
* Tags
* Logs
* References to other spans
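The fields listed on this slide map naturally onto a small record type. The sketch below is a stdlib-only illustration, not the OpenTracing or Jaeger client API; the `Span` and `Reference` classes, their method names, and the sample operation names are assumptions made for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Reference:
    """Link from one span to another (hypothetical type for this sketch)."""
    kind: str     # relationship, e.g. "child_of" or "follows_from"
    span_id: int  # id of the span being referenced


@dataclass
class Span:
    operation_name: str
    span_id: int
    start_time: float = field(default_factory=time.time)
    finish_time: Optional[float] = None
    tags: Dict[str, Any] = field(default_factory=dict)
    # Each log entry is a (timestamp, fields) pair.
    logs: List[Tuple[float, Dict[str, Any]]] = field(default_factory=list)
    references: List[Reference] = field(default_factory=list)

    def set_tag(self, key: str, value: Any) -> "Span":
        self.tags[key] = value
        return self

    def log(self, **fields: Any) -> "Span":
        self.logs.append((time.time(), fields))
        return self

    def finish(self) -> None:
        self.finish_time = time.time()

    @property
    def duration(self) -> Optional[float]:
        """Seconds between start and finish, or None while still open."""
        if self.finish_time is None:
            return None
        return self.finish_time - self.start_time


parent = Span("checkout", span_id=1)
child = Span(
    "reserve-inventory",
    span_id=2,
    references=[Reference("child_of", parent.span_id)],
)
child.set_tag("component", "inventory").log(event="reserved", items=3)
child.finish()
parent.finish()
```

Tags describe the span as a whole, while each log entry carries its own timestamp inside the span's lifetime.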

Slide 23

Slide 23 text

Trace: a directed acyclic graph (DAG) of spans
[Diagram: Span A through Span H drawn as a graph]

Slide 24

Slide 24 text

Trace: a DAG of spans
[Diagram: spans laid out along a time axis (Time ➡): Span A, Span B, Span D, Span C, Span E, Span F, Span G, Span H]
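The time-axis view of a trace can be approximated in text by walking the parent links and drawing each span as a bar. This is only a sketch: the parent relationships and millisecond timings below are invented, since the slide text does not record them, and `render` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (name, parent name, start ms, finish ms). All values are invented.
SpanRow = Tuple[str, Optional[str], int, int]

spans: List[SpanRow] = [
    ("Span A", None, 0, 100),
    ("Span B", "Span A", 5, 40),
    ("Span C", "Span A", 45, 95),
    ("Span D", "Span B", 10, 30),
]


def render(rows: List[SpanRow], width: int = 20) -> List[str]:
    """Return one text line per span: indented name plus a time bar."""
    children: Dict[Optional[str], List[str]] = defaultdict(list)
    timing: Dict[str, Tuple[int, int]] = {}
    for name, parent, start, finish in rows:
        timing[name] = (start, finish)
        children[parent].append(name)
    total = max(finish for _, _, _, finish in rows)
    lines: List[str] = []

    def walk(name: str, depth: int) -> None:
        start, finish = timing[name]
        lo = start * width // total
        hi = max(lo + 1, finish * width // total)  # at least one cell wide
        bar = " " * lo + "#" * (hi - lo)
        label = "  " * depth + name
        lines.append(f"{label:<12}|{bar:<{width}}|")
        # Visit children in start-time order, depth first.
        for child in sorted(children[name], key=lambda n: timing[n][0]):
            walk(child, depth + 1)

    for root in children[None]:
        walk(root, 0)
    return lines


lines = render(spans)
print("\n".join(lines))
```

Indentation shows the parent/child edges of the graph, and the bars show where each span sits on the shared time axis.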

Slide 25

Slide 25 text

OpenTracing

Slide 26

Slide 26 text

Google Dapper
"We built Dapper to provide Google’s developers with more information about the behavior of complex distributed systems."

Slide 27

Slide 27 text

Google Dapper Design Goals
• Low Overhead
• Application-level transparency
• Scalability

Slide 28

Slide 28 text

Zipkin
• Created by Twitter
• Zipkin and Dapper are based on the same principles

Slide 29

Slide 29 text

OpenTracing isn’t a framework

Slide 30

Slide 30 text

OpenTracing describes an abstraction for tracing distributed systems

Slide 31

Slide 31 text

Jaeger: OpenTracing Compatible Implementation

Slide 32

Slide 32 text

[Diagram: App → OpenTracing API → Jaeger Driver → Jaeger]
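One way to read this slide: application code talks only to an abstract tracing interface, and a vendor driver plugs in behind it. The sketch below illustrates that layering with a hypothetical `Tracer` protocol; it is not the real OpenTracing API, and `RecordingTracer` merely stands in for a driver such as Jaeger's.

```python
from typing import List, Protocol


class Tracer(Protocol):
    """Hypothetical stand-in for the vendor-neutral tracing interface."""

    def start_span(self, operation_name: str) -> None: ...


class NoopTracer:
    """Default implementation: tracing calls do nothing."""

    def start_span(self, operation_name: str) -> None:
        pass


class RecordingTracer:
    """Stand-in for a vendor driver; it just remembers operation names."""

    def __init__(self) -> None:
        self.operations: List[str] = []

    def start_span(self, operation_name: str) -> None:
        self.operations.append(operation_name)


def handle_checkout(tracer: Tracer) -> str:
    # Application code depends only on the interface, not on a vendor.
    tracer.start_span("checkout")
    return "ok"


recording = RecordingTracer()
result_traced = handle_checkout(recording)
result_noop = handle_checkout(NoopTracer())
```

Swapping the tracer changes where the data goes without touching `handle_checkout`, which is the point of instrumenting against an abstraction.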

Slide 33

Slide 33 text

Why OpenTracing?
Standardization

Slide 34

Slide 34 text

• Standardize span management
• Standardize inter-process propagation
• Standardize active span management
• Standardize in-band context encoding
• Standardize out-of-band trace data encoding
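Inter-process propagation and in-band context encoding come down to writing a span's identity into the outgoing request and reading it back on the other side. The sketch below shows that round trip over a plain dictionary of headers. The header name `x-sketch-trace` and its `trace:span:flag` layout are invented for illustration and do not match any real tracer's wire format.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical header name, used only in this sketch.
TRACE_HEADER = "x-sketch-trace"


@dataclass(frozen=True)
class SpanContext:
    """The part of a span that must cross process boundaries."""
    trace_id: int
    span_id: int
    sampled: bool


def inject(ctx: SpanContext, carrier: Dict[str, str]) -> None:
    """Encode the context in-band, into outgoing request headers."""
    carrier[TRACE_HEADER] = f"{ctx.trace_id:x}:{ctx.span_id:x}:{int(ctx.sampled)}"


def extract(carrier: Dict[str, str]) -> Optional[SpanContext]:
    """Decode the context from incoming headers; None if absent or malformed."""
    raw = carrier.get(TRACE_HEADER)
    if raw is None:
        return None
    try:
        trace_hex, span_hex, flag = raw.split(":")
        return SpanContext(int(trace_hex, 16), int(span_hex, 16), flag == "1")
    except ValueError:
        return None


# Client side: attach the context to the outgoing request.
outgoing_headers: Dict[str, str] = {}
client_ctx = SpanContext(trace_id=0xABC, span_id=0x1, sampled=True)
inject(client_ctx, outgoing_headers)

# Server side: recover the context and continue the same trace.
server_ctx = extract(outgoing_headers)
```

If every service agrees on the encoding, a trace can follow a request across processes written in different languages.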

Slide 35

Slide 35 text

Why Standardize?
• Tracing should be easy and unobtrusive
• Competing standards dilute progress
• The ecosystem is complex and diverse. It is easier to converge on a single standard than to maintain several.

Slide 36

Slide 36 text

OpenTracing Demo

Slide 37

Slide 37 text

pip install -U docker-compose

Slide 38

Slide 38 text

Workshop Setup

Slide 39

Slide 39 text

Activity: Tracing Concepts

Slide 40

Slide 40 text

Talking to Jaeger

Slide 41

Slide 41 text

Reporter Sampler Jaeger
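The two components named on this slide can be sketched under one reading of their roles: a sampler decides whether a trace is recorded at all, and a reporter batches finished spans and ships them to the backend. The classes below are hypothetical stdlib-only stand-ins, not the Jaeger client's own types, and the `send` callable replaces the real network transport.

```python
import random
from typing import Callable, List, Optional


class ProbabilisticSampler:
    """Keep roughly `rate` of all traces (hypothetical sketch)."""

    def __init__(self, rate: float, rng: Optional[random.Random] = None) -> None:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0.0 and 1.0")
        self.rate = rate
        self._rng = rng or random.Random()

    def is_sampled(self) -> bool:
        # random() returns a value in [0.0, 1.0), so rate=1.0 keeps everything
        # and rate=0.0 keeps nothing.
        return self._rng.random() < self.rate


class BufferingReporter:
    """Collect finished spans and hand them to `send` in batches."""

    def __init__(self, send: Callable[[List[str]], None], batch_size: int = 2) -> None:
        self._send = send
        self._batch_size = batch_size
        self._buffer: List[str] = []

    def report(self, span_name: str) -> None:
        self._buffer.append(span_name)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self._send(list(self._buffer))
            self._buffer.clear()


sent_batches: List[List[str]] = []
reporter = BufferingReporter(send=sent_batches.append, batch_size=2)
sampler = ProbabilisticSampler(rate=1.0)  # sample everything for the demo

for name in ["checkout", "reserve-inventory", "ship"]:
    if sampler.is_sampled():
        reporter.report(name)
reporter.flush()  # ship whatever is left before shutdown
```

Sampling keeps overhead low on busy services, and batching keeps reporting off the request's critical path.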

Slide 42

Slide 42 text

Activity: More Tracing Concepts

Slide 43

Slide 43 text

Discussion / Lessons Learned

Slide 44

Slide 44 text

What should I log or tag?
https://github.com/opentracing/specification/blob/master/semantic_conventions.md
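As a rough illustration of the linked conventions, the helpers below build a tag set for an outgoing HTTP call and a log entry for an error. The keys (`span.kind`, `http.method`, `http.url`, `http.status_code`, `error`, `event`, `error.kind`, `message`) come from the semantic conventions document. The helper names, the sample URL, and the choice to mark 5xx responses as errors are assumptions made for this sketch.

```python
from typing import Any, Dict


def http_client_tags(method: str, url: str, status_code: int) -> Dict[str, Any]:
    """Tags for a span that wraps an outgoing HTTP request."""
    tags: Dict[str, Any] = {
        "span.kind": "client",
        "http.method": method,
        "http.url": url,
        "http.status_code": status_code,
    }
    # Assumption for this sketch: treat server errors as span errors.
    if status_code >= 500:
        tags["error"] = True
    return tags


def error_log_fields(exc: BaseException) -> Dict[str, Any]:
    """Log fields describing an exception raised inside a span."""
    return {
        "event": "error",
        "error.kind": type(exc).__name__,
        "message": str(exc),
    }


ok_tags = http_client_tags("GET", "http://inventory.example/items", 200)
failed_tags = http_client_tags("GET", "http://inventory.example/items", 503)
log_fields = error_log_fields(ValueError("unknown SKU"))
```

Tags suit values that hold for the whole span, while logs suit events that happen at a specific moment within it.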

Slide 45

Slide 45 text

Instrumenting

Slide 46

Slide 46 text

Searching for tags or events

Slide 47

Slide 47 text

Baggage
Sometimes you want to make data available to child spans.
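Baggage can be pictured as key/value pairs that every descendant span inherits. The sketch below shows only that inheritance inside one process; the `Span` class and `start_child` helper are hypothetical, and a real tracer would also carry baggage across process boundaries along with the span context.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Span:
    """Hypothetical span that holds only a name and its baggage."""
    operation_name: str
    baggage: Dict[str, str] = field(default_factory=dict)

    def set_baggage_item(self, key: str, value: str) -> None:
        self.baggage[key] = value

    def get_baggage_item(self, key: str) -> Optional[str]:
        return self.baggage.get(key)


def start_child(parent: Span, operation_name: str) -> Span:
    # Copy the parent's baggage so the child sees it, while items the
    # child adds later do not leak back into the parent.
    return Span(operation_name, baggage=dict(parent.baggage))


root = Span("checkout")
root.set_baggage_item("customer-tier", "gold")

child = start_child(root, "reserve-inventory")
grandchild = start_child(child, "query-postgres")
grandchild.set_baggage_item("shard", "7")
```

Because baggage travels with every downstream call, it is worth keeping the items few and small.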

Slide 48

Slide 48 text

Closing