Distributed Tracing FAQ

Distributed Tracing: Frequently Asked Questions
@gianarb
© 2017 InfluxData. All rights reserved.

Gianluca Arbezzano
Site Reliability Engineer @InfluxData
• https://gianarb.it
• @gianarb
What I like:
• I make dirty hacks that look awesome
• I grow my vegetables
• Travel for fun and work

Why do I need distributed tracing?

It is a way to describe the complexity of a distributed system.

In practice, it is a different aggregation of the well-known logs and stats.

To tell the story of our distributed system.

What does a trace look like?

A span is the smallest unit in a trace.

It describes a single action executed by a program:
• A single HTTP request.
• A database query.
• A message execution in a queue system.
• A lookup from a key/value store.

A span is described via:
• span_id: the unique identifier of the span in a trace
• trace_id: identifies the trace the span belongs to
• parent_id: describes the hierarchy
• labels: a set of key/value pairs
• Span Context: a set of values that will be propagated in the trace
• Logs
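The fields above map naturally onto a small data structure. The following is a minimal sketch, not the data model of any particular tracer: the Go type and function names (`Span`, `NewRootSpan`, `NewChild`) and the ID lengths are assumptions made for illustration.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"time"
)

// LogEntry is a timestamped, structured event recorded on a span.
type LogEntry struct {
	Time   time.Time
	Fields map[string]string
}

// Span mirrors the fields listed on the slide. All Go names are hypothetical.
type Span struct {
	TraceID  string            // trace_id: the trace this span belongs to
	SpanID   string            // span_id: unique identifier inside the trace
	ParentID string            // parent_id: empty for the root span
	Name     string            // operation name, e.g. "user_exists"
	Labels   map[string]string // labels: key/value pairs
	Baggage  map[string]string // span context values propagated to children
	Logs     []LogEntry
	Start    time.Time
	Duration time.Duration
}

// newID returns n random bytes encoded as a hex string.
func newID(n int) string {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// NewRootSpan starts a new trace with a fresh trace ID.
func NewRootSpan(name string) *Span {
	return &Span{TraceID: newID(16), SpanID: newID(8), Name: name,
		Labels: map[string]string{}, Baggage: map[string]string{}, Start: time.Now()}
}

// NewChild creates a span in the same trace with s as its parent.
// The baggage is copied so that it travels down the hierarchy.
func (s *Span) NewChild(name string) *Span {
	c := &Span{TraceID: s.TraceID, SpanID: newID(8), ParentID: s.SpanID, Name: name,
		Labels: map[string]string{}, Baggage: map[string]string{}, Start: time.Now()}
	for k, v := range s.Baggage {
		c.Baggage[k] = v
	}
	return c
}

// Log appends a structured event to the span.
func (s *Span) Log(fields map[string]string) {
	s.Logs = append(s.Logs, LogEntry{Time: time.Now(), Fields: fields})
}

// Finish records how long the span took.
func (s *Span) Finish() { s.Duration = time.Since(s.Start) }

func main() {
	root := NewRootSpan("handle.create_user")
	root.Baggage["user"] = "sa_service"

	child := root.NewChild("user_exists")
	child.Log(map[string]string{"query": "select * from tb_user where id = 345"})
	child.Finish()
	root.Finish()

	fmt.Println("same trace:", child.TraceID == root.TraceID)
	fmt.Println("parent link:", child.ParentID == root.SpanID)
}
```

The trace_id is what groups spans together, while parent_id is what turns the group into a tree.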

[Figure: a single trace. Spans: post: /users, handle.create_user, user_exists, insert_user, send_email. Services: nginx, sA, mysql, mysql, worker.]
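A backend receives spans as a flat list and rebuilds the hierarchy from parent_id. The sketch below uses the span and service names from the figure; the IDs and the exact parent/child relationships are assumptions, since the slide image is not available, and `Render` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// Span holds only what is needed to rebuild the hierarchy.
type Span struct {
	ID, ParentID, Name, Service string
}

// Render returns an indented outline of the trace, with every span
// printed under its parent. Root spans have an empty ParentID.
func Render(spans []Span) string {
	// Group spans by parent so children can be looked up quickly.
	children := map[string][]Span{}
	for _, s := range spans {
		children[s.ParentID] = append(children[s.ParentID], s)
	}

	var b strings.Builder
	var walk func(parent string, depth int)
	walk = func(parent string, depth int) {
		for _, s := range children[parent] {
			fmt.Fprintf(&b, "%s%s (%s)\n", strings.Repeat("  ", depth), s.Name, s.Service)
			walk(s.ID, depth+1)
		}
	}
	walk("", 0)
	return b.String()
}

func main() {
	// Assumed hierarchy: the three leaf spans are children of handle.create_user.
	spans := []Span{
		{"1", "", "post: /users", "nginx"},
		{"2", "1", "handle.create_user", "sA"},
		{"3", "2", "user_exists", "mysql"},
		{"4", "2", "insert_user", "mysql"},
		{"5", "2", "send_email", "worker"},
	}
	fmt.Print(Render(spans))
}
```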

[Figure: the trace again, with the details of one span.]
Service Name: mysql
Trace ID: 34ytsy5hs45gs46hs5g
Span ID: se5hs5s5hs45gs45gs
Span Name: user_exists
Duration: 1.2s
Start: 56467657457234
Logs:
  query: "select * from tb_user where id = 345"
  user: sa_service

How do I follow a request?

The implementation changes based on what you are instrumenting:
• To instrument HTTP services, propagate the trace information via headers.
• The same applies to gRPC.
• For queue systems, you can pass it as part of the message payload.
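For the queue case, one possible approach is to wrap the job in an envelope that carries the trace identifiers next to the body. This is a sketch under assumptions: the `Message` envelope, its JSON field names, and the example job are hypothetical and not taken from any specific queue system.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TraceContext is the minimum a consumer needs to continue a trace.
type TraceContext struct {
	TraceID string `json:"trace_id"`
	SpanID  string `json:"span_id"`
}

// Message is a hypothetical queue envelope: the trace context travels
// alongside the application payload.
type Message struct {
	Trace TraceContext    `json:"trace"`
	Body  json.RawMessage `json:"body"`
}

// Encode is called by the producer before publishing.
func Encode(tc TraceContext, body interface{}) ([]byte, error) {
	raw, err := json.Marshal(body)
	if err != nil {
		return nil, err
	}
	return json.Marshal(Message{Trace: tc, Body: raw})
}

// Decode is called by the consumer. The consumer would start its own
// span with Trace.TraceID as trace and Trace.SpanID as parent.
func Decode(data []byte) (Message, error) {
	var m Message
	err := json.Unmarshal(data, &m)
	return m, err
}

func main() {
	tc := TraceContext{TraceID: "80f198ee56343ba864fe8b2a57d3eff7", SpanID: "e457b5a2e4d86bd1"}
	payload, err := Encode(tc, map[string]string{"job": "send_email"})
	if err != nil {
		panic(err)
	}

	msg, err := Decode(payload)
	if err != nil {
		panic(err)
	}
	fmt.Println(msg.Trace.TraceID, msg.Trace.SpanID, string(msg.Body))
}
```

Queue systems that support per-message headers or attributes can carry the same values there instead, which keeps the payload untouched.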

B3-Propagation
https://github.com/openzipkin/b3-propagation
X-B3-TraceId: 80f198ee56343ba864fe8b2a57d3eff7
X-B3-ParentSpanId: 05e3ac9a4f6e3b90
X-B3-SpanId: e457b5a2e4d86bd1
X-B3-Sampled: 1

Do I need a standard for tracing?

1. Applications can be written in different languages, but in the end you need to build one single trace. This means they need to agree on a common standard/protocol.
2. If you use a widely supported standard, you can avoid vendor lock-in.

[Figure: a parent span and its child spans, each with logs attached; the span context / baggage flows from parent to children.]
• Spans: basic unit of timing and causality. Can be tagged with key/value pairs.
• Logs: structured data recorded on a span.
• Span Context: serializable format for linking spans across network boundaries. Carries baggage, such as request and client IDs.
• Tracers: anything that plugs into the OpenTracing API to record information.
  • Zipkin, Jaeger, LightStep, others
  • Also metrics (Prometheus) and logging

1.5 years old!
Tracer implementations: Zipkin, Jaeger, LightStep, SkyWalking, others
All sorts of companies use OpenTracing.

Rapidly growing OSS and vendor adoption
[Figure: logos of integrations, including JDBI, Java Webservlet, Jaxr.]

OpenTracing: Configure the GlobalTracer
https://github.com/opentracing/opentracing-go

    import "github.com/opentracing/opentracing-go"
    import ".../some_tracing_impl"

    func main() {
        // Register the tracer once, at startup; instrumented code
        // then reaches it through the opentracing package.
        opentracing.SetGlobalTracer(
            // tracing impl specific:
            some_tracing_impl.New(...),
        )
        ...
    }

OpenTracing: Create a Span from the Context
https://github.com/opentracing/opentracing-go

    // log is "github.com/opentracing/opentracing-go/log"
    func xyz(ctx context.Context, ...) {
        ...
        // Starts a span as a child of the span stored in ctx, if any.
        span, ctx := opentracing.StartSpanFromContext(ctx, "operation_name")
        defer span.Finish()
        span.LogFields(
            log.String("event", "soft error"),
            log.String("type", "cache timeout"),
            log.Int("waited.millis", 1500))
        ...
    }

OpenTracing: Create a Child Span
https://github.com/opentracing/opentracing-go

    func xyz(parentSpan opentracing.Span, ...) {
        ...
        // ChildOf links the new span to its parent through the span context.
        sp := opentracing.StartSpan(
            "operation_name",
            opentracing.ChildOf(parentSpan.Context()))
        defer sp.Finish()
        ...
    }

What does a tracing infrastructure look like?

[Figure: a microservice process. Application logic, µ-service frameworks, Lambda functions, RPC & control-flow frameworks, and existing instrumentation all talk to the OpenTracing API; main() connects the API to a tracing infrastructure such as Instana or Jaeger.]

Can I have a tracing infrastructure on-prem?

There are different Open Source alternatives:
• Zipkin
  • Java
  • Sponsored by Twitter
  • Supported backends: ElasticSearch, MySQL, Cassandra
• Jaeger
  • Go
  • Sponsored by Uber and part of the CNCF
  • Supported backends: ElasticSearch, Cassandra

Are there tracing infrastructures as a service?

Can I store traces anywhere?

Short answer: YES. At your own risk…
• Really high cardinality
• High write throughput
Databases like InfluxDB, Cassandra, or MongoDB are probably a better option than MySQL or Postgres, but it always depends on traffic and the amount of data.
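To get a feeling for the write throughput, a back-of-the-envelope estimate multiplies the request rate by the number of spans per request and by the fraction of requests that are sampled. The sketch below is only that arithmetic; the function name and all the numbers are assumptions, not measurements.

```go
package main

import "fmt"

// SpansPerSecond estimates how many spans per second reach the trace store.
// sampleRate is the fraction of requests that are traced (0.0 to 1.0).
func SpansPerSecond(requestsPerSec, spansPerRequest, sampleRate float64) float64 {
	return requestsPerSec * spansPerRequest * sampleRate
}

func main() {
	// Assumed workload: 2,000 requests/s, 5 spans per request
	// (as in the create-user trace shown earlier).
	all := SpansPerSecond(2000, 5, 1.0)     // every request traced
	sampled := SpansPerSecond(2000, 5, 0.1) // one request in ten traced

	fmt.Printf("unsampled: %.0f spans/s\n", all)
	fmt.Printf("sampled:   %.0f spans/s\n", sampled)
}
```

Each span also carries its own trace_id and span_id, which is where the high cardinality comes from: nearly every write introduces values the store has never seen before.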
