Dapper - A Large-Scale Distributed Systems Tracing Infrastructure

Presentation of the Dapper paper.

André Freitas

January 15, 2019

Transcript

  1. Motivation Engineers are not experts in every system. We can detect the overall latency but don't know which systems are causing the slowness.
  2. Requirements
     • Low impact
       ◦ Teams turn the tracing system off if there is a performance impact.
     • Application transparency
       ◦ A tracing system that relies on application developers becomes fragile.
     • Tracing data quickly available
       ◦ We want to be able to react to problems as soon as possible.
  3. “Our instrumentation is restricted to a low enough level in the software stack that even large-scale distributed systems like Google web search could be traced without additional annotations” (Page 2)
  4. How to aggregate trace information?
     • Black box
       ◦ No information beyond the message records themselves.
       ◦ Uses statistical regression.
     • Annotation-based
       ◦ Applications or middleware tag every record.
       ◦ Uses a global identifier to link all records to the originating request.
  5. Dapper - Trace tree - Nodes are spans - Edges are causal relationships - Same trace id in all spans
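
A minimal sketch of what one node of such a trace tree could look like; the field names below are assumptions chosen to mirror the slide, not Dapper's actual data model:

```java
import java.util.ArrayList;
import java.util.List;

final class Span {
    final long traceId;        // shared by every span in the same trace
    final long spanId;         // unique within the trace
    final Long parentSpanId;   // null for the root span; encodes the causal edge
    final String name;         // human-readable span name, e.g. the RPC method
    final List<String> annotations = new ArrayList<>();

    Span(long traceId, long spanId, Long parentSpanId, String name) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
        this.name = name;
    }
}
```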
  6. RPC span annotations Every RPC span contains annotations from both the client and server processes. Clock skew in timestamps? The client always sends the request before the server receives it, and the server always sends its response before the client receives it, so there is a valid interval that bounds the server-side timestamps.
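
One way to use that bound (an illustration, not necessarily how Dapper adjusts timestamps) is to clamp a server-side timestamp into the interval implied by the client-side annotations of the same RPC:

```java
// Timestamps are assumed to be epoch microseconds taken from each process's own clock.
final class ClockSkewBound {
    /** Clamp a server-side timestamp into the interval implied by the client side. */
    static long clamp(long serverTs, long clientSendTs, long clientRecvTs) {
        // The server cannot have acted before the client sent the request,
        // nor after the client received the response.
        return Math.max(clientSendTs, Math.min(serverTs, clientRecvTs));
    }
}
```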
  7. Instrumentation points Near-zero intervention from application developers is possible with instrumentation in shared libraries:
     • Threading library
       ◦ The trace context container is injected into the thread context.
       ◦ This container is small and holds the trace and span ids.
     • Control flow library
       ◦ Allows constructing callbacks and scheduling them in a thread pool.
       ◦ Callbacks store the trace context of their creator.
     • RPC framework for Java and C++
       ◦ Span and trace ids are transmitted from client to server.
       ◦ This is an essential instrumentation point.
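
A minimal sketch, in Java, of the two library-level tricks listed above: a small trace context kept in thread-local storage, and callbacks that capture their creator's context before being handed to a thread pool. The class and method names are illustrative, not Dapper's:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class TraceContext {
    final long traceId;
    final long spanId;
    TraceContext(long traceId, long spanId) { this.traceId = traceId; this.spanId = spanId; }

    // Threading library: the small context container lives in thread-local storage.
    private static final ThreadLocal<TraceContext> CURRENT = new ThreadLocal<>();
    static void set(TraceContext ctx) { CURRENT.set(ctx); }
    static TraceContext get() { return CURRENT.get(); }

    // Control-flow library: wrap a callback so it runs with its creator's context.
    static Runnable wrap(Runnable task) {
        final TraceContext creator = get();
        return () -> {
            TraceContext previous = get();
            set(creator);
            try { task.run(); } finally { set(previous); }
        };
    }
}

class Demo {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        TraceContext.set(new TraceContext(42L, 1L));
        // The callback prints its creator's trace id even though it runs on a pool thread.
        pool.submit(TraceContext.wrap(() ->
            System.out.println("traceId=" + TraceContext.get().traceId)));
        pool.shutdown();
    }
}
```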
  8. Trace collection Three-stage process: 1. Span data is written to local log files; 2. Collectors pull the logs from all hosts through the Dapper daemons; 3. Traces are written to a cell in a Dapper Bigtable repository. A trace is a row and each column is a span. Tables are sparse, since the number of spans per trace is arbitrary. The median latency of this process is 15 seconds, but it can sometimes take hours.
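
A minimal sketch of the repository layout described above, using plain in-memory maps as a stand-in for Bigtable: the row key is the trace id and each span occupies its own sparse column:

```java
import java.util.HashMap;
import java.util.Map;

final class TraceRepository {
    // traceId -> (spanId -> serialized span bytes)
    private final Map<Long, Map<Long, byte[]>> rows = new HashMap<>();

    // Each arriving span lands in its own column of the trace's row.
    void writeSpan(long traceId, long spanId, byte[] serializedSpan) {
        rows.computeIfAbsent(traceId, id -> new HashMap<>()).put(spanId, serializedSpan);
    }

    // Reading one row yields the whole trace, span by span.
    Map<Long, byte[]> readTrace(long traceId) {
        return rows.getOrDefault(traceId, Map.of());
    }
}
```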
  9. Security and privacy
     • Dapper stores only the names of RPC methods, not their payloads.
     • Allows monitoring security policies, such as the isolation of systems.
  10. Deployment status at Google The most critical parts are the Threading, RPC and Control Flow libraries. The core instrumentation is ~1000 lines of C++ and ~800 lines of Java; key-value annotation support takes another ~500 lines. The Dapper daemon is part of the basic machine image, so it is present on every server at Google. Nearly every Google production process supports tracing. Google developed a simple library to control trace id propagation manually, as a workaround for non-standard control flows that Dapper doesn't support. Nearly 90% of Dapper traces contain annotations. Annotations are also used to relate traces to a particular feature.
  11. Quick quiz The sampling ratio that provides low overhead and enough information is: A. 2/10 B. 1/1000 C. 1/100
  12. Generation Overhead Creating and destroying spans and annotations, and logging them to disk, are the most important sources of overhead.
  13. Writes to Local Disk Logs These are the most expensive operation, but they are asynchronous with respect to the application.
  14. Collection Overhead The daemon uses less than 0.3% of one CPU in an unrealistically heavy load benchmark. The daemon process has the lowest scheduling priority in the kernel.
  15. Effect in production workloads Trace sampling is necessary in production: services with high traffic can generate a lot of trace data. A sampling ratio of 1/1024 provides enough data for high-volume services. A lower sampling frequency also allows data to persist longer on local disk, giving flexibility to the Dapper collection infrastructure. Benchmark: sampling impact on web search.
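
A minimal sketch of a uniform 1/1024 sampling decision; in a real tracer the decision would be made once at the root span and inherited by every downstream span through the trace context:

```java
import java.util.concurrent.ThreadLocalRandom;

final class UniformSampler {
    static final double RATE = 1.0 / 1024.0;

    // Decide, at trace creation time, whether this trace will be recorded at all.
    static boolean sampleNewTrace() {
        return ThreadLocalRandom.current().nextDouble() < RATE;
    }
}
```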
  16. Adaptive sampling A fixed sampling ratio doesn't fit low-traffic services, since it misses important traces. A rate expressed as traces per unit of time is better for these scenarios.
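
A minimal sketch of adaptive sampling, assuming the goal is a target number of sampled traces per second; the parameter names and the once-per-second adjustment are illustrative, not Dapper's actual algorithm:

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

final class AdaptiveSampler {
    private final double targetTracesPerSecond;
    private volatile double probability = 1.0;          // start by keeping everything
    private final AtomicLong requestsInWindow = new AtomicLong();

    AdaptiveSampler(double targetTracesPerSecond) {
        this.targetTracesPerSecond = targetTracesPerSecond;
    }

    boolean sampleNewTrace() {
        requestsInWindow.incrementAndGet();
        return ThreadLocalRandom.current().nextDouble() < probability;
    }

    // Called periodically (e.g. once per second from a timer) so that low-traffic
    // services keep emitting traces while high-traffic services stay cheap.
    void adjustEverySecond() {
        long seen = requestsInWindow.getAndSet(0);
        probability = seen == 0 ? 1.0 : Math.min(1.0, targetTracesPerSecond / seen);
    }
}
```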
  17. Sampling in Collection Infrastructure Traces are also sampled when collected, to save resources in the Dapper infrastructure. Google generated more than 1 TB of trace data per day.
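
The paper describes making this second, collection-time decision deterministically from the trace id (hashing it into a scalar and comparing it with a global collection coefficient), so every host keeps or drops the same traces. A minimal sketch of that idea, with an illustrative mixing function:

```java
final class CollectionSampler {
    // Keep the trace only if its id hashes below the global collection coefficient.
    static boolean keep(long traceId, double collectionCoefficient) {
        // Scramble the id and map it to [0, 1). The mixing constant is illustrative.
        long h = traceId * 0x9E3779B97F4A7C15L;
        double z = (h >>> 11) / (double) (1L << 53);
        return z < collectionCoefficient;
    }
}
```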
  18. Dapper Depot API (DAPI)
     • Access traces by trace id
     • Bulk access
       ◦ MapReduce
       ◦ User-defined function that accepts a trace
     • Indexed access
       ◦ Index storage cost is 74% of the trace data itself
       ◦ Access traces by service or host name
       ◦ The host name index was hardly ever used
       ◦ Dropped the host name index and composed a single index of {hostname, service, timestamp}
     • Analytics tools are developed on top
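
A rough sketch of what trace-id and bulk access might look like from a client's point of view; the types and method names here are assumptions for illustration, not the real DAPI surface:

```java
import java.util.List;
import java.util.function.Consumer;

// A trace reduced to the two fields this sketch needs.
record Trace(long traceId, List<String> spanNames) {}

final class DapiSketch {
    private final List<Trace> repository;   // in-memory stand-in for the Bigtable-backed depot

    DapiSketch(List<Trace> repository) { this.repository = repository; }

    // Trace-id access: fetch a single trace.
    Trace getTrace(long traceId) {
        return repository.stream().filter(t -> t.traceId() == traceId).findFirst().orElse(null);
    }

    // Bulk access: hand every trace to a user-defined function, conceptually the
    // map phase of a MapReduce over the depot.
    void forEachTrace(Consumer<Trace> userFunction) {
        repository.forEach(userFunction);
    }
}
```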
  19. Dapper User Interface 1. Service and Time Window 2. Performance Summary 3. Execution Pattern 4. Metric histogram 5. Trace
  20. Google Ads Review system Google developed an automatic keyword review system (e.g. for inappropriate language). Dapper helped from the first prototype onwards:
     • Track performance bottlenecks to optimize;
     • Find unnecessary requests made to database masters that could instead go to replicas;
     • Understand the cost of the queries made to dependencies;
     • Quality assurance of correct system behavior and performance;
     • The Ads Review team estimates that latency improved by 2x using data from Dapper.
  21. Exception monitoring Trace and span ids are included in exception reports. This helped in the forensic analysis of bugs.
  22. Long tail latency Due to the number of moving parts, debugging services like Universal Search is very challenging, even for experienced engineers. An engineer developed a tool to infer critical paths from Dapper traces. This made it possible to discover:
     • The impact of the network
       ◦ Momentary network degradation does not affect system throughput, but it does affect overall latency.
     • Unintended interactions between services
       ◦ Wrong queries resulted in requests to services that were not necessary.
     • Slow queries
       ◦ Built a list of slow queries associated with their trace ids.
  23. Inferring service dependencies Google developed a project to automatically infer service dependencies using the DAPI MapReduce interface. It helped Infrastructure Ops teams.
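
A minimal sketch of the core of such an inference: for each span in a trace, emit a (caller service -> callee service) edge and count it. The single-pass form below stands in for the map phase of the MapReduce mentioned above; the record fields are illustrative:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DependencyInference {
    // One span reduced to the fields needed here.
    record SpanRecord(long spanId, Long parentSpanId, String serviceName) {}

    static Map<String, Integer> inferEdges(List<SpanRecord> spansOfOneTrace) {
        Map<Long, String> serviceBySpan = new HashMap<>();
        for (SpanRecord s : spansOfOneTrace) serviceBySpan.put(s.spanId(), s.serviceName());

        Map<String, Integer> edgeCounts = new HashMap<>();
        for (SpanRecord s : spansOfOneTrace) {
            if (s.parentSpanId() == null) continue;              // root span has no caller
            String caller = serviceBySpan.get(s.parentSpanId());
            if (caller == null || caller.equals(s.serviceName())) continue;
            edgeCounts.merge(caller + " -> " + s.serviceName(), 1, Integer::sum);
        }
        return edgeCounts;
    }
}
```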
  24. Coalescing effects Requests that are batched together are not reported correctly (e.g. disk writes): Dapper attributes the work to only one of the batched requests.
  25. Batch workloads Dapper is focused on performance insights for user-generated requests, but it can also be useful for offline batch workloads.
  26. Finding Root Cause Sometimes a request takes a long time to complete because another request is queued ahead of it. This queueing is not visible in Dapper.
  27. Conclusions Low cost of implementation: Dapper allowed the majority of Google services to be traced without the need for application-level modifications. Shared libraries: the use of shared libraries was key to achieving application transparency. Performance: thanks to sampling, Dapper can trace requests without a noticeable performance impact.