Dapper - A Large-Scale Distributed Systems Tracing Infrastructure

Presentation of the Dapper paper.

André Freitas

January 15, 2019

Transcript

  1. Motivation
    Engineers are not experts in every system. We can detect an overall latency problem but don’t know which systems are causing the slowness.
  2. Requirements
    • Low impact
      ◦ Teams turn the tracing system off if there is any performance impact.
    • Application transparency
      ◦ A tracing system that relies on application developers becomes fragile.
    • Tracing data quickly available
      ◦ We want to be able to react to problems as soon as possible.
  3. “Our instrumentation is restricted to a low enough level in the software stack that even large-scale distributed systems like Google web search could be traced without additional annotations” (Page 2)
  4. How to aggregate trace information?
    • Black box
      ◦ No additional information beyond the message record.
      ◦ Uses statistical regression to infer the association.
    • Annotation-based
      ◦ Applications or middleware tag every record.
      ◦ Uses a global identifier to link all records to the originating request.
  5. Dapper - Trace tree
    - Nodes are spans
    - An edge is a causal relationship between a span and its parent
    - All spans share the same trace id
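The trace tree can be sketched as a small data structure. This is a minimal illustration, not Dapper's actual implementation: the `Span` fields, the `build_tree` and `render` helpers, and the span names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    name: str


def build_tree(spans):
    """Group spans by parent id; all spans must share one trace id."""
    trace_ids = {s.trace_id for s in spans}
    if len(trace_ids) != 1:
        raise ValueError("spans belong to different traces")
    children = defaultdict(list)
    for s in spans:
        children[s.parent_id].append(s)
    return children


def render(children, parent_id=None, depth=0):
    """Return the tree as indented lines, root first."""
    lines = []
    for s in children.get(parent_id, []):
        lines.append("  " * depth + s.name)
        lines.extend(render(children, s.span_id, depth + 1))
    return lines


# Hypothetical spans for one request (names are made up for illustration).
spans = [
    Span("t1", "1", None, "frontend.Request"),
    Span("t1", "2", "1", "backend.Search"),
    Span("t1", "3", "1", "backend.Ads"),
    Span("t1", "4", "2", "storage.Read"),
]
tree_lines = render(build_tree(spans))
print("\n".join(tree_lines))
```

Each edge is recovered from the parent id stored in the child span, so the tree can be rebuilt from spans collected in any order.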
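  6. RPC span annotations
    Every RPC span contains annotations from both the client and the server process. What about clock skew in timestamps? A client always sends a request before the server receives it, and the server always sends the response before the client receives it, so the client timestamps give a lower and an upper bound for the server-side timestamps.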
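The bound on server-side timestamps can be written down directly. The sketch below is only an illustration: the function names are hypothetical, and clamping skewed values into the interval is an assumption made here for display purposes, not something the slides describe.

```python
def is_consistent(client_send, client_recv, server_recv, server_send):
    """True when server timestamps fall inside the client's send/receive window."""
    return client_send <= server_recv <= server_send <= client_recv


def clamp_server_times(client_send, client_recv, server_recv, server_send):
    """Force skewed server timestamps into [client_send, client_recv].

    Clamping is an assumption of this sketch; the bounds themselves come
    from the ordering of RPC send and receive events.
    """
    clamped_recv = min(max(server_recv, client_send), client_recv)
    clamped_send = min(max(server_send, clamped_recv), client_recv)
    return clamped_recv, clamped_send


# The server clock runs behind: it claims to receive the request at t=90,
# although the client only sent it at t=100.
skewed_ok = is_consistent(100, 200, 90, 150)
clamped = clamp_server_times(100, 200, 90, 150)
print(skewed_ok, clamped)
```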
  7. Instrumentation points
    Near-zero intervention from application developers is possible with instrumentation in shared libraries:
    • Threading library
      ◦ The trace context container is injected into the thread context.
      ◦ This container is small and contains the trace and span ids.
    • Control flow library
      ◦ Allows developers to construct callbacks and schedule them in a thread pool.
      ◦ Callbacks store the trace context of their creator.
    • RPC framework for Java and C++
      ◦ Span and trace ids are transmitted from client to server.
      ◦ This is an essential instrumentation point.
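The threading and control flow points above can be sketched with Python's standard library. This is an analogy only: Dapper's libraries are C++ and Java, and `set_context`, `get_context` and `traced` are hypothetical helper names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Thread-local holder for the small (trace id, span id) container.
_context = threading.local()


def set_context(trace_id, span_id):
    _context.value = (trace_id, span_id)


def get_context():
    return getattr(_context, "value", None)


def traced(callback):
    """Capture the creator's trace context and restore it when the callback runs."""
    captured = get_context()

    def wrapper(*args, **kwargs):
        previous = get_context()
        _context.value = captured
        try:
            return callback(*args, **kwargs)
        finally:
            _context.value = previous

    return wrapper


set_context("trace-42", "span-1")
with ThreadPoolExecutor(max_workers=1) as pool:
    # The wrapped callback sees the context of the thread that created it.
    with_ctx = pool.submit(traced(get_context)).result()
    # An unwrapped callback runs in the worker thread without any context.
    without_ctx = pool.submit(get_context).result()
print(with_ctx, without_ctx)
```

Because the wrapping happens inside the shared library, application code that schedules callbacks does not need to know about tracing at all.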
  8. Trace collection
    Three-stage process:
    1. Span data is written to local log files;
    2. Collectors pull logs from all hosts through the Dapper daemons;
    3. Traces are written to a cell in Dapper Bigtable repositories.
    A trace is a row and each column is a span. Tables are sparse, since the number of spans per trace is arbitrary. The median latency of this process is 15 seconds, but sometimes it can take hours.
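The row-per-trace, column-per-span layout can be mimicked with nested dictionaries. This is a toy in-memory stand-in, not the Bigtable API; `TraceTable`, `collect` and the sample records are hypothetical.

```python
from collections import defaultdict


class TraceTable:
    """Sparse table: one row per trace id, one column per span id."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def write_span(self, trace_id, span_id, span_data):
        self._rows[trace_id][span_id] = span_data

    def read_trace(self, trace_id):
        return dict(self._rows.get(trace_id, {}))


def collect(host_logs, table):
    """Stages 2 and 3: pull span records from every host log into the table."""
    for records in host_logs.values():
        for trace_id, span_id, data in records:
            table.write_span(trace_id, span_id, data)


# Stage 1 is represented by these per-host logs (made-up records).
host_logs = {
    "host-a": [("t1", "1", {"name": "frontend"})],
    "host-b": [("t1", "2", {"name": "backend"}), ("t2", "1", {"name": "other"})],
}
table = TraceTable()
collect(host_logs, table)
print(table.read_trace("t1"))
```

Spans of the same trace written on different hosts end up in the same row, and rows hold only the columns that actually exist, which is what makes the table sparse.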
  9. Security and privacy
    • Dapper stores only the names of RPC methods, without the payload.
    • It allows monitoring of security policies, such as isolation of systems.
  10. Deployment status at Google
    The most critical parts are the threading, RPC and control flow libraries. Core instrumentation takes ~1000 lines for C++ and ~800 lines for Java. Key-value annotations take another ~500 lines. The Dapper daemon is part of the basic machine image, so it is present on every server at Google. Nearly every Google production process supports tracing. Google developed a simple library to control trace id propagation manually, as a workaround for non-standard control flows that Dapper doesn’t support. Nearly 90% of Dapper traces contain annotations. Annotations are also used to relate traces to a certain feature.
  11. Quick quiz
    The sampling ratio that provides low overhead and enough information is:
    A. 2/10
    B. 1/1000
    C. 1/100
  12. Generation Overhead
    Creating and destroying spans and annotations, and logging them to disk, are the most important sources of overhead.
  13. Writes to Local Disk Logs
    These are the most expensive operation, but they are asynchronous with respect to the application.
  14. Collection Overhead
    The daemon uses less than 0.3% CPU in an unrealistically heavy load benchmark. The daemon process has the lowest priority in the kernel scheduler.
  15. Effect in production workloads
    Trace sampling is necessary in production. Services with high traffic can generate a lot of trace data. A sampling rate of 1/1024 still yields enough data in high-volume services. A lower sampling frequency lets data persist longer on disk, which gives flexibility to the Dapper collection infrastructure. Benchmark: sampling impact in web search.
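A fixed sampling rate of 1/1024 can be sketched as a single random decision taken when a trace starts. The `start_trace` helper, the seed and the request count are assumptions chosen for illustration.

```python
import random

SAMPLING_RATE = 1 / 1024


def start_trace(rng):
    """Decide once, at the root of a trace, whether it is sampled."""
    return rng.random() < SAMPLING_RATE


# Simulate 200,000 incoming requests with a seeded generator.
rng = random.Random(2019)
requests = 200_000
sampled = sum(start_trace(rng) for _ in range(requests))
print(f"sampled {sampled} of {requests} requests")
```

At this rate roughly 195 traces are expected out of 200,000 requests, which is why high-volume services still produce plenty of data while each request pays almost nothing.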
  16. Adaptive sampling
    A fixed sampling ratio doesn’t fit low-traffic services, since it misses important traces. Targeting a rate of sampled traces per unit of time works better for these scenarios.
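One way to turn a target rate of traces per unit of time into a sampling probability is sketched below. The formula and the function name are assumptions made for illustration; the slides do not say how Dapper computes it.

```python
def adaptive_probability(target_traces_per_sec, observed_requests_per_sec):
    """Sampling probability that yields roughly the target trace rate.

    Hypothetical helper: low-traffic services end up sampling everything,
    high-traffic services are throttled down to the target rate.
    """
    if observed_requests_per_sec <= 0:
        return 1.0
    return min(1.0, target_traces_per_sec / observed_requests_per_sec)


low_traffic = adaptive_probability(10, 5)        # fewer requests than the target
high_traffic = adaptive_probability(10, 10_240)  # far more requests than the target
print(low_traffic, high_traffic)
```

With a target of 10 traces per second, a service receiving 5 requests per second samples every request, while one receiving 10,240 requests per second falls back to 1/1024.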
  17. Sampling in Collection Infrastructure
    Traces are also sampled when collected, to save resources in the Dapper infrastructure. Google generated more than 1 TB of trace data per day.
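Sampling at collection time has to keep or drop all spans of a trace together. One way to get that property, sketched here as an assumption, is to derive the decision from a hash of the trace id; the use of SHA-256 and the function names are illustrative choices.

```python
import hashlib


def collection_fraction(trace_id):
    """Map a trace id to a stable value in [0, 1)."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Keep 53 bits so the result is an exact float strictly below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2 ** 53


def keep_span(trace_id, coefficient):
    """Every span of a trace gets the same decision, on any host."""
    return collection_fraction(trace_id) < coefficient


decisions = [keep_span("trace-1", 0.5) for _ in range(3)]
print(decisions)
```

Because the decision depends only on the trace id and a global coefficient, no coordination between collectors is needed, and lowering the coefficient reduces the volume written to storage.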
  18. Dapper Depot API
    • Access traces by trace id
    • Bulk access
      ◦ MapReduce
      ◦ User-defined function that accepts a trace
    • Indexed access
      ◦ Index storage cost is 74% of the cost of the trace data itself
      ◦ Access traces by service or hostname
      ◦ The hostname index was rarely used
      ◦ The hostname index was dropped and replaced by a composite index on {hostname,service,timestamp}
    • Analytics tools developed on top
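A composite index can be sketched as a sorted list of tuples searched by prefix. This is an in-memory illustration only: the `CompositeIndex` class is hypothetical, and ordering the key as (service, hostname, timestamp) is an assumption of this sketch.

```python
import bisect


class CompositeIndex:
    """Sorted keys of the form (service, hostname, timestamp, trace_id)."""

    def __init__(self):
        self._keys = []

    def add(self, service, hostname, timestamp, trace_id):
        bisect.insort(self._keys, (service, hostname, timestamp, trace_id))

    def lookup(self, service, hostname=None):
        """Trace ids for a service, optionally narrowed to one host."""
        prefix = (service,) if hostname is None else (service, hostname)
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if key[:len(prefix)] != prefix:
                break
            result.append(key[3])
        return result


index = CompositeIndex()
index.add("search", "host-b", 5, "t3")
index.add("search", "host-a", 9, "t2")
index.add("search", "host-a", 1, "t1")
index.add("ads", "host-a", 3, "t4")
print(index.lookup("search"), index.lookup("search", "host-a"))
```

A single sorted index answers both the per-service and the per-service-and-host query, which is the kind of saving that makes dropping a rarely used standalone index attractive.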
  19. Dapper User Interface
    1. Service and Time Window
    2. Performance Summary
    3. Execution Pattern
    4. Metric histogram
    5. Trace
  20. Google Ads Review system
    Google developed an automatic keyword review system (e.g. for inappropriate language). Dapper helped from the first prototype:
    • Track performance bottlenecks to optimize;
    • Find unnecessary requests made to database masters that could be served by replicas;
    • Understand the cost of queries made to dependencies;
    • Quality assurance of correct system behavior and performance;
    • The Ads Review team estimates that latency improved by two orders of magnitude using data from Dapper.
  21. Exception monitoring
    Trace and span ids are included in the exception report. This helped in the forensic analysis of bugs.
  22. Long tail latency
    Due to the number of moving parts, debugging services like Universal Search is very challenging, even for experienced engineers. An engineer developed a tool to infer critical paths from Dapper traces. This made it possible to discover:
    • Impact of network
      ◦ Momentary degradation in the network does not affect system throughput, but it does affect overall latency.
    • Unintended interactions between services
      ◦ Wrong queries resulted in requests to services that were not necessary.
    • Slow queries
      ◦ Built a list of slow queries associated with trace ids.
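A very simplified notion of a critical path is sketched below: starting at the root, always follow the child span that finishes last. This heuristic, the `critical_path` function and the sample timings are assumptions for illustration; the slides do not describe how the actual tool works.

```python
from collections import defaultdict


def critical_path(spans, root_id):
    """spans maps span id -> (parent id, start, end); returns ids from the root down."""
    children = defaultdict(list)
    for span_id, (parent_id, _start, _end) in spans.items():
        if parent_id is not None:
            children[parent_id].append(span_id)
    path = [root_id]
    current = root_id
    while children.get(current):
        # Follow the child whose end time is the latest.
        current = max(children[current], key=lambda sid: spans[sid][2])
        path.append(current)
    return path


# Made-up timings in milliseconds.
spans = {
    "root": (None, 0, 100),
    "search": ("root", 5, 90),
    "ads": ("root", 5, 40),
    "index": ("search", 10, 85),
    "cache": ("search", 10, 20),
}
path = critical_path(spans, "root")
print(" -> ".join(path))
```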
  23. Inferring service dependencies
    Google developed a project to automatically infer service dependencies using the DAPI MapReduce interface. It helped Infrastructure Ops.
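Dependency inference over traces can be sketched as a map step that emits caller and callee pairs, and a reduce step that counts them. The trace representation and the function names are hypothetical, and plain Python stands in for the MapReduce framework.

```python
from collections import Counter


def map_trace(trace):
    """trace is a list of (span id, parent id, service); yield (caller, callee) pairs."""
    service_of = {span_id: service for span_id, _parent, service in trace}
    for _span_id, parent_id, service in trace:
        if parent_id is not None and service_of[parent_id] != service:
            yield (service_of[parent_id], service)


def infer_dependencies(traces):
    """Reduce step: count how often each service calls another."""
    counts = Counter()
    for trace in traces:
        counts.update(map_trace(trace))
    return counts


# Two made-up traces.
traces = [
    [("1", None, "frontend"), ("2", "1", "search"), ("3", "2", "index")],
    [("1", None, "frontend"), ("2", "1", "search")],
]
dependencies = infer_dependencies(traces)
print(dependencies)
```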
  24. Coalescing effects
    Requests that are batched (e.g. disk writes) are not reported correctly. Dapper reports only one of the batched requests.
  25. Batch workloads
    Dapper is focused on performance insights from user-generated requests. It can also be useful for offline batch workloads.
  26. Finding Root Cause
    Sometimes a request takes a long time to complete because other requests are queued ahead of it. This is not visible in Dapper.
  27. Conclusions
    • Low cost of implementation
      ◦ Dapper allowed the majority of Google services to be traced without the need for application-level modifications.
    • Shared libraries
      ◦ The use of shared libraries was key to achieving application transparency.
    • Performance
      ◦ Thanks to sampling, Dapper can trace requests without a noticeable performance impact.