Distributed Tracing and Monitoring With OpenCensus

Simon Zeltser
February 01, 2019

A talk at NDC London 2019


  1. Software Evolution @simon_zeltser

    - On Prem => Cloud
    - Virtual Machines => Containers
    - Monolith => Microservices
    - Single Language / Stack => Polyglot
    - Single Cloud => Multiple Cloud Providers / Hybrid
    - Containers => Cloud Functions / Serverless
  2. New architectures => new challenges

    - Debugging
    - Observability
    - Standardizing development practices
    - Deployment / Packaging
    - Configuration Management
    - Secrets management
  3. Who Am I

    - Software Engineer at Cloud Developer Experience - Infrastructure and Operations Tools
    - Over a decade in distributed systems observability
    - github.com/simonz130, @simon_zeltser
  4. What is Observability?

    Being able to debug the system and gain insights into the system’s behavior
  5. Signals

    Holistic Approach:
    - Distributed Tracing
    - Metrics Collection
    - Continuous system profiling
    - Capturing Logs
  6. Ultimate Recipe for reliable cloud service

    Observability Lifecycle:
    - Track System Health: capture traces, metrics and logs, create alerts on data
    - Detect Problems: locality, networking, scheduling, dependencies
    - Fix & Refine: optimize performance and cost of services
  7. Observability in Distributed Systems

    Hard and Expensive:
    • Context propagation between components
    • Multiple environments
    • External dependencies
    • Vendor Lock-in
    • Cost

    Diagram: Architecture from Hipstershop Demo App
  8. Meet OpenCensus!

    OpenCensus is a single distribution of libraries that collect metrics and distributed traces from your services

    Problems OpenCensus solves:
    - Vendor neutrality for tracing and monitoring APIs
    - Context propagation
  9. OpenCensus is:

    • Single set of APIs for tracing and metrics collection
    • Integrations with popular web, RPC and storage frameworks
    • Standardized Context Propagation
    • Exporters for sending data to backend of choice
    • Collector for smart traces & metrics aggregation
  10. OpenCensus NuGet Packages

    Package Name                        Description
    OpenCensus.Abstractions             .NET Core API Abstractions
    OpenCensus                          .NET Core API Implementation
    OpenCensus.Collector.Dependencies   Collector for .NET Core HttpClient
    OpenCensus.Collector.AspNetCore     Collector for ASP.NET Core
  11. Tracing with OpenCensus - the options

    - Agentless: initialize exporter in app code. Diagram: HTTP IN => Application with Exporter (Container/VM) => Trace Backend
    - Using Agent: install the agent alongside the app. Diagram: HTTP IN => Application => Traces => Agent (Container/VM) => Trace Backend
  12. Tracing with OpenCensus - Terminology

    - Trace - a collection of spans
    - Span - a single operation in a trace
    - Sampler - decides whether to export a span
    - Exporter - sends traces to observability systems
  13. Configure Exporter

    // 1. Configure exporter to export traces to Zipkin
    var exporter = new ZipkinTraceExporter(
        new ZipkinTraceExporterOptions()
        {
            Endpoint = new Uri(zipkinUri),
            ServiceName = "tracing-to-zipkin-service",
        },
        Tracing.ExportComponent);
    exporter.Start();
  14. Configure Sampler

    // 2. 100% sample rate, otherwise, few traces will be sampled.
    ITraceConfig traceConfig = Tracing.TraceConfig;
    ITraceParams currentConfig = traceConfig.ActiveTraceParams;
    var newConfig = currentConfig.ToBuilder()
        .SetSampler(Samplers.AlwaysSample)
        .Build();
    traceConfig.UpdateActiveTraceParams(newConfig);
  15. Using the Tracer

    // 3. Tracer is global singleton.
    // You can register it via dependency injection if it exists
    // but if not - you can use it as follows:
    var tracer = Tracing.Tracer;
  16. Create a Span

    // 4. Create a scoped span - only covers “using” block
    using (var scope = tracer.SpanBuilder("Main").StartScopedSpan())
    {
        Console.WriteLine("About to do a busy work");
        for (int i = 0; i < 10; i++)
        {
            DoWork(i);
        }
    }
  17. Trace Context Propagation

    Diagram: a request flows through Frontend, AdService, Checkout, Payment Service and EmailService, with {context} passed along on each call. TraceId=>{context}

    Common scenarios
    • A/B testing

    Diagram: timeline (time axis) of a Trace with a Span for each of Frontend, AdService, Checkout, Payment and Email
  18. Trace Context Propagation in OpenCensus

    W3C Standard for Context Propagation

    FORMAT: trace-id "-" parent-id "-" trace-flags
      trace-id    = 32HEXDIG ; 16 bytes array identifier. All zeroes forbidden
      parent-id   = 16HEXDIG ; 8 bytes array identifier. All zeroes forbidden
      trace-flags = 2HEXDIG  ; 8 bit flags. Currently only one bit is used (“recorded”)

    Supported in all OpenCensus client libraries
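
    The format above can be checked with a few lines of plain C#. This is only a sketch, not the OpenCensus implementation: `TraceParent` and `TraceParentParser` are hypothetical names, and the leading `00` version field is an assumption taken from the W3C `traceparent` header, which prefixes the three fields shown on the slide with a version.

    ```csharp
    using System;
    using System.Linq;

    // Hypothetical holder for the three fields named on the slide.
    public readonly struct TraceParent
    {
        public TraceParent(string traceId, string parentId, byte traceFlags)
        {
            TraceId = traceId;
            ParentId = parentId;
            TraceFlags = traceFlags;
        }

        public string TraceId { get; }
        public string ParentId { get; }
        public byte TraceFlags { get; }

        // Lowest bit of trace-flags: the single bit the slide says is in use.
        public bool Recorded => (TraceFlags & 0x01) != 0;
    }

    public static class TraceParentParser
    {
        public static bool TryParse(string header, out TraceParent result)
        {
            result = default;
            if (header == null) return false;

            string[] parts = header.Split('-');

            // Assumption: a "00" version field precedes trace-id (W3C traceparent header).
            if (parts.Length != 4 || parts[0] != "00") return false;

            // trace-id: 32 hex digits, all zeroes forbidden.
            if (!IsLowerHex(parts[1], 32) || IsAllZero(parts[1])) return false;

            // parent-id: 16 hex digits, all zeroes forbidden.
            if (!IsLowerHex(parts[2], 16) || IsAllZero(parts[2])) return false;

            // trace-flags: 2 hex digits.
            if (!IsLowerHex(parts[3], 2)) return false;

            result = new TraceParent(parts[1], parts[2], Convert.ToByte(parts[3], 16));
            return true;
        }

        private static bool IsLowerHex(string s, int length) =>
            s.Length == length &&
            s.All(c => (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f'));

        private static bool IsAllZero(string s) => s.All(c => c == '0');
    }
    ```

    For example, `TraceParentParser.TryParse("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01", out var tp)` succeeds with `tp.Recorded` set, while a header whose trace-id is all zeroes is rejected.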
  19. OpenCensus Service

    WHAT?
    • Agent & Collector for metrics and traces

    WHY?
    • Smart Sampling
    • Export to one or more monitoring/tracing backends

    Links: OC Service Repo, .NET Core OC Agent Repo

    Diagram: OpenCensus Tracing Library and OpenCensus Monitoring Library => OpenCensus Agent => OpenCensus Collector (Agent and Collector together form the OpenCensus Service) => BE Destinations: Jaeger, Zipkin, Stackdriver, Prometheus, AppInsights
  20. Traces & Sampling

    Problem: Traced apps can generate many traces with many spans
    • Higher operational costs
    • Noise

    Solution: Smart Sampling
    • Head-based (sampled at the beginning of the trace)
    • Tail-based (sampled at the end of the trace)
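
    One way to picture head-based sampling: the keep/drop decision is made once, when the trace starts, and is derived from the trace ID so that every service seeing the same ID reaches the same answer. The sketch below is an illustration under that assumption, not the sampler shipped with OpenCensus; `HeadSampler` and `ShouldSample` are hypothetical names.

    ```csharp
    using System;

    public static class HeadSampler
    {
        // Decides at the start of a trace whether to keep it.
        // traceId: hex string of at least 16 digits; probability: 0.0 to 1.0.
        public static bool ShouldSample(string traceId, double probability)
        {
            if (traceId == null || traceId.Length < 16)
                throw new ArgumentException("trace ID must have at least 16 hex digits", nameof(traceId));

            if (probability >= 1.0) return true;
            if (probability <= 0.0) return false;

            // Read the first 8 bytes of the trace ID as an unsigned number and
            // keep the trace when it falls below the probability threshold.
            ulong value = Convert.ToUInt64(traceId.Substring(0, 16), 16);
            ulong threshold = (ulong)(probability * ulong.MaxValue);
            return value < threshold;
        }
    }
    ```

    Tail-based sampling cannot be reduced to a function like this, because the decision waits until the spans of the trace have been collected, which is why it needs a component that sees the whole trace.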
  21. Architecture

    Synthetic Load Generator simulates traffic to Hipster Shop Demo App

    Diagram: Synthetic Load Generator (Browse, Checkout) => Hipster Shop App => OpenCensus Service (Collector) => Jaeger 1 and Jaeger 2, each with Cassandra
    - Jaeger 1 - head based sampling
    - Jaeger 2 - tail based sampling

    Demo repo
  22. zPages

    /Tracez
    Examine spans bucketized by latency, with bucket boundaries at 0us, 10us, 100us, 1ms, 10ms, 100ms, 1s, 10s and 1m. It also lets you quickly examine error samples.

    /Rpcz
    Examine statistics of remote procedure calls (RPCs) that are instrumented with OpenCensus, for example when using gRPC.
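
    The latency bucketing can be sketched as a lookup over the boundaries listed on the slide. This is an assumption about how the buckets work (each bucket named by its lower bound, a span placed in the last bucket whose lower bound it reaches), not the zPages source; `LatencyBuckets` and `BucketFor` are hypothetical names.

    ```csharp
    using System;

    public static class LatencyBuckets
    {
        // Lower bounds taken from the slide. One TimeSpan tick is 100 ns,
        // so 10us is 100 ticks and 100us is 1,000 ticks.
        private static readonly TimeSpan[] LowerBounds =
        {
            TimeSpan.Zero,
            TimeSpan.FromTicks(100),
            TimeSpan.FromTicks(1_000),
            TimeSpan.FromMilliseconds(1),
            TimeSpan.FromMilliseconds(10),
            TimeSpan.FromMilliseconds(100),
            TimeSpan.FromSeconds(1),
            TimeSpan.FromSeconds(10),
            TimeSpan.FromMinutes(1),
        };

        private static readonly string[] Labels =
        {
            "0us", "10us", "100us", "1ms", "10ms", "100ms", "1s", "10s", "1m",
        };

        // Returns the label of the last bucket whose lower bound is <= latency.
        public static string BucketFor(TimeSpan latency)
        {
            int index = 0;
            for (int i = 0; i < LowerBounds.Length; i++)
            {
                if (latency >= LowerBounds[i]) index = i;
            }
            return Labels[index];
        }
    }
    ```

    With these bounds a 5 ms span is reported under `1ms` and a 90 s span under `1m`.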
  23. Integrations

    OpenCensus is integrated with a wide variety of frameworks, products and libraries. It provides observability for the following:
    • Redis
    • Memcached
    • Google Cloud
    • Dropwizard
    • SQL
    • Caddy
    • Go kit
    • GroupCache
    • MongoDB
  24. OpenCensus roadmap

    IMPROVE EASE OF USE
    • Embedded in most popular frameworks and platforms
    • Easy to get started

    ENHANCE FEATURE SET
    • Correlation between metrics, traces metadata and logs
    • Clients Support

    BUILD STRONG COMMUNITY
    • CNCF
    • OpenCensus with OpenTracing

    WORKSTREAMS
    • OC Agent and Service
    • Integrations
    • New Telemetry
    • Community