Distributed Tracing and Monitoring with OpenTelemetry

OpenTelemetry is an emerging standard for collecting traces and metrics from cloud services. You can use it to gain observability into applications that span multiple clouds and technology stacks.

I explain how to use the open source, vendor-agnostic client libraries for OpenTelemetry and export telemetry to common APM systems such as Zipkin. Along the way, I discuss core concepts such as tags, metrics, exporters, zPages, and trace context propagation.

Simon Zeltser

June 13, 2019

Transcript

  1. Distributed Tracing and Monitoring with OpenTelemetry
    Simon Zeltser
    @simon_zeltser

  2. Software Evolution
    On Prem → Cloud
    Virtual Machines → Containers
    Monolith → Microservices
    Single Language / Stack → Polyglot
    Single Cloud → Multiple Cloud Providers / Hybrid
    Containers → Cloud Functions / Serverless

  3. New architectures => new challenges
    Debugging
    Observability
    Standardizing development practices
    Deployment / Packaging
    Configuration Management
    Secrets management

  4. Who Am I
    Software Engineer at
    Cloud Developer Experience - Infrastructure and Operations Tools
    Over a decade in distributed systems observability
    github.com/simonz130

  5. What is Observability?
    Being able to debug the system and gain insights into the system’s behavior

  6. Observability > Monitoring
    Venn diagram by Cindy Sridharan

  8. Signals
    Holistic Approach:
    - Distributed Tracing
    - Metrics Collection
    - Continuous system profiling
    - Capturing Logs

  9. Ultimate Recipe for reliable cloud service
    Observability Lifecycle:
    - Track System Health: capture traces, metrics and logs, create alerts on data
    - Detect Problems: locality, networking, scheduling, dependencies
    - Fix & Refine: optimize performance and cost of services

  10. Observability in Distributed Systems
    Hard and Expensive:
    ● Context propagation between components
    ● Multiple environments
    ● External dependencies
    ● Vendor Lock-in
    ● Cost
    Architecture from Hipstershop Demo App

  11. Meet OpenTelemetry!
    OpenTelemetry is an integrated set of APIs and libraries to generate, collect and describe telemetry in distributed systems
    Problems OpenTelemetry solves:
    - Vendor neutrality for tracing, monitoring and logging APIs
    - Context propagation

  12. OpenTelemetry is:
    ● Single set of APIs for tracing and metrics collection
    ● Integrations with popular web, RPC and storage frameworks
    ● Standardized Context Propagation
    ● Exporters for sending data to backend of choice
    ● Collector for smart traces & metrics aggregation

  13. Next major version of the OpenTracing and OpenCensus projects
    OpenTracing + OpenCensus = OpenTelemetry

  14. Roadmap
    Announcement: https://medium.com/opentracing/merging-opentracing-and-opencensus-f0fe9c7ca6f0
    Roadmap: https://medium.com/opentracing/a-roadmap-to-convergence-b074e5815289

  15. Who is behind OpenTelemetry?
    Core Contributors

  16. Metrics with OpenTelemetry

  17. Instrumentation with metrics
    View
    Measurement
    Tag
    Measure
    Aggregation
    View Data
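
How these terms fit together can be sketched in a few lines. This is a minimal, self-contained illustration, not the OpenCensus or OpenTelemetry API: every type name below (`Measure`, `Measurement`, `SumView`) is hypothetical. A measure names what is recorded, a measurement is one recorded value labeled with tags, and a view applies an aggregation (here a sum) to the measurements of one measure, grouped by a tag key, to produce view data.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; not a real library API.
public record Measure(string Name, string Unit);

public record Measurement(Measure Measure, double Value, IReadOnlyDictionary<string, string> Tags);

// A view couples one measure with an aggregation (sum) and a tag key to group by.
public class SumView
{
    private readonly Measure measure;
    private readonly string tagKey;
    private readonly Dictionary<string, double> sums = new Dictionary<string, double>();

    public SumView(Measure measure, string tagKey)
    {
        this.measure = measure;
        this.tagKey = tagKey;
    }

    public void Record(Measurement m)
    {
        // Ignore measurements that belong to a different measure.
        if (m.Measure != measure) return;

        // Measurements without the tag are grouped under the empty string.
        string tagValue = m.Tags.TryGetValue(tagKey, out var v) ? v : "";
        sums.TryGetValue(tagValue, out double current);
        sums[tagValue] = current + m.Value;
    }

    // The aggregated result: one sum per observed tag value.
    public IReadOnlyDictionary<string, double> ViewData => sums;
}
```

Recording two `GET` measurements of 10 and 5 and one `POST` measurement of 7 into a `SumView` keyed on `"method"` yields view data with 15 for `GET` and 7 for `POST`.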

  18. Demo
    Getting Started - Metrics collection with OpenTelemetry
    Demo code: https://github.com/simonz130/opencensus-csharp-samples

  19. Tracing with OpenTelemetry
    Demo code: https://github.com/simonz130/opencensus-csharp-samples

  20. Tracing with OpenTelemetry - the options
    Agentless: initialize exporter in app code
      [Diagram: HTTP IN → Container/VM (Application with Exporter) → Trace Backend]
    Using Agent: install the agent alongside the app
      [Diagram: HTTP IN → Container/VM (Application → Traces → Agent) → Trace Backend]

  21. Tracing with OpenTelemetry - Terminology
    Trace - a collection of spans
    Span - a single operation in a trace
    Sampler - decides whether to export a span
    Exporter - sends traces to observability systems

  22. Configure Exporter
    // Configure exporter to export traces to Zipkin
    var exporter = new ZipkinTraceExporter(
        new ZipkinTraceExporterOptions()
        {
            Endpoint = new Uri(zipkinUri),
            ServiceName = "tracing-to-zipkin-service",
        },
        Tracing.ExportComponent);
    exporter.Start();

  23. Configure Sampler
    // Use a 100% sample rate; otherwise, only a few traces will be sampled.
    ITraceConfig traceConfig = Tracing.TraceConfig;
    ITraceParams currentConfig = traceConfig.ActiveTraceParams;
    var newConfig = currentConfig.ToBuilder()
        .SetSampler(Samplers.AlwaysSample)
        .Build();
    traceConfig.UpdateActiveTraceParams(newConfig);

  24. Using the Tracer
    // 3. Tracer is a global singleton.
    // You can register it via dependency injection if that is available;
    // if not, you can use it as follows:
    var tracer = Tracing.Tracer;

  25. Create a Span
    // Create a scoped span - only covers the "using" block
    using (var scope = tracer.SpanBuilder("Main").StartScopedSpan())
    {
        for (int i = 0; i < 10; i++)
        {
            DoWork(i);
        }
    }

  26. Trace Context Propagation
    [Diagram: Frontend, AdService, Checkout, Payment Service and EmailService, with {context} passed along on each call between services]
    TraceId => {context}
    Common scenarios
    ● A/B testing
    [Diagram: timeline of one Trace, with a Span each for Frontend, AdService, Checkout, Payment and Email]

  27. Trace Context Propagation in OpenTelemetry
    W3C Standard for Context Propagation
    FORMAT: trace-id "-" parent-id "-" trace-flags
    trace-id = 32HEXDIG ; 16 bytes array identifier. All zeroes forbidden
    parent-id = 16HEXDIG ; 8 bytes array identifier. All zeroes forbidden
    trace-flags = 2HEXDIG ; 8 bit flags. Currently only one bit is used (“recorded”)
    Supported in all OpenCensus client libraries, will be supported in OpenTelemetry
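
The format above can be checked with a small parser. This is a minimal sketch, not code from any OpenCensus or OpenTelemetry library: the names `TraceContext` and `TraceParent.Parse` are hypothetical. It assumes the full `traceparent` header value as defined by the W3C specification, which prefixes the three fields on the slide with a two-hex-digit version, and it only accepts version `00`.

```csharp
#nullable enable
using System;
using System.Linq;

// Hypothetical types for illustration only; not a real library API.
public record TraceContext(string TraceId, string ParentId, byte TraceFlags)
{
    // Bit 0 is the only trace flag in use (called "recorded" on the slide).
    public bool Recorded => (TraceFlags & 0x01) != 0;
}

public static class TraceParent
{
    // True when s has exactly `length` characters, all lowercase hex digits.
    private static bool IsLowerHex(string s, int length) =>
        s.Length == length &&
        s.All(c => (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f'));

    private static bool IsAllZero(string s) => s.All(c => c == '0');

    // Parses version "-" trace-id "-" parent-id "-" trace-flags.
    // Returns null when the value is malformed or not version 00.
    public static TraceContext? Parse(string header)
    {
        string[] parts = header.Split('-');
        if (parts.Length != 4) return null;
        if (parts[0] != "00") return null; // assumption: this sketch handles version 00 only

        string traceId = parts[1], parentId = parts[2], flags = parts[3];
        if (!IsLowerHex(traceId, 32) || IsAllZero(traceId)) return null;   // 16 bytes, all zeroes forbidden
        if (!IsLowerHex(parentId, 16) || IsAllZero(parentId)) return null; // 8 bytes, all zeroes forbidden
        if (!IsLowerHex(flags, 2)) return null;                            // 8 bit flags

        return new TraceContext(traceId, parentId, Convert.ToByte(flags, 16));
    }
}
```

For example, `TraceParent.Parse("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")` returns a context whose `Recorded` flag is set, while a value with an all-zero trace-id returns `null`.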

  28. OpenCensus Service
    WHAT?
    ● Agent & Collector for metrics and traces
    WHY?
    ● Smart Sampling
    ● Export to one or more monitoring/tracing backends
    OC Service Repo
    .NET Core OC Agent Repo
    [Diagram: OpenCensus Tracing Library and OpenCensus Monitoring Library → OpenCensus Service (OpenCensus Agent, OpenCensus Collector) → BE Destinations: Jaeger, Zipkin, Stackdriver, Prometheus, AppInsights]

  29. Traces & Sampling
    Problem: Traced apps can generate many traces with many spans
    ● Higher operational costs
    ● Noise
    Solution: Smart Sampling
    ● Head-based (sampled at the beginning of the trace)
    ● Tail-based (sampled at the end of the trace)
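
The difference between the two strategies can be sketched as follows. This is a minimal illustration with hypothetical types, not the OpenCensus Service implementation. The head-based sampler decides from the trace ID alone, before any span exists. The tail-based sampler inspects the finished trace. Both policies shown here (a fixed probability, and "keep traces with an error or a slow span") are assumptions chosen for the example.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical span type for illustration only; not a real library API.
public record Span(string Name, TimeSpan Duration, bool IsError);

public static class HeadSampler
{
    // Decide at the start of the trace, using only the trace ID.
    // The first 8 hex digits are read as a number in [0, 2^32) and
    // compared against rate * 2^32, so roughly `rate` of all
    // uniformly distributed trace IDs are kept.
    public static bool ShouldSample(string traceId, double rate)
    {
        uint prefix = Convert.ToUInt32(traceId.Substring(0, 8), 16);
        return prefix < rate * 4294967296.0; // 2^32
    }
}

public static class TailSampler
{
    // Decide at the end of the trace, once all spans are known:
    // keep the trace if any span failed or ran longer than the threshold.
    public static bool ShouldSample(IReadOnlyCollection<Span> finishedTrace, TimeSpan slowThreshold) =>
        finishedTrace.Any(s => s.IsError || s.Duration > slowThreshold);
}
```

The head-based decision is cheap and can be propagated to downstream services with the trace context, but it cannot know whether the trace will turn out to be interesting. The tail-based decision can, at the cost of buffering every span until the trace is complete.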

  30. DEMO
    Tail-based sampling
    Demo code

  31. Architecture
    Synthetic Load Generator simulates traffic to Hipster Shop Demo App
    [Diagram: Synthetic Load Generator → Hipster Shop App (Browse, Checkout) → OpenCensus Service (Collector) → Jaeger 1 and Jaeger 2, each with Cassandra]
    Jaeger 1 - head-based sampling
    Jaeger 2 - tail-based sampling
    Demo repo

  32. Integrations
    OpenCensus and OpenTracing are integrated with a wide variety of frameworks, products and libraries.
    They provide observability for the following:
    ● Redis
    ● Memcached
    ● Google Cloud
    ● Dropwizard
    ● SQL
    ● Caddy
    ● Go kit
    ● GroupCache
    ● MongoDB

  33. Community
    OpenTelemetry Website: https://opentelemetry.io
    Specifications: https://github.com/open-telemetry/opentelemetry-specification
    Meetings: https://github.com/open-telemetry/community

  34. Thank you!
    @simon_zeltser
    CONTACT ME:
    [email protected]
    Slides: https://speakerdeck.com/simonz130
    Sample Code: https://github.com/simonz130/opencensus-csharp-samples