OpenTelemetry at AWS

JBD
May 18, 2021

Register and watch this talk now! https://o11yfest.org/attend.

Transcript

  1. @rakyll OpenTelemetry at AWS Jaana Dogan Principal Engineer, AWS jbd@amazon.com

  2. @rakyll Who? Jaana Dogan, AWS Explicit focus on instrumentation

  3. @rakyll Five AWS stories... Too many agents; Too many formats; Too little correlation; Too many ways to propagate; Too many products to support
  4. @rakyll Too many agents: 4-5 agents; Friction in installation; Operational burden; Friction in configuration delivery; Performance penalty
  5. @rakyll Too many formats: EMF; CloudWatch; Prometheus; statsd; Vendor formats ...; X-Ray; Zipkin; Jaeger; Vendor formats ...
  6. @rakyll Too little correlation: Tool fatigue; Disjoint views; Missing metadata; Friction in troubleshooting
  7. @rakyll Too many ways to propagate: Lack of end-to-end traces; Missing label propagation; No W3C TraceContext or B3 support; No runtime propagation standards
  8. @rakyll Too many products to support: CloudWatch; X-Ray; Prometheus; Elasticsearch/OpenSearch; New Relic, Datadog, Splunk, Honeycomb, Lightstep and more.
  9. @rakyll What do we use? Specification; Context Propagation; Semantic Conventions; Data Model; Protocol (OTLP); Collector; Client Libraries
  10. @rakyll What’s next? collector Managed on EC2, ECS, EKS, Lambda, etc.
  11. @rakyll What’s next? collector Managed on EC2, ECS, EKS, Lambda, etc.
  12. @rakyll What’s next? collector Managed on EC2, ECS, EKS, Lambda, etc.; OTLP, Prometheus, statsd, X-Ray, Jaeger, Zipkin
  13. @rakyll What’s next? collector Managed on EC2, ECS, EKS, Lambda, etc.; OTLP, Prometheus, statsd, X-Ray, Jaeger, Zipkin; CloudWatch, Prometheus, X-Ray, Elastic/OpenSearch, Jaeger, Zipkin, Vendors, Raw storage
  14. @rakyll What’s next? collector Managed on EC2, ECS, EKS, Lambda, etc.; OTLP, Prometheus, statsd, X-Ray, Jaeger, Zipkin; CloudWatch, Prometheus, X-Ray, Jaeger, Zipkin, Vendors, Raw storage; enrich, transform, ...
  15. @rakyll Container Insights now collected by OpenTelemetry.

  16. @rakyll What do we use? Specification; Context Propagation; Semantic Conventions; Data Model; Protocol (OTLP); Collector; Client Libraries
  17. @rakyll What works well? Flexible; Composable; Lightweight enough; Holistic; Legacy protocol friendly; Community
  18. @rakyll What challenges us? Stability; Custom builds; Compatibility (Prometheus & CloudWatch); Boilerplate in client libraries
  19. @rakyll What are we working on next?

  20. @rakyll Prometheus

  21. @rakyll Prometheus: Drop-in replacement for Prometheus; Data model changes; Remote write compliance; Discovery + scrape config compliance; Kubernetes operator
  22. @rakyll Components: Container Insights receivers and processors; CloudWatch histogram compatibility; CloudWatch Logs exporter; S3 exporter
  23. @rakyll Propagation: Adopting 128-bit trace IDs in X-Ray; Context propagation in SQL
  24. @rakyll Platforms EC2 ECS EKS Lambda (and control plane components...)

  25. @rakyll Lambda support

  26. @rakyll Others... eBPF; Profiles; Real time user monitoring; Network diagnostics; Database performance
  27. @rakyll One more thing...

  28. @rakyll Exporting to vendors? Vended data streams; CloudWatch Metric Streams support OTLP; CW Metrics; S3 (in JSON or OTLP); Kinesis (in JSON or OTLP)
  29. @rakyll It’s not a fork. It’s a snapshot for security, performance, support.
  30. @rakyll Thank you Jaana Dogan Principal Engineer, AWS jbd@amazon.com
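
Slides 7 and 23 mention W3C TraceContext support and 128-bit trace IDs in X-Ray. As an illustration only (not code from the talk), the sketch below parses a W3C `traceparent` header and rewrites it in the X-Ray `Root=...;Parent=...;Sampled=...` header layout. The function and class names are hypothetical. It assumes the four-field `traceparent` layout only, and it assumes the first 8 hex characters of the trace ID can serve as the time segment of an X-Ray ID, which only holds when the ID was generated with a timestamp prefix.

```python
import re
from typing import NamedTuple

# W3C Trace Context "traceparent": version-traceid-parentid-flags, lowercase hex.
# This sketch accepts only the four-field layout.
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class SpanContext(NamedTuple):
    trace_id: str  # 32 hex chars (128 bits)
    span_id: str   # 16 hex chars (64 bits)
    sampled: bool


def parse_traceparent(header: str) -> SpanContext:
    """Parse a traceparent header value; raise ValueError if it is malformed."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the W3C spec.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        raise ValueError(f"invalid traceparent: {header!r}")
    # Bit 0 of the flags byte is the "sampled" flag.
    return SpanContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


def to_xray_trace_id(trace_id: str) -> str:
    """Split a 32-hex trace ID into the X-Ray 1-<8 hex>-<24 hex> layout."""
    return f"1-{trace_id[:8]}-{trace_id[8:]}"


def to_xray_header(ctx: SpanContext) -> str:
    """Render a span context as an X-Amzn-Trace-Id style header value."""
    sampled = "1" if ctx.sampled else "0"
    return f"Root={to_xray_trace_id(ctx.trace_id)};Parent={ctx.span_id};Sampled={sampled}"


if __name__ == "__main__":
    ctx = parse_traceparent(
        "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    )
    print(to_xray_header(ctx))
```

In practice the OpenTelemetry SDKs and the Collector ship propagators and exporters that handle this translation, so hand-rolled code like this is mainly useful for understanding the two header layouts.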