
The Importance of Observability for Kafka-based applications with Zipkin

Jorge Quilcate
September 10, 2018


  1. The Importance of Observability for Kafka-based applications with Zipkin jorge.quilcate@sysco.no

  2. Jorge Quilcate-Otoya, @jeqo89, github.com/jeqo, github.com/sysco-middleware. Middleware team at SYSCO AS, focused on Data Integration and Distributed Tracing.

  3. SYSCO AS. Middleware department: Integration and Data Engineering. We are hiring! Partners: github.com/sysco-middleware sysco.no/

  4. Agenda: Event-Driven Applications and Kafka; Observability and Distributed Tracing; Simulating Observability tools

  5. Apache Kafka “Apache Kafka® is a distributed Streaming platform.”

  6. Event-Driven Applications and Kafka Amazonas river

  7. Event-Driven Architectural Style https://docs.microsoft.com/en-us/azure/architecture/guide/architecture-styles/event-driven

  8. Service Collaboration and Dataflow [diagram: Orchestration between services; Choreography of services around an Event Bus]

  9. Kafka Ecosystem https://www.slideshare.net/ConfluentInc/etl-is-dead-long-live-streams

  10. Observability and Distributed Tracing Titicaca Lake

  11. What is Observability? “In control theory, observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs.” - Wikipedia

  12. Observability is for *Unknown Unknowns* https://twitter.com/mipsytipsy/status/963956028940234752

  13. Observability methods

  14. Observability methods

  15. Distributed Tracing Concepts: Span = execution of a task. Trace = tree of spans. Context Propagation = passing the trace context between distributed components (e.g. HTTP headers, Kafka record headers).

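
The three concepts on this slide can be sketched in a few lines of plain Python. This is an illustration only, not Brave's API: the `Span` class and the `start_span`, `inject` and `extract` helpers are hypothetical names, and the only real-world detail used is the set of B3 header names (`X-B3-TraceId`, `X-B3-SpanId`, `X-B3-ParentSpanId`) that Zipkin-compatible tracers commonly use over HTTP.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Span:
    """One unit of work (the execution of a task) inside a trace."""
    trace_id: str             # shared by every span of the same trace
    span_id: str              # unique per span
    parent_id: Optional[str]  # None for the root of the span tree
    name: str


def new_id() -> str:
    """Random 64-bit id rendered as 16 lowercase hex characters."""
    return secrets.token_hex(8)


def start_span(name: str, parent: Optional[Span] = None) -> Span:
    """Start a root span, or a child span that joins the parent's trace."""
    if parent is None:
        return Span(new_id(), new_id(), None, name)
    return Span(parent.trace_id, new_id(), parent.span_id, name)


def inject(span: Span, headers: Dict[str, str]) -> None:
    """Context propagation, sending side: copy the context into headers."""
    headers["X-B3-TraceId"] = span.trace_id
    headers["X-B3-SpanId"] = span.span_id
    if span.parent_id is not None:
        headers["X-B3-ParentSpanId"] = span.parent_id


def extract(headers: Dict[str, str]) -> Optional[Span]:
    """Context propagation, receiving side: rebuild the caller's context."""
    trace_id = headers.get("X-B3-TraceId")
    span_id = headers.get("X-B3-SpanId")
    if trace_id is None or span_id is None:
        return None  # no incoming context: the receiver would start a new trace
    return Span(trace_id, span_id, headers.get("X-B3-ParentSpanId"), "remote-caller")


# Service A: start a trace and propagate its context with the outgoing call.
root = start_span("http-request")
outgoing_headers: Dict[str, str] = {}
inject(root, outgoing_headers)

# Service B: continue the same trace with a child span.
caller = extract(outgoing_headers)
child = start_span("handle-request", parent=caller)
```

Because `child` reuses the trace id and points at the caller's span id as its parent, a tracing backend can later reassemble both spans into one tree even though they were recorded in different processes.
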
  16. Demo Lab 01: Hello world to Distributed Tracing • Tracing concepts • Brave instrumentation https://github.com/jeqo/talk-kafka-zipkin#lab-1-hello-world-distributed-tracing

  17. Adoption approaches. Annotation-based: part of your code; instrument libraries first; add custom spans on demand; check benchmarks. Black-box.

  18. How does it work? [diagram: Svc 0 and Svc 1, each with a tracer; a Tracing System with a Collector and a Tracing DB]

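
As a rough sketch of the tracer-to-collector step, the snippet below encodes finished spans in the Zipkin v2 JSON span format and buffers them the way a reporter might before sending a batch. The `to_zipkin_v2` function, the `BufferingReporter` class, the service names and all ids and timestamps are made up for illustration; nothing is sent over the network here. With a real Zipkin server the resulting body would be POSTed to its `/api/v2/spans` endpoint.

```python
import json
from typing import Any, Dict, List, Optional


def to_zipkin_v2(trace_id: str, span_id: str, name: str, service: str,
                 timestamp_us: int, duration_us: int,
                 parent_id: Optional[str] = None,
                 kind: Optional[str] = None,
                 tags: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
    """Build one span as a dict following the Zipkin v2 JSON field names."""
    span: Dict[str, Any] = {
        "traceId": trace_id,
        "id": span_id,
        "name": name,
        "timestamp": timestamp_us,  # epoch microseconds
        "duration": duration_us,    # microseconds
        "localEndpoint": {"serviceName": service},
    }
    if parent_id is not None:
        span["parentId"] = parent_id
    if kind is not None:
        span["kind"] = kind         # CLIENT, SERVER, PRODUCER or CONSUMER
    if tags:
        span["tags"] = tags
    return span


class BufferingReporter:
    """Hypothetical reporter: collects finished spans, emits them in batches."""

    def __init__(self) -> None:
        self._buffer: List[Dict[str, Any]] = []

    def report(self, span: Dict[str, Any]) -> None:
        self._buffer.append(span)

    def flush(self) -> str:
        """Return the JSON array a collector would receive, then reset."""
        body = json.dumps(self._buffer)
        self._buffer = []
        return body


reporter = BufferingReporter()
reporter.report(to_zipkin_v2(
    "463ac35c9f6413ad", "463ac35c9f6413ad", "get /hello", "svc-0",
    1536566400000000, 2500, kind="SERVER"))
reporter.report(to_zipkin_v2(
    "463ac35c9f6413ad", "a2fb4a1d1a96d312", "get /greeting", "svc-0",
    1536566400000500, 1200, parent_id="463ac35c9f6413ad", kind="CLIENT",
    tags={"http.path": "/greeting"}))
body = reporter.flush()
```

Reporting out of band and in batches keeps tracing overhead away from the request path, which is one reason tracers usually separate recording a span from shipping it.
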
  19. Zipkin Architecture

  20. Demo Lab 02: Tracing Kafka-based applications • Kafka-clients and Kafka-streams instrumentation • Kafka Interceptors for Kafka Connectors https://github.com/jeqo/talk-kafka-zipkin#lab-02-twitter-kafka-based-application

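
For Kafka, the trace context travels in record headers rather than HTTP headers. The sketch below models record headers as a list of `(key, bytes)` pairs and uses the B3 single-header encoding (`b3: {traceId}-{spanId}-{sampled}`). It does not use a Kafka client or the Brave instrumentation from the lab, and `inject_b3`, `extract_b3` and the span dictionaries are hypothetical helpers; the real instrumentation may encode the context differently.

```python
import secrets
from typing import Dict, List, Optional, Tuple

# Kafka record headers are ordered key/value pairs with byte-array values.
Headers = List[Tuple[str, bytes]]


def new_id() -> str:
    """Random 64-bit id rendered as 16 lowercase hex characters."""
    return secrets.token_hex(8)


def inject_b3(headers: Headers, trace_id: str, span_id: str,
              sampled: bool = True) -> None:
    """Producer side: write the context as a single 'b3' record header."""
    # Drop any stale context first so the record carries exactly one.
    headers[:] = [(k, v) for k, v in headers if k != "b3"]
    value = f"{trace_id}-{span_id}-{'1' if sampled else '0'}"
    headers.append(("b3", value.encode("utf-8")))


def extract_b3(headers: Headers) -> Optional[Dict[str, object]]:
    """Consumer side: read the context back, or None if the record has none."""
    for key, value in headers:
        if key != "b3":
            continue
        parts = value.decode("utf-8").split("-")
        if len(parts) < 2:
            return None  # malformed header: treat as missing context
        sampled = parts[2] == "1" if len(parts) > 2 else None
        return {"trace_id": parts[0], "span_id": parts[1], "sampled": sampled}
    return None


# Producer: a PRODUCER span is started and its context injected before send.
record_headers: Headers = [("content-type", b"application/json")]
send_trace_id, send_span_id = new_id(), new_id()
inject_b3(record_headers, send_trace_id, send_span_id)

# Consumer: the polled record's headers link a CONSUMER span to the producer.
context = extract_b3(record_headers)
consumer_span = {
    "trace_id": context["trace_id"],
    "span_id": new_id(),
    "parent_id": context["span_id"],
    "kind": "CONSUMER",
}
```

An interceptor-based setup, as mentioned for Kafka Connect on this slide, would perform the same inject and extract steps outside the application code.
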
  21. Adoption approaches. Annotation-based: part of your code; instrument libraries first; add custom spans on demand; check benchmarks. Black-box: agent-based model; framework/protocol support; machine impact; promising approach: Service Mesh/Sidecar Proxy.

  22. Service Meshes and Zipkin

  23. #QOTD https://twitter.com/rakyll/status/971231712049971200

  24. Simulating Observability tools Lima - Chorrillos

  25. ➔ Model your architecture ➔ Simulate interaction ➔ Generate Traces ➔ Visualize your system’s traffic with Vizceral. “SimianViz/Spigo” - Simulation Protocol Interaction in Go, github.com/adrianco/spigo

  26. "Monitoring Microservices: A Challenge" - Adrian Cockcroft

  27. Models from Traces, e.g. Vizceral https://www.youtube.com/watch?v=jWpI8qzqNHk
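
One simple model that can be derived from trace data is a service dependency graph with call counts, the kind of input a traffic view needs. The sketch below is not how Zipkin or Vizceral compute it; the `dependency_links` function, the flat `service` field and the sample spans with their service names are all invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

SpanDict = Dict[str, Optional[str]]


def dependency_links(spans: Iterable[SpanDict]) -> Counter:
    """Count caller -> callee edges by joining each span with its parent."""
    spans = list(spans)
    # Span ids are only unique within a trace, so index by (traceId, id).
    by_id = {(s["traceId"], s["id"]): s for s in spans}
    links: Counter = Counter()
    for span in spans:
        parent = by_id.get((span["traceId"], span.get("parentId")))
        if parent is None:
            continue  # root span, or parent not present in this sample
        edge: Tuple[str, str] = (parent["service"], span["service"])
        if edge[0] != edge[1]:  # skip local child spans inside one service
            links[edge] += 1
    return links


sample_spans = [
    {"traceId": "t1", "id": "a", "parentId": None, "service": "producer-app"},
    {"traceId": "t1", "id": "b", "parentId": "a", "service": "stream-app"},
    {"traceId": "t1", "id": "c", "parentId": "b", "service": "consumer-app"},
    {"traceId": "t2", "id": "a", "parentId": None, "service": "producer-app"},
    {"traceId": "t2", "id": "b", "parentId": "a", "service": "stream-app"},
    {"traceId": "t2", "id": "c", "parentId": "b", "service": "stream-app"},
]
links = dependency_links(sample_spans)
```

The same join over parent and child spans could be extended with durations or error tags per edge, which is one way to read the question of how many models can be built from tracing data.
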

  28. Demo Lab 03: Spigo and Vizceral • Spigo for Simulation of Architecture behavior • Zipkin for Tracing and Vizceral for Traffic Monitoring https://github.com/jeqo/talk-kafka-zipkin#lab-3-spigo-simulation

  29. Takeaways ➔ If you are doing Distributed Systems, using Kafka or not, consider Distributed Tracing. ➔ Instrument libraries first, not your code. ➔ Experiment by simulating your deployment. ➔ How many models can you build from tracing data?!

  30. References

    Papers: - Dapper: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36356.pdf - Canopy: http://cs.brown.edu/~jcmace/papers/kaldor2017canopy.pdf - Automating Failure Testing Research at Internet Scale: https://people.ucsc.edu/~palvaro/socc16.pdf

    Posts: - Logging v. Instrumentation: https://peter.bourgon.org/blog/2016/02/07/logging-v-instrumentation.html - Monitoring and Observability: https://medium.com/@copyconstruct/monitoring-and-observability-8417d1952e1c - Monitoring in the Time of Cloud Native: https://medium.com/@copyconstruct/monitoring-in-the-time-of-cloud-native-c87c7a5bfa3e

    Tools: - Zipkin: https://zipkin.io/ - Brave: https://github.com/openzipkin/brave - Kafka Interceptors: https://github.com/sysco-middleware/kafka-interceptors - Spigo: https://github.com/adrianco/spigo - Vizceral: https://github.com/Netflix/vizceral
  31. Thanks! Q&A github.com/jeqo/talk-kafka-zipkin github.com/sysco-middleware Machu Picchu