Making sense of event-driven dataflows - Kafka Summit NYC 2019

Tracing Apache Kafka-based applications with Zipkin

Jorge Quilcate

April 02, 2019

Transcript

  1. @jeqo89 at #kafkasummit

    “Complexity is anything [...] that makes a system hard to understand and modify”
    John Ousterhout, “A Philosophy of Software Design”
  2. “Complexity is caused by two things: dependencies and obscurity”

    John Ousterhout, “A Philosophy of Software Design”
  3. Jorge Esteban Quilcate Otoya

    twitter: @jeqo89 | github: jeqo
    Peruvian in Oslo, Norway
    Integration team at SYSCO AS
    Part of Apache Kafka and Zipkin communities
  4. Talk: “Making sense of your event-driven dataflows”

    40 min, Q&A
    * Why?
    * What is distributed tracing?
    * How to instrument Kafka apps?
    * Demo
    * What’s next?
    * Demo time
  5. Is Kafka producer `send` sync or async?

    github.com/jeqo/tracing-kafka-apps
    Blocking call
  6. “The more accurately you try to measure the position of a particle, the less accurately you can measure its speed”

    Heisenberg's uncertainty principle
  7. /** Annotation-based approach **/

        ScopedSpan span = tracer.startScopedSpan("process");
        try {
          // The span is in "scope"
          doProcess();
        } catch (RuntimeException | Error e) {
          span.error(e); // mark as error
          throw e;
        } finally {
          span.finish(); // always finish
        }
  10. /** Instrumentation for Kafka Clients **/

        Producer<K, V> producer = new KafkaProducer<>(settings);
        Producer<K, V> tracedProducer = kafkaTracing.producer(producer);

        // still sends through the unwrapped producer, so this call is not traced
        producer.send(
            new ProducerRecord<>("my-topic", key, value));
  11. /** Instrumentation for Kafka Clients **/

        Producer<K, V> producer = new KafkaProducer<>(settings);
        Producer<K, V> tracedProducer = kafkaTracing.producer(producer); // wrap

        tracedProducer.send(
            new ProducerRecord<>("my-topic", key, value));
  12. /** Instrumentation for Kafka Clients **/

        Consumer<K, V> consumer = new KafkaConsumer<>(settings);
        Consumer<K, V> tracedConsumer = kafkaTracing.consumer(consumer);

        while (running) {
          // still polls through the unwrapped consumer, so this call is not traced
          var records = consumer.poll(1000);
          records.forEach(this::process);
        }
  13. /** Instrumentation for Kafka Clients **/

        Consumer<K, V> consumer = new KafkaConsumer<>(settings);
        Consumer<K, V> tracedConsumer = kafkaTracing.consumer(consumer); // wrap

        while (running) {
          var records = tracedConsumer.poll(1000);
          records.forEach(this::process);
        }
  14. /** Instrumentation for Kafka Clients **/

        void process(ConsumerRecord<K, V> record) {
          // extract span from record headers
          Span span = kafkaTracing.nextSpan(record)
              .name("process")
              .start();
          try (var ws = tracer.withSpanInScope(span)) {
            doProcess(record);
          } catch (RuntimeException | Error e) {
            span.error(e);
            throw e;
          } finally {
            span.finish();
          }
        }
  15. /** Instrumentation for Kafka Streams **/

        var b = new StreamsBuilder();
        b.stream("input-topic")
            .map(this::parseRecord)
            .join(table, this::tableJoiner)
            .transformValues(this::transform)
            .to("output-topic");

        KafkaStreams kafkaStreams = new KafkaStreams(b.build(), config);
        kafkaStreams.start();
  16. /** Instrumentation for Kafka Streams **/

        var b = new StreamsBuilder();
        b.stream("input-topic")
            .map(this::parseRecord)
            .join(table, this::tableJoiner)
            .transformValues(this::transform)
            .to("output-topic");

        KafkaStreams kafkaStreams =
            ksTracing.kafkaStreams(b.build(), config); // wrap
        kafkaStreams.start();
  17. /** Instrumentation for Kafka Streams **/

        var b = new StreamsBuilder();
        b.stream("input-topic")
            .transform(ksTracing.map("parse", this::parseRecord))
            .join(table, this::tableJoiner)
            .transformValues(ksTracing.transformValues("transform", this::transform))
            .to("output-topic");
  18. /** Transports for Zipkin **/

        var sender = URLConnectionSender.create(
            "http://localhost:9411/api/v2/spans");
        var reporter = AsyncReporter.create(sender);
  19. /** Transports for Zipkin **/

        var sender = KafkaSender.newBuilder()
            .bootstrapServers("localhost:9092")
            .build();
        var reporter = AsyncReporter.create(sender);
  20. /** Transports for Zipkin **/

        var sender = BringYourOwnSender.newBuilder()
            .build();
        var reporter = AsyncReporter.create(sender);
  21. Streaming, Messaging

    Diagram labels: Kafka Clients, REST Proxy, KSQL, Kafka Source Connector, Kafka Streams, Kafka Sink Connector
  22. Distributed Tracing IS A Stream Processing Problem

    Diagram labels: Span Consumer, Trace Aggregation, Span Store, spans-collected, traces-completed, Dependencies Store, Custom processors
    github.com/jeqo/zipkin-storage-kafka
  23. Haystack: tracing and analysis platform

    Diagram labels: Tracing, Trends and Metrics, Anomaly Detection, Remediation, Alerting, services
  24. Resources

    * Demo 1, source code: github.com/jeqo/tracing-kafka-apps
    * Demo 2, source code: github.com/jeqo/talk-kafka-zipkin
    * Blog post: confluent.io/blog/importance-of-distributed-tracing-for-apache-kafka-based-applications
    * Zipkin: github.com/openzipkin (moving to Apache foundation), gitter.im/openzipkin/zipkin
    * Sites using Zipkin: cwiki.apache.org/confluence/display/ZIPKIN/Sites
    * Haystack: github.com/ExpediaDotCom/haystack, gitter.im/expedia-haystack
    * Zipkin Kafka Backend: github.com/jeqo/zipkin-storage-kafka
    * Kafka Interceptor for Zipkin: github.com/sysco-middleware/kafka-interceptor-zipkin
    * Martin Kleppmann et al. 2019. Online Event Processing. https://dl.acm.org/citation.cfm?id=3321612
    * John Ousterhout. A Philosophy of Software Design. www.amazon.com/Philosophy-Software-Design-John-Ousterhout/dp/1732102201
    * Jonathan Kaldor et al. 2017. Canopy: An End-to-End Performance Tracing And Analysis System. SOSP ’17. doi.org/10.1145/3132747.3132749
    * Peter Alvaro et al. 2016. Automating Failure Testing Research at Internet Scale. SoCC ’16. dx.doi.org/10.1145/2987550.2987555
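
Slide 5 asks whether the producer's `send` is synchronous or asynchronous. In the Kafka Java client, `send` returns a `Future`, so the caller only waits for the broker acknowledgement if it calls `get()` on the result. The sketch below models the two call styles with the JDK alone; `SendModes`, its `send` method and the simulated delay are hypothetical stand-ins, not the Kafka API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class SendModes {
    /**
     * Hypothetical stand-in for a producer send: completes on another thread
     * after a simulated broker round trip and yields a pretend record offset.
     */
    public static CompletableFuture<Long> send(String value, long brokerDelayMs) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(brokerDelayMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return (long) value.length(); // assumption: the offset is just the value length
        });
    }

    public static void main(String[] args) throws Exception {
        // Asynchronous style: register a callback and carry on immediately.
        CompletableFuture<Void> callbackDone =
            send("hello", 50).thenAccept(offset -> System.out.println("acked at offset " + offset));
        System.out.println("async send returned without waiting");

        // Blocking style: wait for the acknowledgement before continuing.
        long offset = send("world!", 50).get();
        System.out.println("blocked until offset " + offset);

        callbackDone.join(); // let the callback finish before the JVM exits
    }
}
```

In a trace, the difference is where the caller's time goes: the blocking style keeps the calling thread waiting for the acknowledgement, while the asynchronous style hands that wait to a callback.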
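
Slide 14 extracts the span from the record headers. What travels in those headers is a trace context; one encoding is Zipkin's B3 single-header format, `{traceId}-{spanId}-{samplingState}`, with an optional parent span ID as a fourth field. Here is a minimal sketch of the inject and extract steps, assuming a plain `Map` in place of Kafka's `Headers`. `B3HeaderSketch` and `TraceContext` are hypothetical names, and an instrumentation library such as Brave handles this step for you.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class B3HeaderSketch {
    /** Hypothetical minimal trace context; real tracers carry more fields. */
    public static final class TraceContext {
        public final String traceId;
        public final String spanId;
        public final boolean sampled;

        public TraceContext(String traceId, String spanId, boolean sampled) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.sampled = sampled;
        }
    }

    /** Producer side: encode the context as a "b3" header on the outgoing record. */
    public static void inject(TraceContext ctx, Map<String, byte[]> headers) {
        String b3 = ctx.traceId + "-" + ctx.spanId + "-" + (ctx.sampled ? "1" : "0");
        headers.put("b3", b3.getBytes(StandardCharsets.UTF_8));
    }

    /** Consumer side: decode the header, or return null when the record carries none. */
    public static TraceContext extract(Map<String, byte[]> headers) {
        byte[] raw = headers.get("b3");
        if (raw == null) {
            return null;
        }
        String[] parts = new String(raw, StandardCharsets.UTF_8).split("-");
        if (parts.length < 2) {
            return null; // not a full trace context (for example a bare sampling flag)
        }
        // "1" means sampled and "d" means debug; a missing flag is treated as not sampled here.
        boolean sampled = parts.length > 2 && (parts[2].equals("1") || parts[2].equals("d"));
        return new TraceContext(parts[0], parts[1], sampled);
    }

    public static void main(String[] args) {
        Map<String, byte[]> headers = new HashMap<>();
        inject(new TraceContext("463ac35c9f6413ad", "a2fb4a1d1a96d312", true), headers);

        TraceContext received = extract(headers);
        System.out.println(received.traceId + " / " + received.spanId + " / " + received.sampled);
    }
}
```

Because the context rides along with the record, a consumer that reads it can attach its own span to the producer's trace.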
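
Slide 22 frames trace aggregation as a stream processing job: spans arrive one at a time and are grouped into traces by trace ID. Below is a minimal batch sketch of that grouping step using only the JDK. `Span` and `aggregate` are hypothetical names, and a real pipeline would run continuously over a topic and decide when a trace is complete.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TraceAggregationSketch {
    /** Hypothetical minimal span: the trace it belongs to and an operation name. */
    public static final class Span {
        public final String traceId;
        public final String name;

        public Span(String traceId, String name) {
            this.traceId = traceId;
            this.name = name;
        }
    }

    /** Groups span names by trace ID, keeping traces and spans in arrival order. */
    public static Map<String, List<String>> aggregate(List<Span> spans) {
        Map<String, List<String>> traces = new LinkedHashMap<>();
        for (Span span : spans) {
            traces.computeIfAbsent(span.traceId, id -> new ArrayList<>()).add(span.name);
        }
        return traces;
    }

    public static void main(String[] args) {
        List<Span> collected = Arrays.asList(
            new Span("t1", "send"),
            new Span("t2", "poll"),
            new Span("t1", "process"));

        // Spans from interleaved traces end up grouped under their own trace ID.
        System.out.println(aggregate(collected));
    }
}
```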