Slide 1

Slide 1 text

@jeqo89 at #kafkasummit
Making sense of event-driven dataflows
Distributed tracing for Apache Kafka®-based applications with Zipkin

Slide 11

Slide 11 text

Complexity happens…

Slide 12

Slide 12 text

“Complexity is anything [...] that makes a system hard to understand and modify”
John Ousterhout, “A Philosophy of Software Design”

Slide 13

Slide 13 text

“Complexity is caused by two things: dependencies and obscurity”
John Ousterhout, “A Philosophy of Software Design”

Slide 14

Slide 14 text

twitter.com/rakyll/status/971231712049971200

Slide 15

Slide 15 text

Jorge Esteban Quilcate Otoya
twitter: @jeqo89 | github: jeqo
Peruvian in Oslo, Norway
Integration team at SYSCO AS
Part of Apache Kafka and Zipkin communities

Slide 16

Slide 16 text

Talk: “Making sense of your event-driven dataflows” (40 min, Q&A)
* Why?
* What is distributed tracing?
* How to instrument Kafka apps?
* Demo
* What’s next?
* Demo time

Slide 17

Slide 17 text

Trace, Span

Slide 18

Slide 18 text

“Demystifying” Kafka client configurations
* Kafka producers
* Kafka Streams
* Kafka Consumer
github.com/jeqo/tracing-kafka-apps

Slide 19

Slide 19 text

Trace ID, Trace metrics, Trace timeline, Spans
github.com/jeqo/tracing-kafka-apps

Slide 20

Slide 20 text

github.com/jeqo/tracing-kafka-apps

Slide 21

Slide 21 text

github.com/jeqo/tracing-kafka-apps

Slide 22

Slide 22 text

Is Kafka producer `send` sync or async?
Blocking call
github.com/jeqo/tracing-kafka-apps

Slide 23

Slide 23 text

Is Kafka producer `send` sync or async?
Non-blocking call
github.com/jeqo/tracing-kafka-apps
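
A rough sketch of the two call styles behind the "blocking" and "non-blocking" traces. In the Kafka client, `send` returns a `Future<RecordMetadata>`, so the caller decides whether to wait on it or to pass a callback. The `FakeProducer` below is a hypothetical stand-in built only on the JDK (it is not the Kafka API), and `join()` plays the role of `Future.get()`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BiConsumer;

final class SendModesSketch {
  /** Hypothetical stand-in for a producer: send() returns at once with a future. */
  static final class FakeProducer {
    private final AtomicLong nextOffset = new AtomicLong();

    CompletableFuture<Long> send(String value) {
      // Pretend the broker acknowledges on another thread and assigns an offset.
      return CompletableFuture.supplyAsync(() -> nextOffset.getAndIncrement());
    }
  }

  /** Blocking style: the caller waits for the acknowledgement. */
  static long sendAndWait(FakeProducer producer, String value) {
    return producer.send(value).join();
  }

  /** Non-blocking style: the caller registers a callback and moves on. */
  static CompletableFuture<Long> sendWithCallback(
      FakeProducer producer, String value, BiConsumer<Long, Throwable> onAck) {
    return producer.send(value).whenComplete(onAck);
  }
}
```

In a trace, the first style would likely show the caller's span covering the whole round trip, while the second lets the caller's span end before the acknowledgement arrives.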

Slide 24

Slide 24 text

Batched record
github.com/jeqo/tracing-kafka-apps

Slide 25

Slide 25 text

enable.auto.commit=true
github.com/jeqo/tracing-kafka-apps

Slide 26

Slide 26 text

Commit per record
github.com/jeqo/tracing-kafka-apps

Slide 27

Slide 27 text

services -> Report

Slide 28

Slide 28 text

“The more accurately you try to measure the position of a particle, the less accurately you can measure its speed”
Heisenberg's uncertainty principle

Slide 29

Slide 29 text

Traces reporting: Annotation-based approach
CLIENT [tracer] -- TraceContext=abc --> SERVER [tracer]
tracers --> TRACES

Slide 30

Slide 30 text

PRODUCER [tracer] -- TraceContext=abc --> BROKER -- TraceContext=abc --> CONSUMER [tracer]
tracers --> TRACES
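
One way the trace context can ride along with a record through the broker is in the record headers. The sketch below uses Zipkin's B3 single-header format (`b3: traceId-spanId-samplingState`) and a plain `Map<String, byte[]>` as a stand-in for Kafka record headers. The class and method names are hypothetical, and only the three-part form of the header is handled.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

final class B3HeaderSketch {
  /** Minimal trace context; IDs are lower-hex strings, as in Zipkin. */
  record TraceContext(String traceId, String spanId, boolean sampled) {}

  /** Producer side: write the context into the record's headers. */
  static void inject(TraceContext ctx, Map<String, byte[]> headers) {
    String value = ctx.traceId() + "-" + ctx.spanId() + "-" + (ctx.sampled() ? "1" : "0");
    headers.put("b3", value.getBytes(StandardCharsets.UTF_8));
  }

  /** Consumer side: read the context back, or null if absent or not in the three-part form. */
  static TraceContext extract(Map<String, byte[]> headers) {
    byte[] raw = headers.get("b3");
    if (raw == null) return null;
    String[] parts = new String(raw, StandardCharsets.UTF_8).split("-");
    if (parts.length < 3) return null;
    return new TraceContext(parts[0], parts[1], "1".equals(parts[2]));
  }
}
```

Because the broker stores headers untouched, the consumer can continue the same trace without the broker knowing anything about tracing.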

Slide 31

Slide 31 text

/** Annotation-based approach **/
ScopedSpan span = tracer.startScopedSpan("process");
try {
  // The span is in "scope"
  doProcess();
} catch (RuntimeException | Error e) {
  span.error(e); // mark as error
  throw e;
} finally {
  span.finish(); // always finish
}

Slide 34

Slide 34 text

/** Instrumentation for Kafka Clients **/
Producer producer = new KafkaProducer<>(settings);
Producer tracedProducer = kafkaTracing.producer(producer);
// still sending through the unwrapped producer: this call is not traced
producer.send(
  new ProducerRecord<>("my-topic", key, value));

Slide 35

Slide 35 text

/** Instrumentation for Kafka Clients **/
Producer producer = new KafkaProducer<>(settings);
Producer tracedProducer = kafkaTracing.producer(producer); // wrap
tracedProducer.send(
  new ProducerRecord<>("my-topic", key, value));

Slide 36

Slide 36 text

/** Instrumentation for Kafka Clients **/
Consumer consumer = new KafkaConsumer<>(settings);
Consumer tracedConsumer = kafkaTracing.consumer(consumer);
while (running) {
  // still polling through the unwrapped consumer: this call is not traced
  var records = consumer.poll(Duration.ofMillis(1000));
  records.forEach(this::process);
}

Slide 37

Slide 37 text

/** Instrumentation for Kafka Clients **/
Consumer consumer = new KafkaConsumer<>(settings);
Consumer tracedConsumer = kafkaTracing.consumer(consumer); // wrap
while (running) {
  var records = tracedConsumer.poll(Duration.ofMillis(1000));
  records.forEach(this::process);
}

Slide 38

Slide 38 text

/** Instrumentation for Kafka Clients **/
void process(ConsumerRecord record) {
  // extract span from record headers
  Span span = kafkaTracing.nextSpan(record)
      .name("process")
      .start();
  try (var ws = tracer.withSpanInScope(span)) {
    doProcess(record);
  } catch (RuntimeException | Error e) {
    span.error(e);
    throw e;
  } finally {
    span.finish();
  }
}

Slide 39

Slide 39 text

/** Instrumentation for Kafka Streams **/
var b = new StreamsBuilder();
b.stream("input-topic")
  .map(this::parseRecord)
  .join(table, this::tableJoiner)
  .transformValues(this::transform)
  .to("output-topic");
KafkaStreams kafkaStreams =
  new KafkaStreams(b.build(), config);
kafkaStreams.start();

Slide 40

Slide 40 text

/** Instrumentation for Kafka Streams **/
var b = new StreamsBuilder();
b.stream("input-topic")
  .map(this::parseRecord)
  .join(table, this::tableJoiner)
  .transformValues(this::transform)
  .to("output-topic");
// wrap
KafkaStreams kafkaStreams =
  ksTracing.kafkaStreams(b.build(), config);
kafkaStreams.start();

Slide 41

Slide 41 text

/** Instrumentation for Kafka Streams **/
var b = new StreamsBuilder();
b.stream("input-topic")
  .transform(ksTracing.map("parse", this::parseRecord))
  .join(table, this::tableJoiner)
  .transformValues(ksTracing.transformValues("transform", this::transform))
  .to("output-topic");

Slide 42

Slide 42 text

Traces reporting: Black-box approach
CLIENT [agent] -- TraceContext=abc --> SERVER [agent]
agents --> TRACES

Slide 43

Slide 43 text

Traces reporting: mixed approach
CLIENT [agent, tracer] -- TraceContext=abc --> SERVER [agent, tracer]
agents, tracers --> TRACES

Slide 44

Slide 44 text

services -> Report -> Transport

Slide 45

Slide 45 text

/** Transports for Zipkin **/
var sender = URLConnectionSender.create(
  "http://localhost:9411/api/v2/spans");
var reporter = AsyncReporter.create(sender);

Slide 46

Slide 46 text

/** Transports for Zipkin **/
var sender = KafkaSender.newBuilder()
  .bootstrapServers("localhost:9092")
  .build();
var reporter = AsyncReporter.create(sender);

Slide 47

Slide 47 text

/** Transports for Zipkin **/
var sender = BringYourOwnSender.newBuilder()
  .build();
var reporter = AsyncReporter.create(sender);

Slide 48

Slide 48 text

services -> Report -> Transport -> Storage (BringYourOwnDB)

Slide 49

Slide 49 text

services -> Report -> Transport -> Storage -> Dependencies (batch)
Data-at-rest
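
To make the "Dependencies (batch)" step concrete, here is a minimal sketch of deriving service-to-service links from stored spans: every span whose parent belongs to a different service counts as one call between the two. The `Span` shape and the `links` helper are hypothetical, and the logic is much simpler than what a real dependency job has to handle.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class DependencyLinksSketch {
  /** Simplified span: parentId is null for the root span of a trace. */
  record Span(String traceId, String id, String parentId, String service) {}

  /** Counts calls between services; key is "parent->child", value is the call count. */
  static Map<String, Integer> links(List<Span> spans) {
    // Index spans by trace ID and span ID so parents can be looked up.
    Map<String, Span> byId = new HashMap<>();
    for (Span s : spans) {
      byId.put(s.traceId() + "/" + s.id(), s);
    }
    Map<String, Integer> counts = new TreeMap<>();
    for (Span child : spans) {
      if (child.parentId() == null) continue; // root span, no caller
      Span parent = byId.get(child.traceId() + "/" + child.parentId());
      // Skip orphans and spans that stay inside the same service.
      if (parent == null || parent.service().equals(child.service())) continue;
      counts.merge(parent.service() + "->" + child.service(), 1, Integer::sum);
    }
    return counts;
  }
}
```

Run as a batch over data at rest, this only reflects the spans already stored when the job starts.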

Slide 50

Slide 50 text

Streaming, Messaging
* Kafka Clients
* REST Proxy
* KSQL
* Kafka Source Connector
* Kafka Streams
* Kafka Sink Connector

Slide 51

Slide 51 text

Kafka Interceptors
* REST Proxy
* KSQL
* Kafka Source Connector
* Kafka Sink Connector

Slide 52

Slide 52 text

Producer Interceptor API

Slide 53

Slide 53 text

Consumer Interceptor API
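
Interceptors are plugged in through client configuration rather than code, which is what makes them usable where the client cannot be wrapped by hand. A minimal sketch, assuming the standard `interceptor.classes` client setting. The interceptor class name passed in is a placeholder, not the name of a real class.

```java
import java.util.Properties;

final class InterceptorConfigSketch {
  /** Builds client settings that load the given interceptor class by name. */
  static Properties withInterceptor(String bootstrapServers, String interceptorClass) {
    Properties props = new Properties();
    props.put("bootstrap.servers", bootstrapServers);
    // The client instantiates this class itself; application code stays unchanged.
    props.put("interceptor.classes", interceptorClass);
    return props;
  }
}
```

For example, `withInterceptor("localhost:9092", "com.example.TracingProducerInterceptor")` would yield settings to pass to a producer, with `com.example.TracingProducerInterceptor` standing in for whichever interceptor implementation is on the classpath.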

Slide 54

Slide 54 text

Demo: Tracing Kafka-based applications
github.com/jeqo/talk-kafka-zipkin

Slide 55

Slide 55 text

What’s next?

Slide 56

Slide 56 text

Distributed Tracing IS A Stream Processing Problem
services -> Report -> Transport -> Storage -> Dependencies (batch)
Data-at-rest

Slide 57

Slide 57 text

Distributed Tracing IS A Stream Processing Problem
Span Consumer -> spans-collected -> Trace Aggregation -> traces-completed
traces-completed -> Span Store, Dependencies Store, Custom processors
github.com/jeqo/zipkin-storage-kafka
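
A minimal sketch of what a trace aggregation step between `spans-collected` and `traces-completed` could look like: spans are grouped by trace ID, and a trace is emitted once no new span has arrived for an inactivity gap. This is a plain-JDK illustration with hypothetical names. It is not how zipkin-storage-kafka is implemented, which may rely on stream-processing windowing instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

final class TraceAggregationSketch {
  record Span(String traceId, String id, long timestampMillis) {}

  private final long inactivityGapMillis;
  private final Map<String, List<Span>> open = new HashMap<>();
  private final Map<String, Long> lastSeen = new HashMap<>();

  TraceAggregationSketch(long inactivityGapMillis) {
    this.inactivityGapMillis = inactivityGapMillis;
  }

  /** Called for each span read from the collected-spans stream. */
  void onSpan(Span span) {
    open.computeIfAbsent(span.traceId(), k -> new ArrayList<>()).add(span);
    lastSeen.merge(span.traceId(), span.timestampMillis(), Long::max);
  }

  /** Emits and forgets every trace with no new span for at least the inactivity gap. */
  List<List<Span>> flushCompleted(long nowMillis) {
    List<List<Span>> completed = new ArrayList<>();
    Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Long> entry = it.next();
      if (nowMillis - entry.getValue() >= inactivityGapMillis) {
        completed.add(open.remove(entry.getKey()));
        it.remove();
      }
    }
    return completed;
  }
}
```

The inactivity gap is a trade-off: a short gap emits traces sooner but may split slow traces, while a long gap delays everything downstream.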

Slide 60

Slide 60 text

Jonathan Kaldor et al., “Canopy: An End-to-End Performance Tracing And Analysis System”

Slide 65

Slide 65 text

Peter Alvaro et al., “Automating Failure Testing Research at Internet Scale”

Slide 66

Slide 66 text

Is anyone applying this?

Slide 67

Slide 67 text

Haystack: tracing and analysis platform
* Tracing
* Trends and Metrics
* Anomaly Detection
* Remediation
* Alerting
services

Slide 68

Slide 68 text

Demo: Extending Zipkin with Kafka and Haystack
github.com/jeqo/talk-kafka-zipkin

Slide 69

Slide 69 text

Resources
* Demo 1, source code: github.com/jeqo/tracing-kafka-apps
* Demo 2, source code: github.com/jeqo/talk-kafka-zipkin
* Blog post: confluent.io/blog/importance-of-distributed-tracing-for-apache-kafka-based-applications
* Zipkin: github.com/openzipkin (moving to Apache foundation), gitter.im/openzipkin/zipkin
* Sites using Zipkin: cwiki.apache.org/confluence/display/ZIPKIN/Sites
* Haystack: github.com/ExpediaDotCom/haystack, gitter.im/expedia-haystack
* Zipkin Kafka Backend: github.com/jeqo/zipkin-storage-kafka
* Kafka Interceptor for Zipkin: github.com/sysco-middleware/kafka-interceptor-zipkin
* Martin Kleppmann et al. 2019. Online Event Processing. https://dl.acm.org/citation.cfm?id=3321612
* John Ousterhout. A Philosophy of Software Design. www.amazon.com/Philosophy-Software-Design-John-Ousterhout/dp/1732102201
* Jonathan Kaldor et al. 2017. Canopy: An End-to-End Performance Tracing And Analysis System. SOSP ’17 (2017). doi.org/10.1145/3132747.3132749
* Peter Alvaro et al. 2016. Automating Failure Testing Research at Internet Scale. SoCC ’16. dx.doi.org/10.1145/2987550.2987555

Slide 70

Slide 70 text

fin
github.com/jeqo