Stockholm Apache Kafka Meetup Jan 2021

Loïc DIVAD

January 26, 2021

Transcript

  1. @loicmdivad SAK Meetup

    Kafka Tutorials: Window Final Result, Stockholm Apache Kafka Meetup
  2. Photo by Kelly Sikkema on Unsplash

  3. Loïc DIVAD, Software Engineer at Spotify (@loicmdivad)

  4. Loïc DIVAD, Software Engineer at Spotify (@loicmdivad)
  5. @loicmdivad SAK Meetup 5 @loicmdivad SAK Meetup

  6. @loicmdivad SAK Meetup 6

  7. Collect data over time

    Photo by Aron Visuals on Unsplash
  8. 1. The Use Case
    2. The Implementation
    3. What’s next for 2021?

    Photo by Aron Visuals on Unsplash
  9. 1. The Use Case
       a. Context and motivation
       b. Emit a final result from a time window
       c. Kafka Tutorials presentation
    2. The Implementation
       a. Develop
       b. Test
       c. Deploy
    3. What’s next for 2021?
       a. New year resolution at massive scale!

    Photo by Aron Visuals on Unsplash
  10. Chapter I: The Use Case
  11. Window Final Result for real-time alerting

    Let's imagine we run an automated drink order service:
    - Sensors are sending pressure alerts
  12. Window Final Result for real-time alerting

    Let's imagine we run an automated drink order service:
    - Sensors are sending pressure alerts
    - The pressure can be high
    - But these are just warnings
    - However, we don’t want them to be too frequent
    - Frequent alerts could indicate a damaged sensor
  13. Window Final Result for real-time alerting

    Let's imagine we run an automated drink order service:
    - Sensors are sending pressure alerts
    - The pressure can be high
    - But these are just warnings
    - However, we don’t want them to be too frequent
    - Frequent alerts could indicate a damaged sensor
    - Instead of maintaining an ever-updating count, we’d like to emit only the final count for each time window
    - This can be done by aggregating the unbounded stream of pressure alerts
  14. Window Final Result for real-time alerting

    Let's imagine we run an automated drink order service:
    - Sensors are sending pressure alerts
    - The pressure can be high
    - But these are just warnings
    - However, we don’t want them to be too frequent
    - Frequent alerts could indicate a damaged sensor
    - Instead of maintaining an ever-updating count, we’d like to emit only the final count for each time window
    - This can be done by aggregating the unbounded stream of pressure alerts
  15. Data Pipeline

  16. Alerting Tumbling Windows: 101, 102

  17. Alerting Tumbling Windows: 101, 102

  18. Alerting Tumbling Windows: 101, 102 ⚠☢ ⚠☢
  19. What if the sensor were laggy?

    In this exercise we will probably face out-of-order elements and late data arrivals. We have to take this into account! The solution could be to add an extra period during which late events can still join their corresponding aggregate and trigger an event. It’s called a grace period!
  20. Alerting Tumbling Windows (with grace period): 101, 102

  21. Alerting Tumbling Windows (with grace period): 101, 102 ⚠☢
  22. The streaming app solving this has to:

    • Continually consume and process events
    • Aggregate events over time
    • Correctly integrate late events
    • Emit an event at the closing of each window
    • Be easy to test and deploy
  23. @loicmdivad SAK Meetup 23

  24. Kafka Tutorials

    Kafka Tutorials is a collection of common event streaming use cases, with each tutorial featuring an example scenario and several complete code solutions. It’s the fastest way to learn how to use Kafka with confidence. -- Michael Drogalis

    from: https://www.confluent.io/blog/announcing-apache-kafka-tutorials/
  25. WFR applied to real-time alerting

    • Step N°1: find an issue you’d like to tackle
  26. WFR applied to real-time alerting

    • Step N°1: find an issue you’d like to tackle
    • Step N°2: submit a PR referencing the previous issue
  27. WFR applied to real-time alerting

    • Step N°1: find an issue you’d like to tackle
    • Step N°2: submit a PR referencing the previous issue
    • Step N°3: negotiate during the review to get your dirty code accepted!
  28. My first contribution

  29. 1. The Use Case
       a. Context and motivation
       b. Emit a final result from a time window
       c. Kafka Tutorials presentation
    2. The Implementation
       a. Develop
       b. Test
       c. Deploy
    3. What’s next for 2021?
       a. New year resolution at massive scale!

    Photo by Aron Visuals on Unsplash
  30. Chapter II: The Implementation

  31. Develop (Chapter II: The Implementation)

  32. Project Structure

  33. Dev environment and Docker Compose

  34. Application Build

    Plugins: java, application, jib, shadow, avro
    Dependencies:
    - kafka-streams
    - kafka-streams-avro-serde
    - kafka-streams-test-utils
    - junit
    Source / target compatibility: 1.8
  35. The PressureAlert Schema

  36. 1. Simple 2. Expressive 3. Declarative

    The Kafka Streams DSL is built on top of the Streams Processor API. It is recommended for most users… Most data processing operations can be expressed in just a few lines of DSL code.
  37. Event Time vs Processing Time

  38. Diagram: Kafka Message in (value, key, headers), Kafka Message out (value, key, headers), Null Timestamp
  39. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 39 …

  40. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 40

  41. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 41 …

  42. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 42 Kafka Streams Topology

  43. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 43

  44. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 44

  45. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 45

  46. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 46 …

  47. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 47

  48. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 48 Suppress operator

  49. Time window specification

    1. of: return a window definition with the given window size
    2. advanceBy: specify by how much a window moves forward relative to the previous one
    3. grace: reject late events that arrive after the specified delay
  50. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 50

  51. Suppress Operator: Aggregated state, Alert Event

  52. Suppress Operator: Aggregated state, Suppressed state, Alert Event
  53. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 53

  54. @loicmdivad SAK Meetup Streaming Topology print 54

  55. MERCI! (Thank you)

    Photo by Loïc DIVAD on Unsplash
  56. Test (Chapter II: The Implementation)

  57. TopologyTestDriver: Input Nodes, Processing Nodes, Output Nodes

  58. TopologyTestDriver: Input Nodes, Processing Nodes, Output Nodes ✂ ✂ ✂

  59. TopologyTestDriver: Input Nodes, Processing Nodes, Output Nodes
  60. Topology should group over date-time windows

    Kafka Streams Topology: 101/[T1-T2] N1, 101/[T3-T4] N2, 101/[T4-T5] N3 ⚠ ⚠ ⚠
  61. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 61

  62. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 62 … …

  63. Topology should group by Id

    Kafka Streams Topology: 101/[T1-T2] N1, 102/[T1-T2] N2, 103/[T1-T2] N3 ⚠ ⚠ ⚠
  64. @loicmdivad SAK Meetup @loicmdivad @loicmdivad SAK Meetup 64

  65. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 65 … …

  66. PDE should extract Timestamp

    Kafka Streams Topology: 101/[T1-T2] TS1, 101/[T3-T4] TS2, 101/[T4-T5] TS3 ⚠ ⚠ ⚠
  67. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 67

  68. @loicmdivad SAK Meetup @loicmdivad SAK Meetup 68 …

  69. Deploy (Chapter II: The Implementation)

  70. Package as you are

    • Build a jar with all the dependencies and deploy it on a machine that has a JVM and where all the config files are automatically provided
    Or
    • Build a container, deploy the container
  71. Chapter III: What’s next for 2021?
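
The tumbling-window slides (16 to 22) and the window specification and suppress slides (49 to 52) describe counting alerts per sensor in fixed windows, accepting late events during a grace period, and emitting a single result when the window closes. The following is a minimal plain-Java sketch of that idea, not the Kafka Streams API and not the code from the talk. The class name `WindowFinalResultSketch`, the `sensorId@windowStart=count` output format, and the 10 s window / 20 s grace values are all assumptions made for illustration. It also tracks a single global "stream time" (the highest event time seen so far), which is a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical plain-Java sketch: count alerts per tumbling window, emit once per closed window. */
final class WindowFinalResultSketch {
    private final long sizeMs;   // tumbling window size
    private final long graceMs;  // extra time during which late alerts still join their window
    private long streamTime = Long.MIN_VALUE; // highest event time observed so far
    // Open windows: window start -> (sensor id -> count). Sorted so the oldest window closes first.
    private final TreeMap<Long, TreeMap<String, Long>> open = new TreeMap<>();

    WindowFinalResultSketch(long sizeMs, long graceMs) {
        if (sizeMs <= 0 || graceMs < 0) {
            throw new IllegalArgumentException("size must be > 0 and grace >= 0");
        }
        this.sizeMs = sizeMs;
        this.graceMs = graceMs;
    }

    /** Feeds one alert and returns "sensorId@windowStart=count" for every window it closes. */
    List<String> onAlert(String sensorId, long eventTimeMs) {
        streamTime = Math.max(streamTime, eventTimeMs);
        long windowStart = Math.floorDiv(eventTimeMs, sizeMs) * sizeMs;
        // Accept the alert only while its window end plus grace is still ahead of stream time.
        if (windowStart + sizeMs + graceMs > streamTime) {
            open.computeIfAbsent(windowStart, k -> new TreeMap<>()).merge(sensorId, 1L, Long::sum);
        }
        // Emit the final count of every window whose end plus grace has been reached.
        List<String> closed = new ArrayList<>();
        while (!open.isEmpty() && open.firstKey() + sizeMs + graceMs <= streamTime) {
            Map.Entry<Long, TreeMap<String, Long>> window = open.pollFirstEntry();
            window.getValue().forEach((id, n) -> closed.add(id + "@" + window.getKey() + "=" + n));
        }
        return closed;
    }

    public static void main(String[] args) {
        WindowFinalResultSketch counter = new WindowFinalResultSketch(10_000L, 20_000L);
        counter.onAlert("101", 1_000L);
        counter.onAlert("101", 4_000L);
        counter.onAlert("102", 12_000L);
        counter.onAlert("101", 8_000L);                      // out of order, within grace: counted
        System.out.println(counter.onAlert("102", 31_000L)); // prints [101@0=3]
        System.out.println(counter.onAlert("101", 9_000L));  // prints [] (too late, dropped)
    }
}
```

Nothing is emitted while a window is still open, which is the behaviour the suppress slides illustrate: intermediate counts stay internal and only the final count leaves the operator. A longer grace period tolerates laggier sensors at the cost of a later alert.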