
Orange Dev Days 2019

Loïc DIVAD

November 13, 2019

Transcript

  1. 1 @loicmdivad @XebiaFr Streaming Apps and Poison Pills: handle the unexpected with Kafka Streams. 13 Nov. 2019 - Orange Dev Test Day
  2. 2 @loicmdivad @XebiaFr @loicmdivad @XebiaFr Loïc DIVAD Data Engineer @XebiaFr

    @loicmdivad
  3. 3 @loicmdivad @XebiaFr Processor API: The dark side of Kafka

    Streams
  4. 4 @loicmdivad @XebiaFr

  5. 5 @loicmdivad @XebiaFr XebiCon '19 in a few figures: November 28, 60 technical talks, 15 customer case-study sessions, 1,500 attendees, the go-to event for C-levels
  6. 6 @loicmdivad @XebiaFr The Rise of the Stream Data Platform

  7. 7 @loicmdivad @XebiaFr The Rise of the Stream Data Platform

  8. 8 @loicmdivad @XebiaFr The Rise of the Stream Data Platform

  9. 9 @loicmdivad @XebiaFr Apache Kafka: Partitions, Messages and Offsets (diagram: one partition with offsets 0-13, CONSUMER A and CONSUMER B reading from different positions, new writes appended at the end)
  10. 10 @loicmdivad @XebiaFr Incoming records may be corrupted, or may not be handled by the serializer / deserializer. These records are referred to as "poison pills". 1. Log and Crash 2. Skip the Corrupted 3. Sentinel Value Pattern 4. Dead Letter Queue Pattern
  11. 11 @loicmdivad @XebiaFr Apache Kafka: Brokers and Clients. Client responsibilities: batching, serialization, compression (diagram: a Kafka Producer sending messages to a Kafka Broker; topic segments and open segments)
  12. 12 @loicmdivad @XebiaFr Ratatouille app, a delicious use case Streaming

    APP
  13. 13 @loicmdivad @XebiaFr Ratatouille app, a delicious use case Streaming

    APP
  14. 14 @loicmdivad @XebiaFr Streaming App Poison Pills: 1. Log and Crash - Breakfast 2. Skip the Corrupted - Lunch 3. Sentinel Value Pattern - Drink 4. Dead Letter Queue Pattern - Dinner
  15. 15 @loicmdivad @XebiaFr

  16. 16 @loicmdivad @XebiaFr Really old systems receive raw bytes directly

    from message queues 10100110111010101 Exercise #1 - breakfast
  17. 17 @loicmdivad @XebiaFr Really old systems receive raw bytes directly

    from message queues With Kafka (Connect and Streams) we’d like to continuously transform these messages 10100110111010101 Kafka Connect Kafka Brokers Exercise #1 - breakfast
  18. 18 @loicmdivad @XebiaFr Really old systems receive raw bytes directly from message queues. With Kafka (Connect and Streams) we'd like to continuously transform these messages. But we need a deserializer with a special decoder to understand each event. What happens if we get a buggy implementation of the deserializer? (diagram: raw bits → Kafka Connect → Kafka brokers → Kafka Streams) Exercise #1 - breakfast
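The failure mode this exercise sets up can be sketched outside Kafka. The deck's real code uses the Kafka Streams Java API; the Python below, and its one-byte length-prefix frame format, are illustrative assumptions standing in for the tooling team's "special decoder":

```python
def decode_event(raw: bytes) -> str:
    """Toy decoder: expects a one-byte length prefix, then a UTF-8 payload.

    The framing is made up for this sketch; a corrupted record (or a buggy
    decoder) surfaces here as an exception, exactly the situation the four
    patterns in this deck deal with.
    """
    if len(raw) < 1 or raw[0] != len(raw) - 1:
        raise ValueError(f"corrupted record: {raw!r}")
    return raw[1:].decode("utf-8")

print(decode_event(b"\x05hello"))  # a well-formed frame -> "hello"
```

Feed it any byte string that does not match the expected framing and it raises instead of returning, which is the "poison pill" scenario.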
  19. 19 @loicmdivad @XebiaFr The Tooling Team They will provide an

    appropriate deserializer
  20. 20 @loicmdivad @XebiaFr

  21. 21 @loicmdivad @XebiaFr

  22. 22 @loicmdivad @XebiaFr

  23. 23 @loicmdivad @XebiaFr Take Away

  24. 24 @loicmdivad @XebiaFr @loicmdivad @XebiaFr

  25. 25 @loicmdivad @XebiaFr Log and Crash

  26. 26 @loicmdivad @XebiaFr Log and Crash

  27. 27 @loicmdivad @XebiaFr

  28. 28 @loicmdivad @XebiaFr

  29. 29 @loicmdivad @XebiaFr

  30. 30 @loicmdivad @XebiaFr Don't: change the consumer group; manually update my offsets; reset my streaming app and set my auto reset; destroy the topic (no message = no poison pill). Do (my favourite <3): file an issue and suggest a fix to the tooling team
  31. 31 @loicmdivad @XebiaFr @loicmdivad @XebiaFr

  32. 32 @loicmdivad @XebiaFr Log and Crash: Like all consumers, Kafka Streams applications deserialize messages from the broker. The deserialization step can fail, and it raises an exception that cannot be caught by our own code. By default, a buggy deserializer has to be fixed before the application can restart.
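Why the error escapes application code can be shown in a minimal sketch (plain Python, not the Streams API): the decode step runs before any user processing, so a try/except around the processing logic never sees it.

```python
def poll_loop(raw_records, deserialize, process):
    """Log-and-crash behaviour: any deserialization error stops the loop."""
    for raw in raw_records:
        # Deserialization happens *before* user code gets the record, so
        # wrapping `process` in try/except would not catch this exception.
        value = deserialize(raw)
        process(value)

good = lambda b: b.decode("utf-8")
seen = []
try:
    poll_loop([b"pizza", b"\xff\xfe", b"salad"], good, seen.append)
except UnicodeDecodeError as err:
    print(f"crashed after {len(seen)} record(s): {err}")
```

The second record is not valid UTF-8, so the loop dies after the first one; the valid third record is never processed.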
  33. 33 @loicmdivad @XebiaFr

  34. 34 @loicmdivad @XebiaFr Stream Topology Streaming APP

  35. 35 @loicmdivad @XebiaFr

  36. 36 @loicmdivad @XebiaFr • starter • main • dessert

  37. 37 @loicmdivad @XebiaFr @loicmdivad @XebiaFr

  38. 38 @loicmdivad @XebiaFr Skip the Corrupted

  39. 39 @loicmdivad @XebiaFr 39 @loicmdivad @XebiaFr

  40. 40 @loicmdivad @XebiaFr

  41. 41 @loicmdivad @XebiaFr Take Away

  42. 42 @loicmdivad @XebiaFr @loicmdivad @XebiaFr

  43. 43 @loicmdivad @XebiaFr The Exception Handler in the call stack (powered by the Flow IntelliJ plugin ➞ findtheflow.io)
  44. 44 @loicmdivad @XebiaFr The Exception Handler in the call stack (powered by the Flow IntelliJ plugin ➞ findtheflow.io)
  45. 45 @loicmdivad @XebiaFr The Exception Handler in the call stack (powered by the Flow IntelliJ plugin ➞ findtheflow.io)
  46. 46 @loicmdivad @XebiaFr The Exception Handler in the call stack (powered by the Flow IntelliJ plugin ➞ findtheflow.io)
  47. 47 @loicmdivad @XebiaFr Skip the Corrupted: All exceptions thrown by deserializers are caught by a DeserializationExceptionHandler. A handler returns Fail or Continue. You can implement your own handler. But the two handlers provided by the library are really basic… let's explore other methods.
  48. 48 @loicmdivad @XebiaFr Skip the Corrupted - Take Away: All exceptions thrown by deserializers are caught by a DeserializationExceptionHandler. A handler returns Fail or Continue. You can implement your own handler. But the two handlers provided by the library are really basic… let's explore other methods.
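Kafka Streams ships two built-in handlers, LogAndFailExceptionHandler (the default) and LogAndContinueExceptionHandler, selected via the `default.deserialization.exception.handler` config. The Fail/Continue contract can be re-created in a few lines of Python (an illustrative sketch, not the real Java interface):

```python
from enum import Enum

class Response(Enum):
    FAIL = 1
    CONTINUE = 2

def log_and_continue(raw, exc):
    """Mimics LogAndContinueExceptionHandler: log the record, then skip it."""
    print(f"skipping corrupted record {raw!r}: {exc}")
    return Response.CONTINUE

def poll_loop(raw_records, deserialize, process, handler):
    for raw in raw_records:
        try:
            value = deserialize(raw)
        except Exception as exc:
            if handler(raw, exc) is Response.FAIL:
                raise      # the default: fail the whole application
            continue       # CONTINUE: drop the poison pill and move on
        process(value)

seen = []
poll_loop([b"pizza", b"\xff\xfe", b"salad"],
          lambda b: b.decode("utf-8"), seen.append, log_and_continue)
print(seen)  # both valid records survive, the poison pill is skipped
```

Swapping in a handler that returns `Response.FAIL` reproduces the log-and-crash behaviour of the previous section.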
  49. 49 @loicmdivad @XebiaFr

  50. 50 @loicmdivad @XebiaFr Stream Topology Streaming APP

  51. 51 @loicmdivad @XebiaFr

  52. 52 @loicmdivad @XebiaFr • wine • rhum • beer •

    champagne • ...
  53. 53 @loicmdivad @XebiaFr Sentinel Value Pattern: We need to turn the deserialization process into a pure transformation that cannot crash. To do so, we will replace corrupted messages with a sentinel value: a special-purpose record (e.g. null, None, Json.Null).
  54. 54 @loicmdivad @XebiaFr Sentinel Value Pattern: We need to turn the deserialization process into a pure transformation that cannot crash. To do so, we will replace corrupted messages with a sentinel value: a special-purpose record (e.g. null, None, Json.Null). This allows downstream processors to recognize and handle such sentinel values.
  55. 55 @loicmdivad @XebiaFr Sentinel Value Pattern: We need to turn the deserialization process into a pure transformation that cannot crash. To do so, we will replace corrupted messages with a sentinel value: a special-purpose record (e.g. null, None, Json.Null). This allows downstream processors to recognize and handle such sentinel values. With Kafka Streams this can be achieved by implementing a Deserializer that returns null on failure.
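The wrapping itself fits in a few lines. A sketch in Python (the deck's actual implementation is a Kafka Streams Deserializer; `safe` and its sentinel default are names invented for this example):

```python
def safe(deserialize, sentinel=None):
    """Turn a throwing deserializer into a total function: never raises,
    returns `sentinel` for any record it cannot decode."""
    def wrapper(raw: bytes):
        try:
            return deserialize(raw)
        except Exception:
            return sentinel
    return wrapper

safe_utf8 = safe(lambda b: b.decode("utf-8"))
values = [safe_utf8(b) for b in [b"wine", b"\xff\xfe", b"beer"]]
print(values)  # the corrupted record became the sentinel, None

# Downstream processors recognize and filter (or count) the sentinels:
drinks = [v for v in values if v is not None]
```

The price, as the take-away below notes, is that the sentinel carries no information about what went wrong.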
  56. 56 @loicmdivad @XebiaFr

  57. 57 @loicmdivad @XebiaFr

  58. 58 @loicmdivad @XebiaFr @loicmdivad @XebiaFr

  59. 59 @loicmdivad @XebiaFr

  60. 60 @loicmdivad @XebiaFr @loicmdivad @XebiaFr

  61. 61 @loicmdivad @XebiaFr

  62. 62 @loicmdivad @XebiaFr Sentinel Value Pattern: By implementing a custom serde we can create a safe Deserializer. Downstream processors now receive a sentinel value indicating a deserialization error. Errors can then be handled properly, for example by monitoring the number of deserialization errors with a custom metric. But we lose a lot of information about the error… let's see one last method.
  63. 63 @loicmdivad @XebiaFr Sentinel Value Pattern - Take Away: By implementing a custom serde we can create a safe Deserializer. Downstream processors now receive a sentinel value indicating a deserialization error. Errors can then be handled properly, for example by monitoring the number of deserialization errors with a custom metric. But we lose a lot of information about the error… let's see one last method.
  64. 64 @loicmdivad @XebiaFr

  65. 65 @loicmdivad @XebiaFr Stream Topology Streaming APP

  66. 66 @loicmdivad @XebiaFr

  67. 67 @loicmdivad @XebiaFr Dead Letter Queue Pattern: In this method we will let the deserializer fail. For each failure we will send a message to a topic of corrupted messages, the dead letter queue. Each message will carry the original content of the input message (for reprocessing) and additional metadata about the failure. With Kafka Streams this can be achieved by implementing a DeserializationExceptionHandler. (diagram: Streaming APP reading an input topic, writing an output topic and a dead letter queue)
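The shape of such a dead-letter record can be sketched without Kafka: original payload kept verbatim for reprocessing, failure details carried in headers. The header names below are hypothetical, not a Kafka convention:

```python
import json
import time

def to_dead_letter(raw_value: bytes, error: Exception, source_topic: str) -> dict:
    """Build the record destined for the dead letter queue."""
    return {
        "value": raw_value,  # untouched input bytes, so it can be replayed
        "headers": {         # hypothetical header names for failure metadata
            "dlq.error.class": type(error).__name__,
            "dlq.error.message": str(error),
            "dlq.source.topic": source_topic,
            "dlq.timestamp.ms": str(int(time.time() * 1000)),
        },
    }

record = to_dead_letter(b"\xff\xfe", ValueError("bad frame"), "input-topic")
print(json.dumps(record["headers"], indent=2))
```

In the real application this dict would become a ProducerRecord with headers, written to the quarantine topic by the handler.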
  68. 68 @loicmdivad @XebiaFr

  69. 69 @loicmdivad @XebiaFr

  70. 70 @loicmdivad @XebiaFr

  71. 71 @loicmdivad @XebiaFr

  72. 72 @loicmdivad @XebiaFr Fill the headers with some metadata, and store the message value converted to hex: 01061696e0016536f6d6500000005736f6d65206f
  73. 73 @loicmdivad @XebiaFr

  74. 74 @loicmdivad @XebiaFr Take Away

  75. 75 @loicmdivad @XebiaFr @loicmdivad @XebiaFr

  76. 76 @loicmdivad @XebiaFr

  77. 77 @loicmdivad @XebiaFr 414554 = AET = Australia/Sydney
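The slide's lookup can be reproduced in two lines: the hex-encoded value 414554 is the ASCII string AET, a legacy Java timezone alias for Australia/Sydney (that final mapping comes from Java's timezone data, so it is only stated here, not computed):

```python
payload = bytes.fromhex("414554")  # the hex string pulled from the record
tz_id = payload.decode("ascii")
print(tz_id)  # AET
```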

  78. 78 @loicmdivad @XebiaFr Stream Topology Streaming APP

  79. 79 @loicmdivad @XebiaFr Dead Letter Queue Pattern: You can provide your own implementation of DeserializationExceptionHandler. This lets you use the Producer API to write a corrupted record directly to a quarantine topic. Then you can manually analyse your corrupted records. ⚠ Warning: this approach has side effects that are invisible to the Kafka Streams runtime.
  80. 80 @loicmdivad @XebiaFr Dead Letter Queue Pattern - Take Away: You can provide your own implementation of DeserializationExceptionHandler. This lets you use the Producer API to write a corrupted record directly to a quarantine topic. Then you can manually analyse your corrupted records. ⚠ Warning: this approach has side effects that are invisible to the Kafka Streams runtime.
  81. 81 @loicmdivad @XebiaFr

  82. 82 @loicmdivad @XebiaFr 82 @loicmdivad @XebiaFr Links XKE-RATATOUILLE CONFLUENT FAQ

  83. 83 @loicmdivad @XebiaFr Related Posts: Kafka Connect Deep Dive – Error Handling and Dead Letter Queues, by Robin Moffatt; Building Reliable Reprocessing and Dead Letter Queues with Apache Kafka, by Ning Xia; Handling bad messages using Kafka's Streams API, answer by Matthias J. Sax
  84. 84 @loicmdivad @XebiaFr Conclusion: When using Kafka, deserialization is the responsibility of the clients, and these internal errors are not easy to catch. When it's possible, use Avro + Schema Registry. When it's not, Kafka Streams offers techniques to deal with serde errors: DLQ, by extending a DeserializationExceptionHandler; Sentinel Value, by extending a Deserializer.
  85. 85 @loicmdivad @XebiaFr @loicmdivad @XebiaFr

  86. 86 @loicmdivad @XebiaFr 86 @loicmdivad @XebiaFr Images Photo by Michele

    Blackwell on Unsplash Photo by rawpixel on Unsplash Photo by João Marcelo Martins on Unsplash Photo by Jordane Mathieu on Unsplash Photo by Brooke Lark on Unsplash Photo by Jakub Kapusnak on Unsplash Photo by Melissa Walker Horn on Unsplash Photo by Aneta Pawlik on Unsplash
  87. 87 @loicmdivad @XebiaFr 87 @loicmdivad @XebiaFr With special thanks to

    Robin M. Sylvain L. Giulia B.
  88. 88 @loicmdivad @XebiaFr How does the generator work?

  89. 89 @loicmdivad @XebiaFr (architecture diagram: me, clicking everywhere on a pure HTML page → Akka HTTP server → Akka actor system → Akka Streams Kafka → Kafka topics Exercise1 and Exercise2)