
Saint Louis Apache Kafka Meetup Jul 2020



Loïc DIVAD

July 28, 2020

Transcript

  1. 1 @loicmdivad @PubSapientEng Streaming Apps and Poison Pills:

    handle the unexpected with Kafka Streams
  2. 2 @loicmdivad @PubSapientEng Loïc DIVAD Software Engineer @loicmdivad

  3. 3 @loicmdivad @PubSapientEng Streaming Apps and Poison Pills

  4. 4 @loicmdivad @PubSapientEng Incoming records may be

    corrupted, or may not be handled by the serializer/deserializer. These records are referred to as “poison pills”. 1. Log and Crash 2. Skip the Corrupted 3. Sentinel Value Pattern 4. Dead Letter Queue Pattern
  5. 5 @loicmdivad @PubSapientEng Ratatouille app, a delicious use case Streaming

    APP
  6. 6 @loicmdivad @PubSapientEng Ratatouille app, a delicious use case Streaming

    APP
  7. 7 @loicmdivad @PubSapientEng Streaming App Poison Pills

    1. Log and Crash - Breakfast 2. Skip the Corrupted - Lunch 3. Sentinel Value Pattern - Drink 4. Dead Letter Queue Pattern - Dinner
  8. 8 @loicmdivad @PubSapientEng Apache Kafka Brokers / Clients

  9. 9 @loicmdivad @PubSapientEng

  10. 10 @loicmdivad @PubSapientEng Really old systems receive raw bytes directly

    from message queues 10100110111010101 Exercise #1 - breakfast
  11. 11 @loicmdivad @PubSapientEng Really old systems receive raw bytes directly

    from message queues With Kafka (Connect and Streams) we’d like to continuously transform these messages 10100110111010101 Kafka Connect Kafka Brokers Exercise #1 - breakfast
  12. 12 @loicmdivad @PubSapientEng Really old systems receive raw bytes directly

    from message queues With Kafka (Connect and Streams) we’d like to continuously transform these messages But we need a deserializer with a special decoder to understand each event What happens if we get a buggy implementation of the deserializer? 10100110111010101 Kafka Connect Kafka Brokers Kafka Streams Exercise #1 - breakfast
  13. 13 @loicmdivad @PubSapientEng The Tooling Team They will provide an

    appropriate deserializer
  14. 14 @loicmdivad @PubSapientEng

  15. 15 @loicmdivad @PubSapientEng

  16. 16 @loicmdivad @PubSapientEng

  17. 17 @loicmdivad @PubSapientEng Take Away

  18. 18 @loicmdivad @PubSapientEng

  19. 19 @loicmdivad @PubSapientEng Log and Crash

  20. 20 @loicmdivad @PubSapientEng Log and Crash

  21. 21 @loicmdivad @PubSapientEng

  22. 22 @loicmdivad @PubSapientEng

  23. 23 @loicmdivad @PubSapientEng

  24. 24 @loicmdivad @PubSapientEng Don’t:

    ▼ Change consumer group
    ▼ Manually update my offsets
    ▼ Reset my streaming app and set my auto reset to …
    ▼ Destroy the topic, no message = no poison pill
    ▼ My favourite <3
    Do:
    ▼ File an issue and suggest a fix to the tooling team
  25. 25 @loicmdivad @PubSapientEng

  26. 26 @loicmdivad @PubSapientEng Log and Crash Like

    all consumers, Kafka Streams applications deserialize messages from the broker. The deserialization process can fail, raising an exception that cannot be caught by our code. By default, buggy deserializers have to be fixed before the application can restart.
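This failure mode can be sketched in plain Python (this is not the Kafka API; `deserialize` and `poll_loop` are hypothetical stand-ins): the deserializer runs before any user code sees the record, so the processing logic has no chance to catch the exception.

```python
# Minimal sketch of "log and crash": deserialization happens inside the
# poll loop, before user processing code, so a poison pill aborts the app.

def deserialize(raw: bytes) -> str:
    # A strict decoder: anything that is not valid UTF-8 raises.
    return raw.decode("utf-8")

def poll_loop(records):
    processed = []
    for raw in records:
        value = deserialize(raw)      # raises here, before processing
        processed.append(value.upper())
    return processed

try:
    poll_loop([b"croissant", b"\xff\xfe", b"coffee"])  # middle record is a poison pill
except UnicodeDecodeError as err:
    print(f"Fatal: poison pill stopped the app: {err}")
```

Note that `b"coffee"` is never processed: the exception propagates out of the loop, which is exactly why the application crashes instead of skipping.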
  27. 27 @loicmdivad @PubSapientEng

  28. 28 @loicmdivad @PubSapientEng

  29. 29 @loicmdivad @PubSapientEng • starter • main • dessert

  30. 30 @loicmdivad @PubSapientEng

  31. 31 @loicmdivad @PubSapientEng Skip the Corrupted

  32. 32 @loicmdivad @PubSapientEng

  33. 33 @loicmdivad @PubSapientEng

  34. 34 @loicmdivad @PubSapientEng Take Away

  35. 35 @loicmdivad @PubSapientEng

  36. 36 @loicmdivad @PubSapientEng The Exception Handler in

    the call stack Powered by the Flow intelliJ plugin ➞ findtheflow.io
  37. 37 @loicmdivad @PubSapientEng Powered by the Flow

    intelliJ plugin ➞ findtheflow.io The Exception Handler in the call stack
  38. 38 @loicmdivad @PubSapientEng Powered by the Flow

    intelliJ plugin ➞ findtheflow.io The Exception Handler in the call stack
  39. 39 @loicmdivad @PubSapientEng Powered by the Flow

    intelliJ plugin ➞ findtheflow.io The Exception Handler in the call stack
  40. 40 @loicmdivad @PubSapientEng Skip the Corrupted All

    exceptions thrown by deserializers are caught by a DeserializationExceptionHandler. A handler returns Fail or Continue. You can implement your own Handler. But the two handlers provided by the library are really basic… let’s explore other methods
  41. 41 @loicmdivad @PubSapientEng All exceptions thrown by

    deserializers are caught by a DeserializationExceptionHandler. A handler returns Fail or Continue. You can implement your own Handler. But the two handlers provided by the library are really basic… let’s explore other methods Skip the Corrupted Take Away
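The handler contract can be sketched in plain Python (the function names here are hypothetical; in Kafka Streams the built-in equivalents are `LogAndFailExceptionHandler` and `LogAndContinueExceptionHandler`):

```python
# Sketch of the deserialization-exception-handler contract: the handler
# inspects the failure and returns CONTINUE (skip the record) or FAIL (stop).
CONTINUE, FAIL = "CONTINUE", "FAIL"

def log_and_continue(raw, exc):
    print(f"skipping corrupted record {raw!r}: {exc}")
    return CONTINUE

def log_and_fail(raw, exc):
    print(f"fatal corrupted record {raw!r}: {exc}")
    return FAIL

def poll_loop(records, handler):
    out = []
    for raw in records:
        try:
            out.append(raw.decode("utf-8"))
        except UnicodeDecodeError as exc:
            if handler(raw, exc) == FAIL:
                raise                  # crash the app, as the default does
    return out
```

With `log_and_continue` the corrupted record is silently dropped and processing goes on; with `log_and_fail` the application stops on the first poison pill.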
  42. 42 @loicmdivad @PubSapientEng

  43. 43 @loicmdivad @PubSapientEng

  44. 44 @loicmdivad @PubSapientEng • wine • rhum • beer •

    champagne • ...
  45. 45 @loicmdivad @PubSapientEng We need to turn the deserialization process

    into a pure transformation that cannot crash. To do so, we will replace corrupted messages with a sentinel value: a special-purpose record (e.g. null, None, Json.Null, etc.) Sentinel Value Pattern → G H
  46. 46 @loicmdivad @PubSapientEng We need to turn the deserialization process

    into a pure transformation that cannot crash. To do so, we will replace corrupted messages with a sentinel value: a special-purpose record (e.g. null, None, Json.Null, etc.) This allows downstream processors to recognize and handle such sentinel values Sentinel Value Pattern → G H G H
  47. 47 @loicmdivad @PubSapientEng We need to turn the deserialization process

    into a pure transformation that cannot crash. To do so, we will replace corrupted messages with a sentinel value: a special-purpose record (e.g. null, None, Json.Null, etc.) This allows downstream processors to recognize and handle such sentinel values With Kafka Streams this can be achieved by implementing a Deserializer Sentinel Value Pattern → G H G H null
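The pattern can be sketched in plain Python (JSON payloads and `None` as the sentinel are assumptions for illustration; a real Kafka Streams implementation would wrap a `Deserializer`):

```python
import json
from typing import Optional

# Sentinel Value Pattern sketch: wrap a failing deserializer so it becomes
# a total function that returns None instead of throwing.
def safe_deserialize(raw: bytes) -> Optional[dict]:
    try:
        return json.loads(raw)
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None  # the sentinel value

records = [b'{"dish": "ratatouille"}', b"not json", b'{"dish": "soup"}']
values = [safe_deserialize(r) for r in records]

# Downstream processors can now recognize and handle the sentinel values,
# e.g. filter them out and count them for a monitoring metric.
valid = [v for v in values if v is not None]
errors = values.count(None)
```

Deserialization is now a pure transformation that cannot crash the topology, at the cost of losing the error details, which is exactly the trade-off the next pattern addresses.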
  48. 48 @loicmdivad @PubSapientEng

  49. 49 @loicmdivad @PubSapientEng

  50. 50 @loicmdivad @PubSapientEng

  51. 51 @loicmdivad @PubSapientEng

  52. 52 @loicmdivad @PubSapientEng

  53. 53 @loicmdivad @PubSapientEng

  54. 54 @loicmdivad @PubSapientEng Sentinel Value Pattern By

    implementing a custom serde we can create a safe Serde. Downstreams now receive a sentinel value indicating a deserialization error. Errors can then be treated correctly, for example by monitoring the number of deserialization errors with a custom metric. But we lose a lot of information about the error… let’s see a last method
  55. 55 @loicmdivad @PubSapientEng Sentinel Value Pattern By

    implementing a custom serde we can create a safe Serde. Downstreams now receive a sentinel value indicating a deserialization error. Errors can then be treated correctly, for example by monitoring the number of deserialization errors with a custom metric. But we lose a lot of information about the error… let’s see a last method Take Away
  56. 56 @loicmdivad @PubSapientEng

  57. 57 @loicmdivad @PubSapientEng

  58. 58 @loicmdivad @PubSapientEng Dead Letter Queue Pattern In this method

    we will let the deserializer fail. For each failure we will send a message to a topic containing corrupted messages. Each message will have the original content of the input message (for reprocessing) and additional metadata about the failure. With Kafka Streams this can be achieved by implementing a DeserializationExceptionHandler Streaming APP dead letter queue input topic output topic
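A minimal sketch of the pattern in plain Python (the in-memory `dead_letter_queue` list and the metadata field names are hypothetical stand-ins for a real quarantine topic and record headers):

```python
import json

# Dead Letter Queue Pattern sketch: on failure, forward the original bytes
# plus failure metadata to a "dead letter" collection, so the record can be
# analysed and reprocessed later instead of being lost.
dead_letter_queue = []

def process(records, topic="input-topic"):
    output = []
    for offset, raw in enumerate(records):
        try:
            output.append(json.loads(raw))
        except json.JSONDecodeError as exc:
            dead_letter_queue.append({
                "payload": raw,             # original content, for reprocessing
                "topic": topic,
                "offset": offset,
                "error": type(exc).__name__,
                "message": str(exc),
            })
    return output
```

Good records flow to the output unchanged; corrupted ones are quarantined with enough context to diagnose the failure, unlike the sentinel value which only signals that *something* went wrong.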
  59. 59 @loicmdivad @PubSapientEng

  60. 60 @loicmdivad @PubSapientEng

  61. 61 @loicmdivad @PubSapientEng

  62. 62 @loicmdivad @PubSapientEng

  63. 63 @loicmdivad @PubSapientEng Fill the headers with some metadata

    01061696e0016536f6d6500000005736f6d65206f Value message in hex
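The header-filling step can be sketched like this (the `dlq.`-prefixed header names are made up for illustration; only the value-to-hex conversion mirrors the slide):

```python
# Sketch: build dead-letter record headers carrying failure metadata,
# keeping the original value as a hex string for later inspection.
def dlq_headers(raw: bytes, topic: str, partition: int,
                offset: int, exc: Exception) -> dict:
    return {
        "dlq.topic": topic,
        "dlq.partition": str(partition),
        "dlq.offset": str(offset),
        "dlq.exception": type(exc).__name__,
        "dlq.payload.hex": raw.hex(),   # value message in hex
    }
```

For example, the payload `b"Some"` becomes the hex string `536f6d65`, which is the kind of fragment visible in the dump above.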
  64. 64 @loicmdivad @PubSapientEng

  65. 65 @loicmdivad @PubSapientEng Take Away

  66. 66 @loicmdivad @PubSapientEng

  67. 67 @loicmdivad @PubSapientEng

  68. 68 @loicmdivad @PubSapientEng 414554 = AET = Australia/Sydney

  69. 69 @loicmdivad @PubSapientEng Dead Letter Queue Pattern

    You can provide your own implementation of DeserializationExceptionHandler. This lets you use the Producer API to write a corrupted record directly to a quarantine topic. Then you can manually analyse your corrupted records. ⚠ Warning: this approach has side effects that are invisible to the Kafka Streams runtime.
  70. 70 @loicmdivad @PubSapientEng Dead Letter Queue Pattern

    You can provide your own implementation of DeserializationExceptionHandler. This lets you use the Producer API to write a corrupted record directly to a quarantine topic. Then you can manually analyse your corrupted records. ⚠ Warning: this approach has side effects that are invisible to the Kafka Streams runtime. Take Away
  71. 71 @loicmdivad @PubSapientEng

  72. 72 @loicmdivad @PubSapientEng CONFLUENT FAQ Links XKE-RATATOUILLE

  73. 73 @loicmdivad @PubSapientEng Related Post Kafka Connect

    Deep Dive – Error Handling and Dead Letter Queues - by Robin Moffatt Building Reliable Reprocessing and Dead Letter Queues with Apache Kafka - by Ning Xia Handling bad messages using Kafka's Streams API - answer by Matthias J. Sax
  74. 74 @loicmdivad @PubSapientEng Conclusion When using Kafka,

    deserialization is the responsibility of the clients. These internal errors are not easy to catch. When it’s possible, use Avro + Schema Registry. When it’s not possible, Kafka Streams applies techniques to deal with serde errors: - DLQ: by extending a DeserializationExceptionHandler - Sentinel Value: by extending a Deserializer
  75. 75 @loicmdivad @PubSapientEng

  76. 76 @loicmdivad @PubSapientEng Images Photo by rawpixel

    on Unsplash Photo by João Marcelo Martins on Unsplash Photo by Jordane Mathieu on Unsplash Photo by Brooke Lark on Unsplash Photo by Jakub Kapusnak on Unsplash Photo by Melissa Walker Horn on Unsplash Photo by Aneta Pawlik on Unsplash
  77. 77 @loicmdivad @PubSapientEng With special thanks to

    Robin M. Sylvain L. Giulia B.
  78. 78 @loicmdivad @PubSapientEng How does the generator work?

  79. 79 @loicmdivad @PubSapientEng Pure HTML Akka Http Server Akka Actor

    System Kafka Topic Exercise1 Exercise2 Me, clicking everywhere Akka Stream Kafka