be handled by the serializer / deserializer. These records are referred to as “poison pills”.
1. Consume / Produce Request
2. Log and Crash
3. Skip the Corrupted
4. Sentinel Value Pattern
5. Dead Letter Queue Pattern
raw bytes directly from message queues.
With Kafka (Connect and Streams), we’d like to continuously transform these messages.
But we need a deserializer with a special decoder to understand each event.
What happens if we get a buggy implementation of the deserializer?
[Diagram: 10100110111010101, Kafka Connect, Kafka Brokers, Kafka Streams]
- Exception caught during Deserialization, taskId: 0_0, topic: exercise-breakfast, partition: 0, offset: 109
Exception in thread "answer-one-breakfast-0d808ce7-0ef1-44c6-808a-f594bc7fceae-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: Deserialization exception handler is set to fail upon a deserialization error. If you would rather have the streaming pipeline continue after a deserialization error, please set the default.deserialization.exception.handler appropriately.
    at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:80)
    at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:101)
    at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:124)
    ...
    at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:711)
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:747)
Caused by: java.lang.IllegalArgumentException: dishes: Insufficient number of elements: decoded 0 but should have decoded 268435712
    at scodec.Attempt$Failure.require(Attempt.scala:108)
    at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:22)
    at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:15)
    at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:58)
    at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:15)
    at org.apache.kafka.streams.processor.internals.SourceNode.deserializeValue(SourceNode.java:60)
    at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:66)
deserialize messages from the broker.
The deserialization process can fail.
It raises an exception that cannot be caught by our code.
By default, buggy deserializers have to be fixed before the application restarts ...
(LogAndFailExceptionHandler.java:39) - Exception caught during Deserialization, taskId: 0_0, topic: exercise-breakfast, partition: 0, offset: 109
Exception in thread "answer-one-breakfast-0d808ce7-0ef1-44c6-808a-f594bc7fceae-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: Deserialization exception handler is set to fail upon a deserialization error. If you would rather have the streaming pipeline continue after a deserialization error, please set the default.deserialization.exception.handler appropriately.
    at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:80)
    at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:101)
    at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:124)
    ...
    at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:711)
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:747)
Caused by: java.lang.IllegalArgumentException: ... decoded 0 but should have decoded 268435712
    at scodec.Attempt$Failure.require(Attempt.scala:108)
    at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:22)
    at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:15)
    at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:58)
    at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:15)
    at org.apache.kafka.streams.processor.internals.SourceNode.deserializeValue(SourceNode.java:60)
    at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:66)
caught by a DeserializationExceptionHandler.
A handler returns Fail or Continue.
You can implement your own handler.
But the two handlers provided by the library are really basic… let’s explore other methods.
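To make the Fail / Continue choice concrete, here is a minimal sketch that sets the property named in the error message, default.deserialization.exception.handler, to the library's log-and-continue handler. It only uses java.util.Properties so it runs without Kafka on the classpath. The class name HandlerConfigSketch is hypothetical, and the property key is taken from the log output of the Kafka version used in the talk; other releases may expect a different key.

```java
import java.util.Properties;

final class HandlerConfigSketch {
    // Builds the Kafka Streams properties that select the deserialization handler.
    // Assumption: the key matches the one printed in the StreamsException message.
    static Properties continueOnDeserializationError() {
        Properties props = new Properties();
        props.put("default.deserialization.exception.handler",
                  "org.apache.kafka.streams.errors.LogAndContinueExceptionHandler");
        return props;
    }

    public static void main(String[] args) {
        // These properties would be merged with the rest of the application config.
        System.out.println(continueOnDeserializationError());
    }
}
```

With this setting, a corrupted record is logged and skipped instead of stopping the stream thread.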
deserialization process into a pure transformation that cannot crash.
To do so, we will replace each corrupted message with a sentinel value: a special-purpose record (e.g. null, None, Json.Null, etc.).
This allows downstream processors to recognize and handle such sentinel values.
With Kafka Streams, this can be achieved by implementing a Deserializer.
[Diagram: f: G → H, null]
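The idea can be sketched as a wrapper that catches the decoding failure and returns the sentinel instead. To keep the example runnable without the Kafka client library, SimpleDeserializer is a hypothetical local stand-in that only mirrors the deserialize(topic, data) shape of Kafka's Deserializer; SentinelDeserializer, SentinelDemo and STRICT_INT are likewise illustrative names, not part of any library.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for Kafka's Deserializer, declared locally so the
// sketch compiles on its own. A real implementation would implement
// org.apache.kafka.common.serialization.Deserializer instead.
interface SimpleDeserializer<T> {
    T deserialize(String topic, byte[] data);
}

// Wraps a deserializer that may throw and turns it into a total function:
// any decoding failure yields the sentinel value instead of an exception.
final class SentinelDeserializer<T> implements SimpleDeserializer<T> {
    private final SimpleDeserializer<T> inner;
    private final T sentinel;

    SentinelDeserializer(SimpleDeserializer<T> inner, T sentinel) {
        this.inner = inner;
        this.sentinel = sentinel;
    }

    @Override
    public T deserialize(String topic, byte[] data) {
        try {
            return inner.deserialize(topic, data);
        } catch (RuntimeException e) {
            return sentinel; // corrupted record: hand the sentinel downstream
        }
    }
}

final class SentinelDemo {
    // Illustrative strict decoder: UTF-8 digits to Integer, throws on anything else.
    static final SimpleDeserializer<Integer> STRICT_INT =
        (topic, data) -> Integer.parseInt(new String(data, StandardCharsets.UTF_8));

    public static void main(String[] args) {
        SimpleDeserializer<Integer> safe = new SentinelDeserializer<>(STRICT_INT, null);
        System.out.println(safe.deserialize("exercise-breakfast", "42".getBytes(StandardCharsets.UTF_8)));
        System.out.println(safe.deserialize("exercise-breakfast", new byte[] {(byte) 0xA6}));
    }
}
```

Using null as the sentinel is the simplest choice, but it cannot be told apart from a record whose value is legitimately null; a dedicated marker object avoids that ambiguity.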
safe Deserializer.
Downstream processors now receive a sentinel value indicating a deserialization error.
Errors can then be handled properly, for example by monitoring the number of deserialization errors with a custom metric.
But we lose a lot of information about the error… let’s look at one last method.
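As a rough sketch of that downstream handling, the class below drops null sentinels from a batch of decoded values and counts them. The list stands in for the stream of records and the AtomicLong for a custom metric; SentinelCounter and dropSentinels are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

final class SentinelCounter {
    // Stand-in for a custom metric counting deserialization errors.
    final AtomicLong deserializationErrors = new AtomicLong();

    // Keeps valid values and counts each null sentinel as one error.
    <T> List<T> dropSentinels(List<T> values) {
        List<T> valid = new ArrayList<>();
        for (T value : values) {
            if (value == null) {
                deserializationErrors.incrementAndGet();
            } else {
                valid.add(value);
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        SentinelCounter counter = new SentinelCounter();
        List<Integer> valid = counter.dropSentinels(Arrays.asList(1, null, 3, null));
        System.out.println(valid + " errors=" + counter.deserializationErrors.get());
    }
}
```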
will let the deserializer fail.
For each failure, we will send a message to a topic containing corrupted messages.
Each message will carry the original content of the input message (for reprocessing) and additional metadata about the failure.
With Kafka Streams, this can be achieved by implementing a DeserializationExceptionHandler.
[Diagram: input topic, Streaming APP, output topic, dead letter queue]
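The shape of such a dead letter message can be sketched as follows. This is not the Kafka Streams handler interface itself: DeadLetter, DeadLetterQueueSketch, the Response enum and the metadata keys are all hypothetical, and an in-memory list stands in for the dead letter topic so the example runs on its own. A real handler would publish the record with a producer instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical envelope: the untouched input bytes plus metadata about the failure.
final class DeadLetter {
    final byte[] originalValue;
    final Map<String, String> metadata;

    DeadLetter(byte[] originalValue, Map<String, String> metadata) {
        this.originalValue = originalValue;
        this.metadata = metadata;
    }
}

final class DeadLetterQueueSketch {
    // Mirrors the two outcomes a handler can return: Fail or Continue.
    enum Response { CONTINUE, FAIL }

    // In-memory stand-in for the dead letter topic.
    final List<DeadLetter> quarantine = new ArrayList<>();

    Response handle(String topic, int partition, long offset, byte[] rawValue, Exception error) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("source.topic", topic);
        metadata.put("source.partition", Integer.toString(partition));
        metadata.put("source.offset", Long.toString(offset));
        metadata.put("error.class", error.getClass().getName());
        metadata.put("error.message", String.valueOf(error.getMessage()));
        quarantine.add(new DeadLetter(rawValue, metadata));
        return Response.CONTINUE; // keep the stream running after quarantining the record
    }

    public static void main(String[] args) {
        DeadLetterQueueSketch dlq = new DeadLetterQueueSketch();
        byte[] corrupted = {(byte) 0xA6, (byte) 0xEA};
        dlq.handle("exercise-breakfast", 0, 109L, corrupted,
                   new IllegalArgumentException("Insufficient number of elements"));
        System.out.println(dlq.quarantine.get(0).metadata);
    }
}
```

Keeping the raw bytes untouched is what makes reprocessing possible once the deserializer is fixed; the metadata only helps locate and diagnose the record.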
DeserializationExceptionHandler implementation.
This lets you use the Producer API to write a corrupted record directly to a quarantine topic.
Then you can manually analyse your corrupted records.
Warning: This approach has side effects that are invisible to the Kafka Streams runtime.
of the clients. These internal errors are not easy to catch.
When it’s possible, use Avro + Schema Registry.
When it’s not possible, Kafka Streams applies techniques to deal with serde errors:
- DLQ: by extending a handler
- Sentinel Value: by extending a deserializer