Poison pills: handle the unexpected with Kafka Streams


Loïc DIVAD

April 18, 2019

Transcript

  1. loicmdivad Streaming Apps and the poison pills April 17, 2019 -

    DEVOXX France’19 1 Handle the unexpected with Kafka Streams Loïc M. DIVAD
  2. loicmdivad 2

  3. loicmdivad Loïc DIVAD Data Engineer @XebiaFr @loicmdivad 3 dataxday.fr organizer

  4. loicmdivad Processor API The dark side of Kafka Streams March

    2018 4
  5. loicmdivad > println(sommaire) Incoming records may be corrupted or may not

    be handled by the serializer / deserializer. These records are referred to as “poison pills”. 1. Consume / Produce Request 2. Log and Crash 3. Skip the Corrupted 4. Sentinel Value Pattern 5. Dead Letter Queue Pattern 5
  6. loicmdivad XKE Ratatouille, a delicious use case Streaming APP Input

    topic: food-order 6
  7. loicmdivad XKE Ratatouille, a delicious use case Streaming APP Input

    topic: food-order 7
  8. loicmdivad Streaming App Poison Pills A. Consume / Produce Request

    B. Log and Crash - Breakfast C. Skip the Corrupted - Lunch D. Sentinel Value Pattern - Drink E. Dead Letter Queue Pattern - Dinner 8
  9. loicmdivad Consume / Produce Request From the Kafka Clients perspective

    9
  10. loicmdivad 10

    (diagram: producers, consumers, connectors and streaming apps exchanging records through topics and partitions in an Apache Kafka cluster)
  11. loicmdivad 11 Kafka Brokers Log storage:

    • Log segments completion ◦ log.flush.interval.ms
    • Leader election (Controller) ◦ unclean.leader.election.enable
    • Commit request (Coordinator)
    • Log cleaning ◦ log.retention.ms ◦ log.retention.check.interval.ms
    • Log compaction ◦ cleanup.policy
    From http://kafka.apache.org
  12. loicmdivad 12 Kafka Brokers Log storage:

    • Log segments completion ◦ log.flush.interval.ms
    • Leader election (Controller) ◦ unclean.leader.election.enable
    • Commit request (Coordinator)
    • Log cleaning ◦ log.retention.ms ◦ log.retention.check.interval.ms
    • Log compaction ◦ cleanup.policy
    From http://kafka.apache.org
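These are broker- and log-level knobs. As a hedged illustration (not from the deck), the topic-level counterparts of the retention and compaction settings can be set when creating the input topic with the AdminClient; topic name, partition count and values below are assumptions:

    import java.util.{Collections, Properties}
    import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig, NewTopic}
    import scala.collection.JavaConverters._

    val adminProps = new Properties()
    adminProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    val admin = AdminClient.create(adminProps)

    // Topic-level counterparts of the broker settings listed above
    val topicConfigs = Map("retention.ms" -> "604800000", "cleanup.policy" -> "delete").asJava
    val foodOrder = new NewTopic("food-order", 3, 1.toShort).configs(topicConfigs)

    admin.createTopics(Collections.singleton(foodOrder))
    admin.close()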
  13. loicmdivad 13 Kafka Producers Produce request creation:

    • Message buffering ◦ linger.ms
    • Acknowledgement verification ◦ acks ◦ retries
    • Message serialization ◦ key.serializer ◦ value.serializer
    • Batch compression ◦ compression.type (gzip, snappy, lz4, zstandard)
    From http://kafka.apache.org
  14. loicmdivad 14 Kafka Producers Produce request creation:

    • Message buffering ◦ linger.ms
    • Acknowledgement verification ◦ acks ◦ retries
    • Message serialization ◦ key.serializer ◦ value.serializer
    • Batch compression ◦ compression.type (gzip, snappy, lz4, zstandard)
    From http://kafka.apache.org
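A minimal Scala sketch (not part of the deck) wiring the producer settings listed above; the topic name food-order comes from the use case, the broker address, serializers and values are illustrative assumptions:

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}

    val props = new Properties()
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    props.put(ProducerConfig.LINGER_MS_CONFIG, "20")          // message buffering
    props.put(ProducerConfig.ACKS_CONFIG, "all")              // acknowledgement verification
    props.put(ProducerConfig.RETRIES_CONFIG, "3")
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,     // message serialization
      "org.apache.kafka.common.serialization.StringSerializer")
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
      "org.apache.kafka.common.serialization.StringSerializer")
    props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4")  // batch compression

    val producer = new KafkaProducer[String, String](props)
    producer.send(new ProducerRecord[String, String]("food-order", "order-1", "omelette"))
    producer.close()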
  15. loicmdivad 15 consumers producers brokers

  16. loicmdivad Log and Crash Exercise #1 - breakfast 16

  17. loicmdivad Exercise #1 - breakfast 17 Really old systems receive

    raw bytes directly from message queues. With Kafka (Connect and Streams) we’d like to continuously transform these messages. But we need a deserializer with a special decoder to understand each event. What happens if we get a buggy implementation of the deserializer? 10100110111010101 Kafka Connect Kafka Brokers Kafka Streams
  18. loicmdivad // Exercise #1: Breakfast sealed trait FoodOrder case class

    Breakfast(lang: Lang, liquid: Liquid, fruit: Fruit, pastries: Vector[Pastry] = Vector.empty) extends FoodOrder 18
  19. loicmdivad // Exercise #1: Breakfast sealed trait FoodOrder case class

    Breakfast(lang: Lang, liquid: Liquid, fruit: Fruit, pastries: Vector[Pastry] = Vector.empty) extends FoodOrder implicit lazy val BreakfastCodec: Codec[Breakfast] = ??? 19
  20. loicmdivad // Exercise #1: Breakfast sealed trait FoodOrder case class

    Breakfast(lang: Lang, liquid: Liquid, fruit: Fruit, pastries: Vector[Pastry] = Vector.empty) extends FoodOrder implicit lazy val BreakfastCodec: Codec[Breakfast] = ??? class FoodOrderSerializer extends Serializer[FoodOrder] { ??? } class FoodOrderDeserializer extends Deserializer[FoodOrder] { ??? } 20
  21. loicmdivad // Exercise #1: Breakfast sealed trait FoodOrder case class

    Breakfast(lang: Lang, liquid: Liquid, fruit: Fruit, pastries: Vector[Pastry] = Vector.empty) extends FoodOrder implicit lazy val BreakfastCodec: Codec[Breakfast] = ??? class FoodOrderSerializer extends Serializer[FoodOrder] { ??? } class FoodOrderDeserializer extends Deserializer[FoodOrder] { ??? } Take Away 21 org.apache.kafka.common.serialization
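A hedged sketch of what the elided serde classes might look like (the talk's real implementation is not shown here), assuming a scodec Codec for the whole FoodOrder hierarchy is available:

    import org.apache.kafka.common.serialization.{Deserializer, Serializer}
    import scodec.Codec
    import scodec.bits.BitVector

    // Assumption: a codec covering every FoodOrder case exists (built from
    // BreakfastCodec and friends); it stays elided here, like in the deck.
    lazy val foodOrderCodec: Codec[FoodOrder] = ???

    class FoodOrderSerializer extends Serializer[FoodOrder] {
      override def serialize(topic: String, data: FoodOrder): Array[Byte] =
        foodOrderCodec.encode(data).require.toByteArray
    }

    class FoodOrderDeserializer extends Deserializer[FoodOrder] {
      override def deserialize(topic: String, data: Array[Byte]): FoodOrder =
        foodOrderCodec.decode(BitVector(data)).require.value // throws on a corrupted frame
    }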
  22. loicmdivad 22

  23. loicmdivad Log and Crash 23 2019-04-17 03:43:12 macbook-de-lolo [ERROR] (LogAndFailExceptionHandler.java:39)

    - Exception caught during Deserialization, taskId: 0_0, topic: exercise-breakfast, partition: 0, offset: 109 Exception in thread "answer-one-breakfast-0d808ce7-0ef1-44c6-808a-f594bc7fceae-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: Deserialization exception handler is set to fail upon a deserialization error. If you would rather have the streaming pipeline continue after a deserialization error, please set the default.deserialization.exception.handler appropriately. at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:80) at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:101) at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:124) ... at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:711) at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:747) Caused by: java.lang.IllegalArgumentException: dishes: Insufficient number of elements: decoded 0 but should have decoded 268435712 at scodec.Attempt$Failure.require(Attempt.scala:108) at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:22) at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:15) at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:58) at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:15) at org.apache.kafka.streams.processor.internals.SourceNode.deserializeValue(SourceNode.java:60) at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:66)
  24. loicmdivad Log and Crash 24 2019-04-17 03:43:12 macbook-de-lolo [ERROR] (LogAndFailExceptionHandler.java:39)

    - Exception caught during Deserialization, taskId: 0_0, topic: exercise-breakfast, partition: 0, offset: 109 Exception in thread "answer-one-breakfast-0d808ce7-0ef1-44c6-808a-f594bc7fceae-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: Deserialization exception handler is set to fail upon a deserialization error. If you would rather have the streaming pipeline continue after a deserialization error, please set the default.deserialization.exception.handler appropriately. at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:80) at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:101) at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:124) ... at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:711) at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:747) Caused by: java.lang.IllegalArgumentException: dishes: Insufficient number of elements: decoded 0 but should have decoded 268435712 at scodec.Attempt$Failure.require(Attempt.scala:108) at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:22) at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:15) at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:58) at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:15) at org.apache.kafka.streams.processor.internals.SourceNode.deserializeValue(SourceNode.java:60) at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:66)
  25. loicmdivad 25 val frame1: Array[Byte] = Array(0x33, 0xd4, 0xfc, 0x00,

    0x00, 0x00, 0x01, 0xa5) val frame2: Array[Byte] = Array(0x44, 0xd2, 0xfe, 0x10, 0x02, 0x03, 0x01)
  26. loicmdivad val frame1: Array[Byte] = Array(0x33, 0xd4, 0xfc, 0x00,

    0x00, 0x00, 0x01, 0xa5) val frame2: Array[Byte] = Array(0x44, 0xd2, 0xfe, 0x10, 0x02, 0x03, 0x01) 26
  27. loicmdivad 27 val frame1: Array[Byte] = Array(0x33, 0xd4, 0xfc,

    0x00, 0x00, 0x00, 0x01, 0xa5) val frame2: Array[Byte] = Array(0x44, 0xd2, 0xfe, 0x10, 0x02, 0x03, 0x01) case class Meat(sausages: Int, beacons: Int, . . . )
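For intuition only (this is not the deck's actual codec), a simplified scodec sketch showing how such a frame decodes and how a truncated frame surfaces as a decoding failure; the two-field layout and names are assumptions:

    import scodec._
    import scodec.bits.BitVector
    import scodec.codecs._

    // Hypothetical, simplified layout: one unsigned byte per field
    case class MiniMeat(sausages: Int, beacons: Int)
    val miniMeatCodec: Codec[MiniMeat] = (uint8 :: uint8).as[MiniMeat]

    miniMeatCodec.decode(BitVector(Array[Byte](0x02, 0x03)))
    // => Successful(DecodeResult(MiniMeat(2, 3), ...))

    miniMeatCodec.decode(BitVector(Array[Byte](0x02)))
    // => Failure(...): not enough bits - this is the poison-pill case the deserializer hits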
  28. loicmdivad 28

  29. loicmdivad Log and Crash Like all consumers, Kafka Streams applications

    deserialize messages from the broker. The deserialization process can fail. It raises an exception that cannot be caught by our code. Buggy deserializers have to be fixed before the application restarts, by default ... 29
  30. loicmdivad Skip the Corrupted Exercise #2 - lunch 30

  31. loicmdivad 31 // Exercise #2: Lunch sealed trait FoodOrder case

    class Lunch(name: String, price: Double, `type`: LunchType) extends FoodOrder
  32. loicmdivad 32 // Exercise #2: Lunch sealed trait FoodOrder case

    class Lunch(name: String, price: Double, `type`: LunchType) extends FoodOrder • starter • main • dessert
  33. loicmdivad 33

  34. loicmdivad Log and Crash - ExceptionHandler 2019-04-17 03:43:12 macbook-de-lolo [ERROR]

    (LogAndFailExceptionHandler.java:39) - Exception caught during Deserialization, taskId: 0_0, topic: exercise-breakfast, partition: 0, offset: 109 Exception in thread "answer-one-breakfast-0d808ce7-0ef1-44c6-808a-f594bc7fceae-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: Deserialization exception handler is set to fail upon a deserialization error. If you would rather have the streaming pipeline continue after a deserialization error, please set the default.deserialization.exception.handler appropriately. at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:80) at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:101) at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:124) ... at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:711) at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:747) Caused by: java.lang.IllegalArgumentException: ... decoded 0 but should have decoded 268435712 at scodec.Attempt$Failure.require(Attempt.scala:108) at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:22) at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:15) at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:58) at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:15) at org.apache.kafka.streams.processor.internals.SourceNode.deserializeValue(SourceNode.java:60) at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:66) 34
  35. loicmdivad 35 public class LogAndFailExceptionHandler implements DeserializationExceptionHandler /* ... */

    public class LogAndContinueExceptionHandler implements DeserializationExceptionHandler /* ... */
  36. loicmdivad 36 public class LogAndFailExceptionHandler implements DeserializationExceptionHandler /* ... */

    public class LogAndContinueExceptionHandler implements DeserializationExceptionHandler /* ... */ public interface DeserializationExceptionHandler extends Configurable { DeserializationHandlerResponse handle(final ProcessorContext context, final ConsumerRecord<byte[], byte[]> record, final Exception exception); enum DeserializationHandlerResponse { CONTINUE(0, "CONTINUE"), FAIL(1, "FAIL"); /* ... */ } }
  37. loicmdivad 37 public class LogAndFailExceptionHandler implements DeserializationExceptionHandler /* ... */

    public class LogAndContinueExceptionHandler implements DeserializationExceptionHandler /* ... */ public interface DeserializationExceptionHandler extends Configurable { DeserializationHandlerResponse handle(final ProcessorContext context, final ConsumerRecord<byte[], byte[]> record, final Exception exception); enum DeserializationHandlerResponse { CONTINUE(0, "CONTINUE"), FAIL(1, "FAIL"); /* ... */ } } Take Away
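A minimal sketch, assuming a standard Properties-based Streams configuration, of switching from the default LogAndFailExceptionHandler to the built-in LogAndContinueExceptionHandler (the application id below is hypothetical):

    import java.util.Properties
    import org.apache.kafka.streams.StreamsConfig
    import org.apache.kafka.streams.errors.LogAndContinueExceptionHandler

    val props = new Properties()
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "answer-two-lunch")   // hypothetical id
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    // Skip corrupted records instead of crashing the stream thread
    props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
              classOf[LogAndContinueExceptionHandler])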
  38. loicmdivad 38 .poll() life cycle & the ExceptionHandler

    StreamTask#process()
    │ …
    └── PartitionGroup#nextRecord()
        │ …
        └── RecordQueue#poll()
            │ …
            └── RecordDeserializer#deserialize()

    deserialize(): ConsumerRecord<byte[], byte[]> => ConsumerRecord<K, V>
  39. loicmdivad 39

  40. loicmdivad Skip the Corrupted All exceptions thrown by deserializers are

    caught by a DeserializationExceptionHandler. A handler returns FAIL or CONTINUE. You can implement your own handler. But the two handlers provided by the library are really basic… let’s explore other methods 40
  41. loicmdivad Skip the Corrupted All exceptions thrown by deserializers are

    caught by a DeserializationExceptionHandler. A handler returns FAIL or CONTINUE. You can implement your own handler. But the two handlers provided by the library are really basic… let’s explore other methods 41 Take Away
  42. loicmdivad Sentinel Value Pattern Exercise #3 - drinks 42

  43. loicmdivad 43 // Exercise #3: Drink sealed trait FoodOrder case

    class Drink(name: String, `type`: DrinkType, quantity: Int, alcohol: Option[Double]) extends FoodOrder
  44. loicmdivad 44 Sentinel value pattern We need to turn the

    deserialization process into a pure transformation that cannot crash. To do so, we will replace corrupted messages with a sentinel value: a special-purpose record (e.g. null, None, Json.Null, etc.) f: G → H
  45. loicmdivad 45 Sentinel value pattern We need to turn the

    deserialization process into a pure transformation that cannot crash. To do so, we will replace corrupted messages with a sentinel value: a special-purpose record (e.g. null, None, Json.Null, etc.) This allows downstream processors to recognize and handle such sentinel values. f: G → H
  46. loicmdivad 46 Sentinel value pattern We need to turn the

    deserialization process into a pure transformation that cannot crash. To do so, we will replace corrupted messages with a sentinel value: a special-purpose record (e.g. null, None, Json.Null, etc.) This allows downstream processors to recognize and handle such sentinel values. With Kafka Streams this can be achieved by implementing a Deserializer that returns the sentinel instead of throwing. f: G → H
  47. loicmdivad 47 case object FoodOrderErr extends FoodOrder class FoodOrderDeserializer extends

    Deserializer[FoodOrder] { ??? } class SentinelValueDeserializer extends FoodOrderDeserializer { override def deserialize(topic: String, data: Array[Byte]): FoodOrder = Try(super.deserialize(topic, data)).getOrElse(FoodOrderErr) }
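A hedged usage sketch (topic names and serde wiring are assumptions, not the deck's code): wrap the sentinel deserializer into a Serde and drop the sentinel downstream so it never reaches the business logic:

    import org.apache.kafka.common.serialization.{Serde, Serdes}
    import org.apache.kafka.streams.StreamsBuilder
    import org.apache.kafka.streams.kstream.Consumed

    val foodOrderSerde: Serde[FoodOrder] =
      Serdes.serdeFrom(new FoodOrderSerializer, new SentinelValueDeserializer)

    val builder = new StreamsBuilder()
    builder
      .stream[String, FoodOrder]("exercise-drink", Consumed.`with`(Serdes.String(), foodOrderSerde))
      .filterNot((_, value) => value == FoodOrderErr) // discard sentinel values
      .foreach((_, order) => println(order))          // stand-in for the real processing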
  48. loicmdivad 48

  49. loicmdivad 49 class FoodOrderErrorSink extends ValueTransformer[Json, Unit] { var sensor:

    Sensor = _ var context: ProcessorContext = _ def metricName: MetricName = ??? override def init(context: ProcessorContext): Unit = { this.context = context this.sensor = this.context.metrics.addSensor(???) // TODO: Create a sensor sensor.add(metricName, new Rate()) } override def transform(value: Json): Unit = { sensor.record() } }
  50. loicmdivad 50 class FoodOrderErrorSink extends ValueTransformer[Json, Unit] { var sensor:

    Sensor = _ var context: ProcessorContext = _ def metricName: MetricName = ??? override def init(context: ProcessorContext): Unit = { this.context = context this.sensor = this.context.metrics.addSensor(???) // TODO: Create a sensor sensor.add(metricName, new Rate()) } override def transform(value: Json): Unit = { sensor.record() } } Take Away org.apache.kafka.common.metrics
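One possible way (an assumption, not the speaker's solution) to fill in the sensor and metric name left as ??? in the exercise above, using org.apache.kafka.common.metrics; group, name and description are illustrative:

    import org.apache.kafka.common.MetricName
    import org.apache.kafka.common.metrics.Sensor
    import org.apache.kafka.common.metrics.stats.Rate
    import org.apache.kafka.streams.processor.ProcessorContext

    def errorSensor(context: ProcessorContext): Sensor = {
      // Hypothetical metric identity
      val metricName = new MetricName(
        "food-order-error-rate",
        "food-order-metrics",
        "Rate of corrupted food orders replaced by a sentinel value",
        new java.util.HashMap[String, String]())
      val sensor = context.metrics.addSensor("food-order-errors", Sensor.RecordingLevel.INFO)
      sensor.add(metricName, new Rate()) // each sensor.record() call feeds this rate
      sensor
    }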
  51. loicmdivad 51

  52. loicmdivad 52

  53. loicmdivad By implementing a custom serde we can create a

    safe Deserializer. Downstream processors now receive a sentinel value indicating a deserialization error. Errors can then be handled properly, for example by monitoring the number of deserialization errors with a custom metric. But we lose a lot of information about the error… let’s see a last method Sentinel value Pattern 53
  54. loicmdivad By implementing a custom serde we can create a

    safe Deserializer. Downstream processors now receive a sentinel value indicating a deserialization error. Errors can then be handled properly, for example by monitoring the number of deserialization errors with a custom metric. But we lose a lot of information about the error… let’s see a last method Sentinel value Pattern 54 Take Away
  55. loicmdivad Dead Letter Queue Pattern Exercise #4 - dinner 55

  56. loicmdivad 56 // Exercise #4: Dinner sealed trait FoodOrder case

    class Dinner(dish: Command, maybeClient: Option[Client], moment: Moment, zone: String) extends FoodOrder
  57. loicmdivad 57

  58. loicmdivad Dead letter queue pattern 58 In this method we

    will let the deserializer fail. For each failure we will send a message to a topic containing corrupted messages. Streaming APP dead letter queue input topic output topic
  59. loicmdivad Dead letter queue pattern 59 In this method we

    will let the deserializer fail. For each failure we will send a message to a topic containing corrupted messages. Each message will have the original content of the input message (for reprocessing) and additional metadata about the failure. Streaming APP dead letter queue input topic output topic
  60. loicmdivad Dead letter queue pattern 60 In this method we

    will let the deserializer fail. For each failure we will send a message to a topic containing corrupted messages. Each message will have the original content of the input message (for reprocessing) and additional metadata about the failure. With Kafka Streams this can be achieved by implementing a DeserializationExceptionHandler Streaming APP dead letter queue input topic output topic
  61. loicmdivad 61 class DeadLetterQueueFoodExceptionHandler() extends DeserializationExceptionHandler { var topic: String

    = _ var producer: KafkaProducer[Array[Byte], GenericRecord] = _ override def handle(context: ProcessorContext, record: ConsumerRecord[Array[Byte], Array[Byte]], exception: Exception): DeserializationHandlerResponse = { val valueMessage: GenericRecord = ??? producer.send(new ProducerRecord[Array[Byte], GenericRecord](???)) DeserializationHandlerResponse.CONTINUE } override def configure(configs: util.Map[String, _]): Unit = { topic = ??? producer = new KafkaProducer[Array[Byte], GenericRecord](???) } }
  62. loicmdivad 62 class DeadLetterQueueFoodExceptionHandler() extends DeserializationExceptionHandler { var topic: String

    = _ var producer: KafkaProducer[Array[Byte], GenericRecord] = _ override def handle(context: ProcessorContext, record: ConsumerRecord[Array[Byte], Array[Byte]], exception: Exception): DeserializationHandlerResponse = { val valueMessage: GenericRecord = ??? producer.send(new ProducerRecord[Array[Byte], GenericRecord](???)) DeserializationHandlerResponse.CONTINUE } override def configure(configs: util.Map[String, _]): Unit = { topic = ??? producer = new KafkaProducer[Array[Byte], GenericRecord](???) } } Take Away
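A hedged sketch of plugging the handler in: the handler class is registered through the Streams config, and its target topic can be passed as an extra config entry that configure() reads back. The dead.letter.queue.topic key and the application id are assumptions, not built-in Kafka Streams settings:

    import java.util.Properties
    import org.apache.kafka.streams.StreamsConfig

    val props = new Properties()
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "answer-four-dinner") // hypothetical id
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
              classOf[DeadLetterQueueFoodExceptionHandler])
    props.put("dead.letter.queue.topic", "food-order-dlq")               // read back in configure()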
  63. loicmdivad 63

  64. loicmdivad 64

  65. loicmdivad Dead Letter Queue Pattern You can provide your own

    DeserializationExceptionHandler implementation. This lets you use the Producer API to write a corrupted record directly to a quarantine topic. Then you can manually analyse your corrupted records. Warning: this approach has side effects that are invisible to the Kafka Streams runtime. 65
  66. loicmdivad Dead Letter Queue Pattern You can provide your own

    DeserializationExceptionHandler implementation. This lets you use the Producer API to write a corrupted record directly to a quarantine topic. Then you can manually analyse your corrupted records. Warning: this approach has side effects that are invisible to the Kafka Streams runtime. 66 Take Away
  67. loicmdivad 67 Conclusion Exercise #0 - take away

  68. loicmdivad Conclusion 68 When using Kafka, deserialization is the responsibility

    of the clients. These internal errors are not easy to catch. When it’s possible, use Avro + Schema Registry. When it’s not possible, Kafka Streams offers techniques to deal with serde errors: - DLQ: by extending a handler - Sentinel Value: by extending a deserializer
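A hedged sketch (Confluent serde from the kafka-streams-avro-serde artifact, not shown in the talk) of the "Avro + Schema Registry" option; with a registry-backed serde, incompatible records are rejected at produce time instead of poisoning consumers. Application id and addresses are assumptions:

    import java.util.Properties
    import org.apache.kafka.streams.StreamsConfig

    val props = new Properties()
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "ratatouille-app")    // hypothetical id
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG,
              "io.confluent.kafka.streams.serdes.avro.GenericAvroSerde")
    props.put("schema.registry.url", "http://localhost:8081")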
  69. loicmdivad MERCI 69

  70. loicmdivad Links 70 XKE-RATATOUILLE CONFLUENT FAQ

  71. loicmdivad 71 JUNE 27th, 2019 - PAN PIPER, PARIS DATAXDAY.FR

  72. loicmdivad 72

  73. loicmdivad Related Post 73 Kafka Connect Deep Dive – Error

    Handling and Dead Letter Queues - by Robin Moffatt Building Reliable Reprocessing and Dead Letter Queues with Apache Kafka - by Ning Xia Handling bad messages using Kafka's Streams API - answer by Matthias J. Sax
  74. loicmdivad Images 74 Photo by rawpixel on Unsplash Photo by

    João Marcelo Martins on Unsplash Photo by Jordane Mathieu on Unsplash Photo by Brooke Lark on Unsplash Photo by Jakub Kapusnak on Unsplash Photo by Melissa Walker Horn on Unsplash Photo by Aneta Pawlik on Unsplash
  75. loicmdivad With special thanks to Robin M., Sylvain L., Giulia

    B. 75