
Kafka Streams on k8s, the difficulties

Loïc DIVAD

June 04, 2019

Transcript

  1. Autoscaling Kafka Streams with Kubernetes

  2. Loïc DIVAD, Data Engineer @XebiaFr, @loicmdivad, dataxday.fr organizer

  3. @loicmdivad #ContainerDayFR Event Streaming Apps with Kafka Streams [Diagram: one APP instance]

  4. Event Streaming Apps with Kafka Streams [Diagram: three APP instances]
  5. BUILD THE FUTURE

  6. CPU Usage and Memory consumption

  7. k8s and the support for custom metrics [Diagram: APP, JMX metrics]

  8. k8s and the support for custom metrics [Diagram: APP, JMX metrics]

  9. k8s and the support for custom metrics: custom-metrics-stackdriver-adapter & prometheus-to-sd [Diagram: APP, JMX metrics]
  10. Streaming apps auto-scaling: the implementation
    • Stackdriver Metrics Server: https://gcr.io/google-containers/custom-metrics-stackdriver-adapter
    • Prometheus to Stackdriver: https://gcr.io/google-containers/prometheus-to-sd
    • k8s master
    • Your Streaming App: https://docs.confluent.io/current/streams/index.html
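The scaling decision behind this setup can be sketched with the replica formula given in the Kubernetes Horizontal Pod Autoscaler documentation: desired = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance (0.1 by default). This is a minimal sketch, not the talk's code; the metric values below are made up and could stand for any custom metric exported from the streaming app.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Replica count following the documented HPA formula (illustrative sketch)."""
    ratio = current_metric / target_metric
    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)

# Hypothetical custom metric, e.g. records lagging per pod.
print(desired_replicas(2, 300, 100))  # metric is 3x the target: scale out
print(desired_replicas(4, 105, 100))  # within tolerance: unchanged
```

For a Kafka Streams application the result is only useful up to the number of input partitions, since extra instances would stay idle.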
  11. My kubernetes journey
    • Youtube: Xebicon 2018
    • Github: xke-kingof-scaling
    • Medium: Kafka Streams: a road to autoscaling via k8s
  12. Kafka and the group management protocol: the maximum parallelism is determined by the number of partitions. [Diagram: brokers (broker.id 0, 1, ..., N) hosting topic-partition-0 to topic-partition-N, consumed by four APP instances]

  13. Kafka and the group management protocol: the maximum parallelism is determined by the number of partitions. [Diagram: kafka cluster with topic-partition-0 to topic-partition-N, consumed by four APP instances]
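To illustrate why the partition count caps parallelism, here is a minimal sketch that spreads partitions over group members round-robin. It is a simplification for illustration only: the real Kafka Streams assignor works on tasks and is more involved, and the member names are hypothetical.

```python
def assign_round_robin(partitions, members):
    """Spread partitions over members; each partition has exactly one owner."""
    assignment = {m: [] for m in members}
    for i, p in enumerate(partitions):
        assignment[members[i % len(members)]].append(p)
    return assignment

def active_members(assignment):
    """Members that actually received at least one partition."""
    return [m for m, owned in assignment.items() if owned]

partitions = [0, 1, 2, 3]
members = [f"app-{i}" for i in range(6)]  # more instances than partitions
assignment = assign_round_robin(partitions, members)
print(assignment)
print(len(active_members(assignment)))  # never exceeds len(partitions)
```

With four partitions and six instances, two instances receive nothing, so scaling beyond the partition count adds no throughput.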
  14. State migration: Moving with your valuable (Data)

  15. Most streaming applications have state. [Diagram: a single consumer APP reading topic-partition-0 to topic-partition-N from the kafka cluster]
  16. Scale out for the 1st time [Diagram: a single APP with states 0, 1, 2, 3, reading topic-partition-0 to topic-partition-3 from the kafka cluster]
  17. Scale out for the 1st time: states are sharded across instances when the workload is split. [Diagram: states 2 and 3 move from the first APP to a second APP; topic-partition-0 to topic-partition-3 in the kafka cluster]

  18. Scale out for the 1st time: states are sharded across instances when the workload is split. [Diagram: states 0 to 3 spread across APP instances; topic-partition-0 to topic-partition-3 in the kafka cluster]
  19. Streams as a changelog of states: states are backed up inside Kafka with a changelog topic, which is compacted, e.g. my-application-state-partition-3-changelog. [Diagram: Topic Partition, Topic Segments (1 GB), Compacted Topic Segments]
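A compacted changelog keeps at least the latest record for each key, which is enough to rebuild the state store. The sketch below models that idea with plain Python lists; the keys and values are invented, and real compaction runs per log segment in the broker rather than over the whole topic at once.

```python
def compact(changelog):
    """Keep only the latest record per key, preserving offset order."""
    latest = {}
    for offset, (key, value) in enumerate(changelog):
        latest[key] = (offset, value)
    ordered = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_, value) in ordered]

def restore(changelog):
    """Rebuild a key-value state by replaying the changelog."""
    state = {}
    for key, value in changelog:
        if value is None:      # a None value models a tombstone (delete)
            state.pop(key, None)
        else:
            state[key] = value
    return state

log = [("a", 1), ("b", 1), ("a", 2), ("b", None), ("c", 5)]
# Replaying the compacted log yields the same state with fewer records to read.
print(compact(log))
print(restore(compact(log)) == restore(log))
```

The fewer records a new instance has to replay, the sooner it finishes restoring its state.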
  20. Scaling out, again: having a persisted state speeds up application recovery. The application reaches the RUNNING status faster. [Diagram: a single APP with states 0 to 3; topic-partition-0 to topic-partition-N in the kafka cluster]

  21. Scaling out, again: having a persisted state speeds up application recovery. The application reaches the RUNNING status faster. [Diagram: two APP instances sharing states 0 to 3; topic-partition-0 to topic-partition-N in the kafka cluster]
  22. State migration
    • Large states imply high recovery times
    • StatefulSets are essential
    • Kafka Streams uses sticky assignment when rebalancing
    • Tuning the segment file rolling may help
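The benefit of sticky assignment can be sketched as follows: partitions stay with their previous owner whenever that owner is still under its quota, so less state has to migrate. This is an illustrative simplification, not the algorithm Kafka Streams actually uses; the function names and member names are hypothetical.

```python
def sticky_assign(partitions, members, previous):
    """Keep each partition on its previous owner while the owner is under quota."""
    quota = -(-len(partitions) // len(members))  # ceiling division
    assignment = {m: [] for m in members}
    unassigned = []
    for p in partitions:
        owner = previous.get(p)
        if owner in assignment and len(assignment[owner]) < quota:
            assignment[owner].append(p)
        else:
            unassigned.append(p)
    for p in unassigned:
        # Give leftovers to the least loaded member.
        target = min(members, key=lambda m: len(assignment[m]))
        assignment[target].append(p)
    return assignment

def moved(previous, assignment):
    """Partitions whose owner changed, i.e. whose state must migrate."""
    return sorted(p for m, owned in assignment.items()
                  for p in owned if previous.get(p) != m)

previous = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
result = sticky_assign(range(6), ["a", "b", "c"], previous)
print(result, moved(previous, result))
```

In this example only two partitions change owner when a third instance joins, whereas a plain round-robin reassignment of the same six partitions would move four.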
  23. Rebalance Protocol: Consuming (the flow) together

  24. Rebalance Protocol [Diagram: Streaming Application (Consumer Group) and kafka broker (Group Coordinator) exchanging 1. Rebalance, 2. JoinRequest, 3. JoinResponse, 4. SyncGroup]
  25. Rebalance Protocol [Diagram: Streaming Application (Consumer Group) and kafka broker (Group Coordinator)]
    1. Rebalance: trigger a rebalance for the whole group
    2. JoinRequest: all active members ask to join the group
    3. JoinResponse: the Group Coordinator accepts
    4. SyncGroup: all accepted new members ask for their workload
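The four steps can be walked through with a toy coordinator. This is a sketch under simplifying assumptions: in real Kafka the assignment is computed by a group leader on the client side and relayed through the coordinator, and the class and member names here are hypothetical.

```python
class GroupCoordinator:
    """Toy model of the join/sync exchange; not the broker implementation."""

    def __init__(self, partitions):
        self.partitions = list(partitions)
        self.generation = 0
        self.assignment = {}

    def rebalance(self, members):
        # 1. Rebalance: a new generation starts for the whole group.
        self.generation += 1
        # 2. JoinRequest / 3. JoinResponse: every active member rejoins and is accepted.
        accepted = sorted(members)
        self.assignment = {m: [] for m in accepted}
        for i, p in enumerate(self.partitions):
            self.assignment[accepted[i % len(accepted)]].append(p)
        return self.generation

    def sync_group(self, member, generation):
        # 4. SyncGroup: a member fetches its workload for the current generation.
        if generation != self.generation:
            raise ValueError("stale generation, rejoin the group")
        return self.assignment[member]

coordinator = GroupCoordinator([0, 1, 2])
gen = coordinator.rebalance(["app-1", "app-2"])
print(coordinator.sync_group("app-1", gen))
gen = coordinator.rebalance(["app-1", "app-2", "app-3"])  # a new instance joins
print(coordinator.sync_group("app-3", gen))
```

Every membership change bumps the generation, so all members, not only the newcomer, have to rejoin and sync again.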
  26. Rebalance Protocol [Diagram: two APP instances consuming topic-partition-0 to topic-partition-2 from the kafka cluster]

  27. Rebalance Protocol [Diagram: three APP instances and topic-partition-0 to topic-partition-2 in the kafka cluster]

  28. Rebalance Protocol [Diagram: three APP instances and topic-partition-0 to topic-partition-2 in the kafka cluster]

  29. Rebalance Protocol [Diagram: three APP instances and topic-partition-0 to topic-partition-2 in the kafka cluster]

  30. Rebalance Protocol: rebalances take time and affect all the members of the group. [Diagram: three APP instances and topic-partition-0 to topic-partition-2 in the kafka cluster]
  31. Rebalance Protocol
    • Rebalancing takes time, and when it happens, it impacts everyone
    • This is sometimes referred to as the stop-the-world effect
    • Incremental rebalancing may be a solution
    • Due to dynamic membership, we cannot avoid rebalancing when scaling out or upgrading
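The difference between a stop-the-world rebalance and an incremental one can be sketched by counting the partitions that stop being processed. This is a simplified model: in the eager case every member gives up everything it owns, while in the incremental case only partitions that change owner are revoked. The assignments are invented for the example.

```python
def revoked_eager(old):
    """Eager rebalance: every member gives up all its partitions."""
    return {p for owned in old.values() for p in owned}

def revoked_incremental(old, new):
    """Incremental rebalance: only partitions that change owner are revoked."""
    new_owner = {p: m for m, owned in new.items() for p in owned}
    return {p for m, owned in old.items() for p in owned if new_owner.get(p) != m}

old = {"a": [0, 1, 2], "b": [3, 4, 5]}
new = {"a": [0, 1], "b": [3, 4], "c": [2, 5]}  # instance "c" joins
print(sorted(revoked_eager(old)))
print(sorted(revoked_incremental(old, new)))
```

In this example the eager protocol pauses all six partitions, while the incremental one pauses only the two that move to the new instance.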
  32. What’s in the Box?

  33. Conclusion: The future is bright
    ❖ A lot of effort is being put into the standardization of storage for Kubernetes
      ➢ StatefulSets is in beta and it’s already a game changer
    ❖ A few improvements are in development to make scaling out easier for Kafka Streams:
      ➢ Static Membership
      ➢ Incremental rebalancing
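Static membership (KIP-345) lets a consumer declare a stable `group.instance.id`; a member that comes back with the same id before its session timeout expires does not trigger a rebalance. The sketch below models only that rule. Reusing a StatefulSet pod name such as `app-0` as the instance id is an assumption made for illustration, as are the function name and the timings.

```python
def triggers_rebalance(instance_id, last_seen, now, session_timeout):
    """Return True if this join forces a rebalance under static membership."""
    previous = last_seen.get(instance_id)
    # A known id that returns within the session timeout keeps its assignment.
    known = previous is not None and now - previous <= session_timeout
    last_seen[instance_id] = now
    return not known

last_seen = {}
print(triggers_rebalance("app-0", last_seen, now=0, session_timeout=30))    # first join
print(triggers_rebalance("app-0", last_seen, now=20, session_timeout=30))   # quick restart
print(triggers_rebalance("app-0", last_seen, now=100, session_timeout=30))  # back too late
```

A pod that restarts quickly under the same identity therefore resumes its previous partitions without disturbing the rest of the group.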
  34. Conclusion: The future is bright
    • [k8s] StatefulSet, Pods for stateful applications and distributed systems
    • [KIP-345] Introduce static membership protocol to reduce consumer rebalances
    • [KIP-415] Incremental Cooperative Rebalancing in Kafka Connect
    • [KIP-429] Kafka Consumer Incremental Rebalance Protocol
    • [KIP-441] Smooth Scaling Out for Kafka Streams
  35. Standards & Craftsmanship

  36. THANK YOU

  37. Resources
    • Related talks:
      ◦ Deploying Kafka Streams Applications with Docker and Kubernetes - by Gwen Shapira and Matthias J. Sax
      ◦ Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Were Afraid to Ask - by Matthias J. Sax
      ◦ Horizontal Pod Autoscaler Reloaded - Scale on Custom Metrics - by Maciej Pytel & Solly Ross
    • Pictures:
      ◦ Photo by Shwetha Shankar on Unsplash
      ◦ Photo by chuttersnap on Unsplash
      ◦ Photo by Bonnie Kittle on Unsplash
  38. Powered by Standards & Craftsmanship