Kafka Streams on k8s, the difficulties

Loïc DIVAD

June 04, 2019

Transcript

  1. @loicmdivad #ContainerDayFR k8s and the support for custom metrics: custom-metrics-stackdriver-adapter & prometheus-to-sd. [Diagram: APP exposing JMX metrics]
  2. Streaming apps auto-scaling: the implementation. [Diagram: k8s master; Stackdriver Metrics Server (https://gcr.io/google-containers/custom-metrics-stackdriver-adapter); Prometheus to Stackdriver (https://gcr.io/google-containers/prometheus-to-sd); your streaming app (https://docs.confluent.io/current/streams/index.html)]
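
The autoscaling loop on this slide ends in a replica count computed by the Horizontal Pod Autoscaler. A minimal sketch of that calculation follows, using the formula from the Kubernetes documentation, `ceil(currentReplicas * currentMetric / targetMetric)`. The function name, the choice of consumer lag as the custom metric, and the cap at the partition count are assumptions made for illustration; the cap reflects the later slides' point that partitions bound parallelism.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, partitions):
    """Replica count an HPA-style controller would ask for.

    current_metric / target_metric: average value of a custom metric per pod
    (hypothetically, consumer lag) and the value we want to hold it at.
    partitions: used as an upper bound, an assumption of this sketch, because
    instances beyond the partition count would sit idle.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Never go below one instance, never above the number of partitions.
    return max(1, min(raw, partitions))


if __name__ == "__main__":
    # Lag is three times the target: HPA would want 6 pods, capped to 4 partitions.
    print(desired_replicas(2, 3000, 1000, partitions=4))
```

In a real deployment the cap would be expressed as `maxReplicas` on the HorizontalPodAutoscaler object rather than in application code.
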
  3. Kafka and the group management protocol. The maximum parallelism is determined by the number of partitions. [Diagram: brokers broker.id 0 to broker.id N hosting topic-partition-0 to topic-partition-N, consumed by four APP instances]
  4. Kafka and the group management protocol. The maximum parallelism is determined by the number of partitions. [Diagram: kafka cluster with topic-partition-0 to topic-partition-N, consumed by four APP instances]
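
To make the parallelism bound concrete, here is a small sketch that spreads partitions over application instances round-robin. It is only an illustration: the `assign` helper and the instance names are hypothetical, and Kafka's real assignors are more elaborate. The point it shows is that once there are more instances than partitions, the extra instances receive nothing.

```python
def assign(partitions, instances):
    """Spread partitions over instances round-robin.

    Returns a dict mapping each instance to its list of partitions.
    """
    assignment = {instance: [] for instance in instances}
    for index, partition in enumerate(partitions):
        owner = instances[index % len(instances)]
        assignment[owner].append(partition)
    return assignment


def active_instances(assignment):
    """Instances that received at least one partition."""
    return [instance for instance, owned in assignment.items() if owned]


if __name__ == "__main__":
    partitions = [0, 1, 2, 3]
    instances = [f"app-{i}" for i in range(6)]
    result = assign(partitions, instances)
    # Only four of the six instances get work: parallelism is capped at 4.
    print(result)
    print(len(active_instances(result)))
```
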
  5. Single consumer. Most streaming applications have state. [Diagram: one APP consuming topic-partition-0 to topic-partition-N from the kafka cluster]
  6. Scale out for the 1st time. [Diagram: one APP holding states 0, 1, 2 and 3, consuming topic-partition-0 to topic-partition-3 from the kafka cluster]
  7. Scale out for the 1st time. States are sharded across instances when the workload is split. [Diagram: two APP instances, states 0 and 1 on the first and states 2 and 3 on the second; topic-partition-0 to topic-partition-3 in the kafka cluster]
  8. Scale out for the 1st time. States are sharded across instances when the workload is split. [Diagram: three APP instances sharing states 0 to 3; topic-partition-0 to topic-partition-3 in the kafka cluster]
  9. Streams as a changelog of states. States are backed up inside Kafka in a changelog topic, which is compacted. [Diagram: my-application-state-partition-3-changelog, showing a topic partition, its topic segments (1 GB) and the compacted topic segments]
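
Why a compacted changelog is enough to rebuild a state store can be shown with a toy model. The sketch below treats a changelog as a list of `(key, value)` records, with `None` as a delete marker, and keeps only the latest record per key. It is a simplification under stated assumptions: real Kafka compaction works segment by segment, leaves the active segment alone, and retains delete markers for a while. The function names are hypothetical.

```python
def compact(log):
    """Keep only the most recent record for each key, in offset order.

    log: list of (key, value) tuples; value None means the key was deleted.
    """
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    # Drop keys whose latest record is a delete marker.
    return [(key, value) for key, (_, value) in survivors if value is not None]


def restore(log):
    """Rebuild a key-value state store by replaying a changelog."""
    state = {}
    for key, value in log:
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


if __name__ == "__main__":
    changelog = [("a", 1), ("b", 2), ("a", 3), ("b", None), ("c", 5)]
    # Replaying the compacted log yields the same state with fewer records.
    print(compact(changelog))
    print(restore(compact(changelog)) == restore(changelog))
```

The shorter the compacted log, the less there is to replay, which is why the size of the state drives recovery time.
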
  10. Scaling out, again. Having a persisted state speeds up application recovery: the application reaches the RUNNING status faster. [Diagram: one APP with states 0 to 3 and persisted copies of states 1, 2 and 3; topic-partition-0 to topic-partition-N in the kafka cluster]
  11. Scaling out, again. Having a persisted state speeds up application recovery: the application reaches the RUNNING status faster. [Diagram: two APP instances, states 0 and 1 on the first and states 2 and 3 on the second, with persisted copies of states 1, 2 and 3; topic-partition-0 to topic-partition-N in the kafka cluster]
  12. State migration. Large states imply high recovery times. StatefulSets are essential. Kafka Streams uses sticky assignment when rebalancing. Tuning the segment file rolling may help.
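
The idea behind sticky assignment is that a task stays with its previous owner whenever balance allows, so the local state does not have to be rebuilt. Here is a minimal sketch of that idea. It is not the algorithm Kafka Streams actually implements, and `sticky_assign` and the instance names are hypothetical.

```python
def sticky_assign(previous, members, tasks):
    """Balance tasks over members while moving as few tasks as possible.

    previous: dict member -> tasks it owned before the rebalance.
    members:  members of the group after the rebalance.
    tasks:    all tasks to distribute.
    """
    owner = {task: member for member, owned in previous.items() for task in owned}
    base, extra = divmod(len(tasks), len(members))
    # Members that held the most tasks get the spare slots, so they give up less.
    ranked = sorted(members, key=lambda m: -len(previous.get(m, [])))
    quota = {m: base + (1 if rank < extra else 0) for rank, m in enumerate(ranked)}

    assignment = {m: [] for m in members}
    moved = []
    for task in tasks:
        member = owner.get(task)
        if member in assignment and len(assignment[member]) < quota[member]:
            assignment[member].append(task)  # stays put: local state is reused
        else:
            moved.append(task)  # must migrate: state restored from the changelog
    for task in moved:
        target = next(m for m in members if len(assignment[m]) < quota[m])
        assignment[target].append(task)
    return assignment


if __name__ == "__main__":
    first = sticky_assign({"app-0": [0, 1, 2, 3]}, ["app-0", "app-1"], [0, 1, 2, 3])
    print(first)
    print(sticky_assign(first, ["app-0", "app-1", "app-2"], [0, 1, 2, 3]))
```

In both scale-out steps only the tasks that change owner need their state restored on the new instance; the others keep using what is already on disk, which is where persistent volumes from a StatefulSet pay off.
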
  13. Rebalance Protocol. [Diagram: the streaming application (consumer group) and the kafka broker (group coordinator) exchanging 1. Rebalance, 2. JoinRequest, 3. JoinResponse, 4. SyncGroup]
  14. Rebalance Protocol. 1. Rebalance: trigger a rebalance for the whole group. 2. JoinRequest: all active members ask to join the group. 3. JoinResponse: the Group Coordinator accepts. 4. SyncGroup: all accepted members ask for their workload.
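
The four steps on this slide can be walked through with a toy coordinator. This is a deliberately simplified sketch: in real Kafka the assignment is computed by a group leader on the client side and only distributed by the coordinator, whereas this model folds everything into one hypothetical `GroupCoordinator` class. It does show why every member is affected: each rebalance starts by clearing all assignments.

```python
class GroupCoordinator:
    """Toy model of the rebalance steps; not the Kafka wire protocol."""

    def __init__(self, partitions):
        self.partitions = list(partitions)
        self.generation = 0
        self.members = []
        self.assignment = {}

    def rebalance(self, active_members):
        # 1. Rebalance: the whole group stops and gives up its partitions.
        self.assignment = {}
        # 2. JoinRequest: every active member asks to join the group.
        join_requests = sorted(active_members)
        # 3. JoinResponse: the coordinator accepts them into a new generation.
        self.generation += 1
        self.members = join_requests
        # 4. SyncGroup: each accepted member asks for its workload.
        self.assignment = {m: self.sync_group(m) for m in self.members}
        return self.assignment

    def sync_group(self, member):
        index = self.members.index(member)
        # Every len(members)-th partition, starting at this member's index.
        return self.partitions[index::len(self.members)]


if __name__ == "__main__":
    coordinator = GroupCoordinator(partitions=[0, 1, 2, 3])
    print(coordinator.rebalance(["app-0"]))
    # Scaling out triggers a full rebalance, even for the member already running.
    print(coordinator.rebalance(["app-0", "app-1"]))
    print(coordinator.generation)
```
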
  15. Rebalance Protocol. Rebalances take time and affect all the members of the group. [Diagram: three APP instances consuming topic-partition-0, topic-partition-1 and topic-partition-2 from the kafka cluster]
  16. Rebalance Protocol. Rebalancing takes time, and when it happens, it impacts everyone. This is sometimes referred to as the stop-the-world effect. Incremental rebalancing may be a solution. Due to dynamic membership, we cannot avoid rebalancing when scaling out or upgrading.
  17. Conclusion: the future is bright. ❖ A lot of effort is being put into the standardization of storage for Kubernetes ➢ StatefulSets is in beta and it’s already a game changer ❖ A few improvements are in development to make scaling out Kafka Streams easier: ➢ Static Membership ➢ Incremental rebalancing
  18. Conclusion: the future is bright. [k8s] StatefulSet, Pods for stateful applications and distributed systems; [KIP-345] Introduce static membership protocol to reduce consumer rebalances; [KIP-415] Incremental Cooperative Rebalancing in Kafka Connect; [KIP-429] Kafka Consumer Incremental Rebalance Protocol; [KIP-441] Smooth Scaling Out for Kafka Streams
  19. Resources. • Related talks: ◦ Deploying Kafka Streams Applications with Docker and Kubernetes - by Gwen Shapira and Matthias J. Sax ◦ Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Were Afraid to Ask - by Matthias J. Sax ◦ Horizontal Pod Autoscaler Reloaded - Scale on Custom Metrics - by Maciej Pytel & Solly Ross • Pictures: ◦ Photo by Shwetha Shankar on Unsplash ◦ Photo by chuttersnap on Unsplash ◦ Photo by Bonnie Kittle on Unsplash