Kafka Streams on k8s, the difficulties

Loïc DIVAD

June 04, 2019

Transcript

  1. @loicmdivad #ContainerDayFR k8s and the support for custom metrics
    custom-metrics-stackdriver-adapter & prometheus-to-sd
    [Diagram labels: APP, JMX Metrics]
  2. Streaming apps auto-scaling: the implementation
    [Diagram labels: Stackdriver; Metrics Server (https://gcr.io/google-containers/custom-metrics-stackdriver-adapter); Prometheus to Stackdriver (https://gcr.io/google-containers/prometheus-to-sd); k8s master; Your Streaming App (https://docs.confluent.io/current/streams/index.html)]
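The auto-scaling setup on this slide ends with the Horizontal Pod Autoscaler comparing a custom metric to a target value. As a rough sketch, the replica count can be derived with the formula from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric). The choice of metric (consumer lag per pod), the function name and the replica bounds are assumptions made for illustration; they do not come from the talk, and the real autoscaler also applies a tolerance that is left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Replica count from the documented HPA ratio, clamped to [min, max].

    current_metric / target_metric are hypothetical per-pod values,
    e.g. an average consumer lag exported through JMX.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Lag is three times the target: scale from 2 to 6 pods.
    print(desired_replicas(2, 3000, 1000))
    # Lag is half the target: scale from 4 down to 2 pods.
    print(desired_replicas(4, 500, 1000))
```

For a Kafka Streams application, `max_replicas` would sensibly be set no higher than the number of input partitions, since extra instances receive no work.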
  3. Kafka and the group management protocol
    The maximum parallelism is determined by the number of partitions.
    [Diagram labels: broker.id 0, broker.id 1, ..., broker.id N; topic-partition-0, topic-partition-1, topic-partition-2, ..., topic-partition-N; four APP instances]
  4. Kafka and the group management protocol
    The maximum parallelism is determined by the number of partitions.
    [Diagram labels: kafka cluster; topic-partition-0, topic-partition-1, topic-partition-2, topic-partition-N; four APP instances]
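The partition limit stated on this slide can be illustrated with a small simulation. The sketch below spreads partitions over instances round-robin and then lists which instances actually received work. Round-robin is a simplifying assumption, not the assignor Kafka Streams uses, and the instance names are hypothetical.

```python
def assign_partitions(num_partitions, instances):
    """Spread partition ids over instances round-robin (illustrative only)."""
    if not instances:
        raise ValueError("at least one instance is required")
    assignment = {name: [] for name in instances}
    for partition in range(num_partitions):
        owner = instances[partition % len(instances)]
        assignment[owner].append(partition)
    return assignment


def active_instances(assignment):
    """Instances that own at least one partition."""
    return [name for name, parts in assignment.items() if parts]


if __name__ == "__main__":
    apps = [f"app-{i}" for i in range(6)]
    result = assign_partitions(4, apps)
    # Only four of the six instances get a partition; the rest stay idle.
    print(result)
    print(active_instances(result))
```

With 4 partitions and 6 instances, two instances own nothing, which is why scaling beyond the partition count adds no throughput.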
  5. Single consumer
    Most streaming applications have state.
    [Diagram labels: APP; topic-partition-0, topic-partition-1, topic-partition-2, topic-partition-N; kafka cluster]
  6. Scale out for the 1st time
    [Diagram labels: one APP instance with states 0, 1, 2, 3; topic-partition-0, topic-partition-1, topic-partition-2, topic-partition-3; kafka cluster]
  7. Scale out for the 1st time
    States are sharded across instances when the workload is split.
    [Diagram labels: two APP instances; states 0, 1, 2, 3; topic-partition-0, topic-partition-1, topic-partition-2, topic-partition-3; kafka cluster]
  8. Scale out for the 1st time
    States are sharded across instances when the workload is split.
    [Diagram labels: three APP instances; states 0, 1, 2, 3; topic-partition-0, topic-partition-1, topic-partition-2, topic-partition-3; kafka cluster]
  9. Streams as a changelog of states
    States are backed up inside Kafka with a changelog topic, which is compacted.
    [Diagram labels: my-application-state-partition-3-changelog; Topic Partition; Topic Segments; Compacted Topic Segments; 1 GB]
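To make the idea of a compacted changelog concrete, here is a minimal in-memory sketch. It models the changelog as a list of (key, value) records, keeps only the latest record per key, and rebuilds a state store by replaying the log. The function names and the use of `None` as a delete marker (tombstone) are assumptions for illustration; real compaction runs per segment on the broker and eventually drops tombstones as well.

```python
def compact(changelog):
    """Keep only the most recent record for each key, in log order."""
    latest = {}
    for key, value in changelog:
        # Re-insert so the key's position reflects its latest write.
        latest.pop(key, None)
        latest[key] = value
    return list(latest.items())


def restore(changelog):
    """Rebuild a key-value state store by replaying the changelog."""
    state = {}
    for key, value in changelog:
        if value is None:
            state.pop(key, None)  # tombstone: the key was deleted
        else:
            state[key] = value
    return state


if __name__ == "__main__":
    log = [("a", 1), ("b", 2), ("a", 3), ("b", None), ("c", 4)]
    compacted = compact(log)
    print(len(log), "records before,", len(compacted), "after compaction")
    # Replaying the compacted log yields the same state as the full log.
    print(restore(compacted) == restore(log))
```

The point of the sketch is that restoring from the compacted log gives the same state while reading fewer records, which is what bounds recovery time.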
  10. Scaling out, again
    Having a persisted state speeds up the application recovery: the application reaches the RUNNING status faster.
    [Diagram labels: one APP instance; states 0, 1, 2, 3; topic-partition-0, topic-partition-1, topic-partition-2, topic-partition-N; kafka cluster]
  11. Scaling out, again
    Having a persisted state speeds up the application recovery: the application reaches the RUNNING status faster.
    [Diagram labels: two APP instances; states 0, 1, 2, 3; topic-partition-0, topic-partition-1, topic-partition-2, topic-partition-N; kafka cluster]
  12. State migration
    Large states imply high recovery times.
    StatefulSets are essential.
    Kafka Streams uses sticky assignment when rebalancing.
    Tuning the segment file rolling may help.
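The sticky assignment mentioned on this slide can be sketched as follows. The idea is to leave each partition with its previous owner whenever that keeps the load balanced, and to move only what is needed. This is a simplified illustration written for this transcript, not the actual Kafka Streams assignor; the function names and instance names are hypothetical.

```python
def sticky_assign(partitions, members, previous):
    """Balanced assignment that keeps partitions on their previous owner.

    previous maps partition id -> former owner (may be empty).
    """
    if not members:
        raise ValueError("at least one member is required")
    # Every member gets `base` partitions; `extra` members get one more.
    base, extra = divmod(len(partitions), len(members))
    assignment = {m: [] for m in members}
    unassigned = []
    for p in partitions:
        owner = previous.get(p)
        if owner in assignment:
            held = len(assignment[owner])
            if held < base:
                assignment[owner].append(p)
                continue
            if held == base and extra > 0:
                assignment[owner].append(p)
                extra -= 1
                continue
        unassigned.append(p)
    # Partitions that lost their owner go to the least loaded member.
    for p in unassigned:
        target = min(members, key=lambda m: len(assignment[m]))
        assignment[target].append(p)
    return assignment


def moved(previous, assignment):
    """Partitions whose owner changed, i.e. whose state must migrate."""
    now = {p: m for m, parts in assignment.items() for p in parts}
    return sorted(p for p, m in previous.items() if now.get(p) != m)


if __name__ == "__main__":
    before = {0: "app-0", 1: "app-0", 2: "app-1", 3: "app-1"}
    after = sticky_assign([0, 1, 2, 3], ["app-0", "app-1", "app-2"], before)
    print(after)
    print("partitions to migrate:", moved(before, after))
```

In this example only one partition changes owner when a third instance joins, so only one state shard has to be restored on the new instance.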
  13. Rebalance Protocol
    [Diagram labels: kafka broker (Group Coordinator); Streaming Application (Consumer Group); steps: 1 Rebalance, 2 JoinRequest, 3 JoinResponse, 4 SyncGroup]
  14. Rebalance Protocol
    1. Rebalance: trigger a rebalance for the whole group
    2. JoinRequest: all active members ask to join the group
    3. JoinResponse: the Group Coordinator accepts
    4. SyncGroup: all accepted new members ask for their workload
    [Diagram labels: kafka broker (Group Coordinator); Streaming Application (Consumer Group)]
  15. Rebalance Protocol
    Rebalances take time and affect all the members of the group.
    [Diagram labels: topic-partition-0, topic-partition-1, topic-partition-2; kafka cluster; three APP instances]
  16. Rebalance Protocol
    Rebalancing takes time, and when it happens, it impacts everyone.
    This is sometimes referred to as the stop-the-world effect.
    Incremental rebalancing may be a solution.
    Due to dynamic membership, we cannot avoid rebalancing when scaling out or upgrading.
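The difference between the stop-the-world behaviour and incremental rebalancing on this slide can be sketched by counting which partitions stop being processed during a rebalance. In the eager model every member gives up everything it owns; in the incremental model only the partitions that change owner are revoked. This is a deliberately simplified model of the idea, with hypothetical function names, and it does not reproduce the protocol messages.

```python
def revoked_eager(old):
    """Eager rebalance: every member revokes all of its partitions."""
    return sorted(p for parts in old.values() for p in parts)


def revoked_incremental(old, new):
    """Incremental rebalance: revoke only partitions that change owner."""
    new_owner = {p: m for m, parts in new.items() for p in parts}
    return sorted(
        p
        for member, parts in old.items()
        for p in parts
        if new_owner.get(p) != member
    )


if __name__ == "__main__":
    old = {"app-0": [0, 1], "app-1": [2, 3]}
    new = {"app-0": [0, 1], "app-1": [2], "app-2": [3]}
    print("eager pauses:", revoked_eager(old))
    print("incremental pauses:", revoked_incremental(old, new))
```

Here a third instance joining pauses all four partitions in the eager model but only one in the incremental model.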
  17. Conclusion: the future is bright
    ❖ A lot of effort is put into the standardization of storage for Kubernetes
      ➢ StatefulSets is in beta and it’s already a game changer
    ❖ A few improvements are in development to make scaling out Kafka Streams easier:
      ➢ Static Membership
      ➢ Incremental rebalancing
  18. Conclusion: the future is bright
    [k8s] StatefulSet, Pods for stateful applications and distributed systems
    [KIP-345] Introduce static membership protocol to reduce consumer rebalances
    [KIP-415] Incremental Cooperative Rebalancing in Kafka Connect
    [KIP-429] Kafka Consumer Incremental Rebalance Protocol
    [KIP-441] Smooth Scaling Out for Kafka Streams
  19. Resources
    • Related talks:
      ◦ Deploying Kafka Streams Applications with Docker and Kubernetes - by Gwen Shapira and Matthias J. Sax
      ◦ Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Were Afraid to Ask - by Matthias J. Sax
      ◦ Horizontal Pod Autoscaler Reloaded - Scale on Custom Metrics - by Maciej Pytel & Solly Ross
    • Pictures:
      ◦ Photo by Shwetha Shankar on Unsplash
      ◦ Photo by chuttersnap on Unsplash
      ◦ Photo by Bonnie Kittle on Unsplash