Kafka and Zookeeper on AWS

How to use Elastic Network Interfaces to smoothly recover from Zookeeper failures. Link to gist: https://gist.github.com/rcillo/1a64d757bf3ebaffcb3c71eb95607f1f

Ricardo de Cillo

November 07, 2017

Transcript

  1. How to smoothly recover from Zookeeper failures

    Kafka on AWS
    Ricardo de Cillo, Data engineer at Zalando, @rcillo
  2. A typical setup

    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    Zookeeper Ensemble: Node 1, Node 2, Node 3
  3. A typical setup… in real life

    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    Zookeeper Ensemble: Node 1, Node 2, Node 3
  4. A typical setup… in real life

    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
  5. A typical setup… in real life

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
  6. A typical setup… in real life

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties
  7. A typical setup… in real life

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties: zookeeper.connect
  8. A typical setup… in real life

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties: zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
  9. A failure scenario

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties: zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
  10. What happens

    1. Everything keeps working (quorum satisfied)
    2. Maybe a Zookeeper leadership election
    3. Kafka-Zookeeper client gracefully reconnects to the remaining nodes
  11. What happens

    1. Everything keeps working (quorum satisfied)
    2. Maybe a Zookeeper leadership election
    3. Kafka-Zookeeper client gracefully reconnects to the remaining nodes
    4. Users probably don't even notice
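
"Quorum satisfied" comes down to simple arithmetic: a Zookeeper ensemble keeps serving requests while a strict majority of its nodes is alive. The sketch below spells that out for the three-node ensemble in the slides; the function names are made up for illustration.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size: int, alive: int) -> bool:
    """True while the surviving nodes still form a majority."""
    return alive >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can be lost before the quorum is lost."""
    return ensemble_size - quorum_size(ensemble_size)


# Three-node ensemble from the slides, one node down.
print(has_quorum(3, alive=2))   # prints: True
print(has_quorum(3, alive=1))   # prints: False
print(tolerated_failures(3))    # prints: 1
```

With three nodes, losing one leaves two out of three, which is still a majority; losing a second one is not survivable.
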
  12. Recovering

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties: zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
  13. Recovering

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.130
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.130
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties: zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.130
  14. Recovering

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.130
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.130
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties: zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
  20. Recovering

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.130
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.130
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties: zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.130
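
In the Recovering slides, the replacement Zookeeper node comes up as 10.0.0.130 while the brokers' `zookeeper.connect` still lists 10.0.0.13 until the broker configuration catches up. A rough way to spot that drift is to compare the two address sets. The helper names below are hypothetical, and the parser assumes IPv4 addresses or hostnames with an optional `:port` and an optional chroot suffix.

```python
from typing import Iterable, Set


def parse_connect(value: str) -> Set[str]:
    """Extract the hosts from a zookeeper.connect value.

    Ports and any trailing chroot path are dropped. IPv6 literals are
    not handled in this sketch.
    """
    hosts_part = value.split("/", 1)[0]
    return {
        entry.strip().rsplit(":", 1)[0]
        for entry in hosts_part.split(",")
        if entry.strip()
    }


def stale_hosts(broker_connect: str, ensemble_ips: Iterable[str]) -> Set[str]:
    """Addresses the broker still points at that no longer exist."""
    return parse_connect(broker_connect) - set(ensemble_ips)


def missing_hosts(broker_connect: str, ensemble_ips: Iterable[str]) -> Set[str]:
    """Ensemble addresses the broker does not know about yet."""
    return set(ensemble_ips) - parse_connect(broker_connect)


# State from the slides: node 3 was replaced, the broker config was not updated.
broker_value = "10.0.0.11,10.0.0.12,10.0.0.13"
ensemble_now = ["10.0.0.11", "10.0.0.12", "10.0.0.130"]

print(stale_hosts(broker_value, ensemble_now))    # prints: {'10.0.0.13'}
print(missing_hosts(broker_value, ensemble_now))  # prints: {'10.0.0.130'}
```

Once both sets are empty, the broker configuration matches the ensemble again, which is the state shown on the last Recovering slide.
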
  24. My personal opinion

    - Kafka was not designed to run on disposable hardware like EC2 instances
  25. My personal opinion

    - Kafka was not designed to run on disposable hardware like EC2 instances
    - It works better with static IPs
  26. My personal opinion

    - Kafka was not designed to run on disposable hardware like EC2 instances
    - It works better with static IPs
    - It does not work with zookeeper.connect=domain-name
  27. My personal opinion

    - Kafka was not designed to run on disposable hardware like EC2 instances
    - It works best with static IPs
    - It does not work with zookeeper.connect=domain-name
    - Kafka reloads config
  29. How we improved it

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties: zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
  30. How we improved it

    Zookeeper + Exhibitor: Node 1: 10.0.0.1, Node 2: 10.0.0.2, Node 3: 10.0.0.3
    Elastic Network Interface IPs: 10.0.0.11, 10.0.0.12, 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties: zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
  32. How we improved it

    Zookeeper + Exhibitor: Node 1: 10.0.0.1, Node 2: 10.0.0.2, Node 3: 10.0.0.4
    Elastic Network Interface IPs: 10.0.0.11, 10.0.0.12, 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties: zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
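
The improvement keeps the addresses 10.0.0.11 to 10.0.0.13 on Elastic Network Interfaces, so a replacement instance (10.0.0.4 in place of 10.0.0.3) can take over the old address and neither the S3 config nor `zookeeper.connect` has to change. The sketch below shows one way a new instance could claim its interface at boot. It is not taken from the linked gist: the tag-based lookup, the tag value, and the function name are assumptions, and `ec2` is expected to behave like `boto3.client("ec2")`.

```python
from typing import Any, Optional


def attach_zookeeper_eni(ec2: Any, instance_id: str, node_name: str,
                         device_index: int = 1) -> Optional[str]:
    """Attach the free ENI reserved for one Zookeeper node to an instance.

    Assumption: each ENI carries a Name tag such as "zookeeper-3" and is
    in the "available" state once its previous instance is gone.
    Returns the ENI id, or None when no free interface was found.
    """
    response = ec2.describe_network_interfaces(
        Filters=[
            {"Name": "tag:Name", "Values": [node_name]},
            {"Name": "status", "Values": ["available"]},
        ]
    )
    interfaces = response["NetworkInterfaces"]
    if not interfaces:
        return None

    eni_id = interfaces[0]["NetworkInterfaceId"]
    # Device index 0 is the primary interface, so the ENI goes on index 1.
    ec2.attach_network_interface(
        NetworkInterfaceId=eni_id,
        InstanceId=instance_id,
        DeviceIndex=device_index,
    )
    return eni_id
```

With boto3 installed and credentials configured, a call could look like `attach_zookeeper_eni(boto3.client("ec2"), instance_id, "zookeeper-3")`, where the instance id is the one of the freshly started node. The instance keeps its own primary address, while Zookeeper and Kafka only ever refer to the ENI address.
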
  33. =)