Kafka and Zookeeper on AWS

How to use Elastic Network Interfaces to smoothly recover from Zookeeper failures. Link to gist: https://gist.github.com/rcillo/1a64d757bf3ebaffcb3c71eb95607f1f

Ricardo de Cillo

November 07, 2017

Transcript

  1. How to smoothly recover from Zookeeper failures

    Ricardo de Cillo, Data engineer at Zalando, @rcillo
    Kafka on AWS
  2. A typical setup

    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    Zookeeper Ensemble: Node 1, Node 2, Node 3
  3. A typical setup… in real life

    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    Zookeeper Ensemble: Node 1, Node 2, Node 3
  4. A typical setup… in real life

    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
  5. A typical setup… in real life

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
  6. A typical setup… in real life

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties
  7. A typical setup… in real life

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties, zookeeper.connect
  8. A typical setup… in real life

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties, zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
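
The "zookeeper connect" entry on the slides above corresponds to the `zookeeper.connect` key in each broker's `server.properties`, which takes a comma-separated list of `host:port` pairs. A minimal sketch of rendering that line from the ensemble addresses follows. The helper name is hypothetical, and the port is an assumption: the slides show no port, so the sketch uses Zookeeper's default client port 2181.

```python
DEFAULT_CLIENT_PORT = 2181  # Zookeeper's default client port; assumed, not shown on the slides


def zookeeper_connect_line(hosts, port=DEFAULT_CLIENT_PORT):
    """Render the zookeeper.connect line for a broker's server.properties.

    Hypothetical helper for illustration only.
    """
    if not hosts:
        raise ValueError("at least one Zookeeper host is required")
    return "zookeeper.connect=" + ",".join(f"{host}:{port}" for host in hosts)


# Addresses taken from the slides.
ensemble = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
print(zookeeper_connect_line(ensemble))
# prints: zookeeper.connect=10.0.0.11:2181,10.0.0.12:2181,10.0.0.13:2181
```

Because the addresses are written into a static file, the broker only knows the ensemble members that existed when the file was generated.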
  9. A failure scenario

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties, zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
  10. What happens

    1. Everything keeps working (Quorum satisfied)
    2. Maybe a Zookeeper leadership election
    3. Kafka-Zookeeper client gracefully reconnects to the remaining nodes
  11. What happens

    1. Everything keeps working (Quorum satisfied)
    2. Maybe a Zookeeper leadership election
    3. Kafka-Zookeeper client gracefully reconnects to the remaining nodes
    4. Users probably don't even notice
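
"Quorum satisfied" in the list above means a strict majority of the ensemble is still alive. The arithmetic can be sketched as follows (the function names are hypothetical, chosen for illustration):

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size, alive):
    """True if the surviving nodes still form a majority."""
    return alive >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size):
    """How many nodes can fail before the quorum is lost."""
    return ensemble_size - quorum_size(ensemble_size)


# The three-node ensemble from the slides, with one node down.
print(quorum_size(3))         # prints: 2
print(has_quorum(3, 2))       # prints: True
print(tolerated_failures(3))  # prints: 1
```

With three nodes the ensemble survives one failure, which is why losing Node 3 alone does not interrupt the brokers.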
  12. Recovering

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties, zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
  13. Recovering

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.130
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.130
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties, zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.130
  14. Recovering

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.130
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.130
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties, zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
  20. Recovering

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.130
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.130
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties, zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.130
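
In the recovery sequence above, the replacement node comes up with a new address (10.0.0.130) and the S3 configs follow it, while the brokers' zookeeper connect list keeps naming 10.0.0.13 until it is updated. A small sketch of detecting that drift between the two lists, using a hypothetical helper name:

```python
def config_drift(broker_hosts, ensemble_hosts):
    """Compare a broker's zookeeper.connect hosts with the live ensemble.

    Hypothetical helper for illustration only.
    Returns hosts the broker still lists but that no longer exist ("stale")
    and live hosts the broker does not know about ("missing").
    """
    broker, ensemble = set(broker_hosts), set(ensemble_hosts)
    return {
        "stale": sorted(broker - ensemble),
        "missing": sorted(ensemble - broker),
    }


# Addresses taken from the slides: Node 3 was replaced by 10.0.0.130.
broker_connect = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
live_ensemble = ["10.0.0.11", "10.0.0.12", "10.0.0.130"]

print(config_drift(broker_connect, live_ensemble))
# prints: {'stale': ['10.0.0.13'], 'missing': ['10.0.0.130']}
```

While the drift exists, the broker can reach only two of the three live nodes, so a second Zookeeper failure would leave it with a single usable address.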
  24. My personal opinion

    - Kafka was not designed to run on disposable hardware like EC2 instances
  25. My personal opinion

    - Kafka was not designed to run on disposable hardware like EC2 instances
    - It works better with static IPs
  26. My personal opinion

    - Kafka was not designed to run on disposable hardware like EC2 instances
    - It works better with static IPs
    - It does not work with zookeeper.connect=domain-name
  27. My personal opinion

    - Kafka was not designed to run on disposable hardware like EC2 instances
    - It works best with static IPs
    - It does not work with zookeeper.connect=domain-name
    - Kafka reloads config
  29. How we improved it

    Zookeeper + Exhibitor: Node 1: 10.0.0.11, Node 2: 10.0.0.12, Node 3: 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties, zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
  30. How we improved it

    Zookeeper + Exhibitor: Node 1: 10.0.0.1, Node 2: 10.0.0.2, Node 3: 10.0.0.3
    10.0.0.11, 10.0.0.12, 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties, zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
  32. How we improved it

    Zookeeper + Exhibitor: Node 1: 10.0.0.1, Node 2: 10.0.0.2, Node 3: 10.0.0.4
    10.0.0.11, 10.0.0.12, 10.0.0.13
    S3 configs: 1:10.0.0.11, 2:10.0.0.12, 3:10.0.0.13
    Kafka Cluster: Broker 1, Broker 2, ..., Broker N
    server.properties, zookeeper.connect: 10.0.0.11, 10.0.0.12, 10.0.0.13
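
The improvement above can be modeled in a few lines. This is only a sketch under stated assumptions: it reads the unlabeled addresses 10.0.0.11 to 10.0.0.13 as the Elastic Network Interface addresses mentioned in the deck description, pairs them with the nodes in order, and uses hypothetical class and function names. It does not call any AWS API; on EC2 the move itself would be a detach and attach of the network interface.

```python
from dataclasses import dataclass


@dataclass
class ZookeeperNode:
    instance_ip: str  # primary address; changes when the instance is replaced
    eni_ip: str       # address of the attached network interface; stays fixed


def replace_instance(node, new_instance_ip):
    """Model replacing a failed instance: the ENI moves to the new instance."""
    return ZookeeperNode(instance_ip=new_instance_ip, eni_ip=node.eni_ip)


# Addresses taken from the slides; the node-to-ENI pairing is an assumption.
ensemble = [
    ZookeeperNode("10.0.0.1", "10.0.0.11"),
    ZookeeperNode("10.0.0.2", "10.0.0.12"),
    ZookeeperNode("10.0.0.3", "10.0.0.13"),
]

connect_before = [node.eni_ip for node in ensemble]

# Node 3 fails and is replaced by an instance with primary address 10.0.0.4.
ensemble[2] = replace_instance(ensemble[2], "10.0.0.4")

connect_after = [node.eni_ip for node in ensemble]
print(connect_before == connect_after)  # prints: True
```

Because brokers and the S3 configs only ever reference the interface addresses, replacing an instance requires no change to `zookeeper.connect`.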
  33. =)