
Kafka Cluster Design Pattern Fall -JJUG CCC 2021 Fall-

keigodasu
November 21, 2021


Transcript

  1. Who am I? 2 Keigo Suda • Working as a

    Solutions Architect/Professional Services at Confluent
  2. What’s Kafka? 5 • Kafka itself is a simple and powerful messaging system. • Its core capabilities are ◦ To write and read streams of events with high performance ◦ To store streams of events durably and reliably • Messages are grouped into “Topics”, which Producers write to and Consumers read from.
  3. Topic/Partition 7 • Topic consists of one or more partitions. • Partition is the smallest unit that holds a subset of records. (Diagram: a topic with partitions 1-4, a Producer writing and a Consumer reading)
  4. Partition & Replica 8 • Kafka replicates the messages in each partition across brokers. • The partition that Producers and Consumers talk to is the Leader partition; replicas that fetch messages from the Leader are Follower partitions. • When a broker fails, a Follower partition is automatically promoted to Leader. (Diagram: Topic A partitions 1-3, each with one Leader and two Followers spread across Broker01-03.) A topic with this layout can be created as in the sketch below.
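A minimal sketch of creating such a topic with the Java AdminClient; the topic name "topic-a" and the bootstrap address are assumptions, not values from the slides.

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.List;
import java.util.Properties;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker01:9092");   // assumed address
        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions, each replicated to 3 brokers (1 Leader + 2 Followers)
            NewTopic topicA = new NewTopic("topic-a", 3, (short) 3);
            admin.createTopics(List.of(topicA)).all().get();
        }
    }
}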
  5. Rack Awareness 9 • Rack Awareness (broker.rack = “AZ01” / “AZ02” / “AZ03”) enables spreading replicas across different locations. • This gives Kafka robustness against broker, rack, AZ, and DC failures. (Diagram: the Topic A replicas from the previous slide spread across Broker01-03, one broker per AZ)
  6. Producer • Producer sends messages to multiple partitions based on a partitioning strategy. • The Key in a Message<Key, Value> is used to decide the partition: the Partitioner computes partition = hash(key) % numPartitions (see the keyed-producer sketch below).
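A minimal keyed-producer sketch; the topic name, key, and bootstrap address are assumptions. Records with the same key always land in the same partition because the default partitioner hashes the key.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

public class KeyedProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker01:9092");   // assumed address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Default partitioner: partition chosen from hash(key) % numPartitions for keyed records
            producer.send(new ProducerRecord<>("topic-a", "user-42", "some event"));
        }
    }
}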
  7. Producer - acks - 11 • Kafka provides a parameter that decides when an acknowledgment is returned to the Producer (acks) ◦ acks = 0 ▪ Producer does not wait for any acknowledgment from Brokers ◦ acks = 1 ▪ Producer waits until the Leader partition writes the record to its local log ◦ acks = all (-1) ▪ Producer waits until the Leader partition writes the record to its local log and enough Followers are in sync to satisfy min.insync.replicas (see the producer-config sketch below). (Diagram: Topic A Partition 1 Leader on Broker01, Followers on Broker02/03 fetching from it, min.insync.replicas = 2)
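A minimal sketch of a durability-oriented producer configuration; the topic name and broker addresses are assumptions. Note that min.insync.replicas is a broker/topic-side setting, not a producer setting.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

public class DurableProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker01:9092,broker02:9092"); // assumed addresses
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("acks", "all");   // wait for the Leader plus enough in-sync Followers (min.insync.replicas)
        // acks=0: do not wait at all; acks=1: wait only for the Leader's local write
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("topic-a", "user-42", "some event"));
        }
    }
}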
  8. Consuming From Kafka - Grouped Consumers • Consumer group is a set of consumers that cooperate to consume data from topics. • When the group membership changes, the assigned partitions are re-assigned among the members. • After fetching messages, each consumer commits its offset to track its position (see the consumer sketch below).
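A minimal grouped-consumer sketch; the topic name, group.id, and bootstrap address are assumptions. Consumers sharing a group.id split the topic's partitions among themselves and commit offsets so they can resume where they left off.

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class GroupedConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker01:9092");   // assumed
        props.put("group.id", "analytics-group");          // assumed group name
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("enable.auto.commit", "false");           // commit manually below
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("topic-a"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
                consumer.commitSync(); // record this group's position (offset)
            }
        }
    }
}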
  9. Points to Consider 14 • RTO/RPO requirement? • What are

    producer requirements? ◦ Produce into single DC? or dual produce? • What are consumer requirements? ◦ Does order matter to consumers? ◦ Does consumer handle duplicates? • Failover Requirements ◦ No Auto Client Failover ◦ No Offset Preservation • Asynchronous Replication acceptable? or Synchronous needed? • Data governance?
  10. Cluster Design Patterns 15 • Single Cluster Pattern ◦ one cluster stretched over one or multiple locations • Multi Cluster Pattern ◦ multiple clusters located in separate locations
  11. Single Cluster Pattern 17 • Single Cluster Pattern is ◦ one cluster stretched over one or multiple locations ◦ based on synchronous replication • Its variants are ◦ One Location Cluster ◦ Stretched Cluster ▪ 2 DCs ▪ 2.5 DCs ▪ 3 DCs
  12. One Location Cluster 18 Description: One cluster in a single DC; the simplest of the typical cluster patterns. It can handle node failure without any data loss, but it cannot continue operations through a DC failure. RPO = 0, and RTO is the time it takes to promote a replica partition to leader. Note: Fundamental cluster layout. (Diagram: Producer and Consumer reading/writing a Topic on three Brokers, with a three-node Zookeeper quorum, all in one DC)
  13. Stretched Cluster 19 • Stretched Cluster is one big cluster stretched over multiple DCs. • A robust cluster can be set up across DCs, and failures are handled more easily with more DC locations. • Messages are synchronously replicated over multiple locations. • Low network latency between all DCs is required. (Diagrams: 2 DCs with a hierarchical Zookeeper quorum, 2.5 DCs, and 3 DCs layouts)
  14. Stretched Cluster - 2 DCs - 20 Description: One cluster stretched over 2 DCs. Kafka Brokers are clustered across the 2 DCs and Zookeeper runs as a hierarchical quorum; each Broker connects to its local Zookeeper group. On a DC failure, RTO > 0 because the failed Zookeeper group must be removed from the quorum. RPO depends on the min.insync.replicas setting. Note: You need to select a failover strategy (Consistency vs Availability). (Diagram: Producers and Consumers reading/writing the Topic in both DCs, each DC running three Brokers and a three-node ZK group)
  15. Zookeeper Hierarchical Quorum 21 • Hierarchical Quorum is a quorum of quorums: zk1-zk3 form group1 in DC #1 and zk4-zk6 form group2 in DC #2.
    server.1=zk1:2888:3888
    server.2=zk2:2888:3888
    server.3=zk3:2888:3888
    server.4=zk4:2888:3888
    server.5=zk5:2888:3888
    server.6=zk6:2888:3888
    group.1=1:2:3
    group.2=4:5:6
  16. When DC Failure 22 (Initial settings: acks=all, min.insync.replicas=3, chosen so that min.insync.replicas > replication-factor / 2.) Failover: 1. DC#1 failure occurs. 2. A new Leader Partition cannot be elected. 3. Remove the Zookeeper servers and group of the failed DC from the configuration and restart Zookeeper. 4. A new Leader Partition is elected on DC#2. 5. Change min.insync.replicas from 3 to 2. 6. Producers can send messages to the new Leader Partition. Failback: 1. Restore the Zookeeper hierarchical quorum. 2. Restore min.insync.replicas. The Zookeeper configuration after removing the failed DC#1 group:
    #server.1=zk1:2888:3888
    #server.2=zk2:2888:3888
    #server.3=zk3:2888:3888
    server.4=zk4:2888:3888
    server.5=zk5:2888:3888
    server.6=zk6:2888:3888
    #group.1=1:2:3
    #group.2=4:5:6
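A minimal sketch of lowering min.insync.replicas during the failover step above, using the Java AdminClient; the topic name and the surviving-broker address are assumptions.

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class LowerMinIsrExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "dc2-broker01:9092"); // assumed surviving broker in DC#2
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "topic-a");
            // Lower min.insync.replicas from 3 to 2 so acks=all producers can make progress
            AlterConfigOp lowerMinIsr =
                    new AlterConfigOp(new ConfigEntry("min.insync.replicas", "2"), AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, List.of(lowerMinIsr))).all().get();
        }
    }
}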
  17. Stretched Cluster - 2.5 DCs - 23 Description: One cluster stretched over 2 DCs, plus a third DC running only a single Zookeeper so that Zookeeper can maintain its quorum across the 3 DCs. RPO and RTO are 0 when the DC running only the single Zookeeper fails. Note: In terms of RPO, the Consistency vs Availability trade-off still exists when one of the DCs running Kafka brokers fails. (Diagram: two DCs with Brokers + ZK serving Producers and Consumers, and DC#3 with a single ZK)
  18. Stretched Cluster - 3 DCs - 24 Description: One cluster stretched over 3 DCs. RTO and RPO are 0 when one DC fails. This pattern is the simplest and most robust among all patterns. Note: This pattern is very common in public clouds (using multiple AZs). (Diagram: three DCs, each with Brokers + ZK serving local Producers and Consumers)
  19. Multi Cluster Pattern 26 • Multi Cluster Pattern is ◦ multiple clusters located in separate locations ◦ asynchronous replication between clusters • Its variants are ◦ Active - Passive ◦ Active - Active ◦ Aggregation
  20. Active - Passive 27 Description: The primary cluster (Active) mirrors data to a standby cluster (Passive) using MM2. When the active site fails, you need to move Producer and Consumer applications to the standby site. RTO depends on how warmed up the standby site is. As for RPO, data loss can happen because MM2 copies data asynchronously from the active site to the standby site. Note: MM2 runs on the destination site. Applications can independently consume the mirrored data on the standby site. (Diagram: Producer/Consumer on the DC#1 cluster, MM2 DC#1 → DC#2 mirroring the Topic, and a Consumer reading it on DC#2)
  21. Active - Active 28 Description: Two clusters mirror each other bidirectionally; messages produced at either cluster are mirrored to the other. RPO may be lower than in the Active-Passive pattern because both sites are hot, but you need to decide which site is active when a problem happens. Note: Because MM2 creates destination topics with a cluster-alias prefix, consumer applications may have to subscribe to topics by prefix, e.g. *.Topic (see the regex-subscription sketch below). (Diagram: each DC holds its local Topic plus the mirrored DC#1.Topic / DC#2.Topic, with MM2 DC#1 → DC#2 and MM2 DC#2 → DC#1)
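A minimal sketch of subscribing by regular expression so one consumer reads both the local topic and its MM2-prefixed mirror; the topic name "Topic", group.id, and bootstrap address are assumptions.

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.util.Properties;
import java.util.regex.Pattern;

public class PrefixedTopicConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "dc2-broker01:9092");   // assumed
        props.put("group.id", "active-active-consumer");       // assumed
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // Matches "Topic" as well as any "<cluster-alias>.Topic" created by MM2 (e.g. DC#1.Topic)
        consumer.subscribe(Pattern.compile("([^.]+\\.)?Topic"));
    }
}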
  22. Aggregation Description: This pattern aggregates messages from multiple clusters into another, central cluster, so that messages generated across multiple clusters can be analyzed centrally. Note: Hybrid and multi-cloud architectures are similar to this pattern. (Diagram: DC#1 and DC#2 clusters mirrored via MM2 into a Center DC cluster holding DC#1.Topic and DC#2.Topic, read by a Consumer)
  23. How to care about Consumer Offsets? 30 • In a multi-cluster layout, how can consumers resume from where they left off on the source cluster? • MM2 supports syncing committed offsets via the RemoteClusterUtils API. ◦ Transferring Commit Offset with MirrorMaker 2 • MirrorCheckpointConnector tracks offsets for consumer groups, and consumers can resume on the destination cluster by translating those offsets with RemoteClusterUtils (see the sketch below).
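A minimal sketch, assuming the source cluster alias is "DC#1" and the group name and bootstrap address shown here: RemoteClusterUtils reads MM2's checkpoints on the destination cluster and returns translated offsets, which the consumer can seek to before resuming.

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.connect.mirror.RemoteClusterUtils;
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class ResumeOnDestinationExample {
    public static void main(String[] args) throws Exception {
        Map<String, Object> destProps = new HashMap<>();
        destProps.put("bootstrap.servers", "dc2-broker01:9092");   // assumed destination cluster

        // Translate the group's committed offsets that were checkpointed from source cluster "DC#1"
        Map<TopicPartition, OffsetAndMetadata> translated =
                RemoteClusterUtils.translateOffsets(destProps, "DC#1", "analytics-group", Duration.ofSeconds(30));

        Properties props = new Properties();
        props.put("bootstrap.servers", "dc2-broker01:9092");       // assumed
        props.put("group.id", "analytics-group");                  // assumed
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(translated.keySet());
            translated.forEach((tp, offset) -> consumer.seek(tp, offset.offset()));
            // poll() now continues from where the group left off on the source cluster
        }
    }
}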
  24. How to continue with other components? 31 Kafka Streams & ksqlDB • Kafka Streams and ksqlDB use internal topics to manage their state. • Avoid mirroring these internal topics to another cluster, because consistency can be corrupted when launching applications on the destination site. Kafka Connect • Kafka Connect has to maintain consistency among all related settings and state (source system, destination system, Kafka Connect settings, and running Connector settings). Approach • The single cluster pattern is much safer for these components. • Running these components as standby on the destination site is another approach (keep updating state information on the destination site to reduce RTO).
  25. Summary 32 • Cluster layout patterns fall into two main categories. ◦ Single Cluster Pattern ◦ Multi Cluster Pattern • Single Cluster Pattern is ◦ one cluster stretched over multiple locations ◦ synchronous replication ◦ minimizes RTO & RPO ◦ simpler and easier to operate and maintain • Multi Cluster Pattern is ◦ multiple clusters located in separate locations ◦ asynchronous replication between clusters ◦ preserving committed offsets needs care if required ◦ suits hybrid environments