
Kafka Cluster Design Pattern Fall -JJUG CCC 2021 Fall-

keigodasu
November 21, 2021

Transcript

  1. 2021.11.21
    Kafka Cluster Design Pattern Fall
    JJUG CCC 2021 Fall


  2. Who am I?
    2
    Keigo Suda
    ● Working as a Solutions Architect/Professional Services at Confluent


  3. Table of Contents
    ● Apache Kafka 101
    ● Cluster Design Pattern
    3


  4. Apache Kafka 101


  5. What’s Kafka?
    5
    ● Kafka itself is a simple yet powerful messaging system.
    ● Its core capabilities are
    ○ To write and read streams of events with high performance
    ○ To store streams of events durably and reliably
    ● Messages are grouped into “Topics”.
    (Diagram: Producers write streams of events into Kafka, Kafka stores them in Topics, and Consumers read them.)


  6. Basic Architecture
    (Diagram: a Kafka cluster of Brokers backed by a ZooKeeper quorum, with a Producer writing to it and a Consumer group reading from it.)


  7. Topic/Partition
    7
    ● Topic consists of one or more partitions.
    ● Partition is the smallest unit that holds a subset of records.
    (Diagram: a Producer writes to and a Consumer reads from a Topic made up of partitions 1-4.)


  8. Partition & Replica
    8
    (Diagram: Topic A has three partitions; each partition has one Leader and two Followers spread across Broker01, Broker02, and Broker03.)
    ● Kafka replicates messages in each partition across brokers.
    ● The partition that Producers and Consumers interact with is the Leader partition; the partitions that fetch messages from the Leader are Follower partitions.
    ● When a broker fails, a Follower partition is automatically promoted to Leader.
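    (A minimal sketch, not from the slides: creating a topic with 3 partitions and replication factor 3 through the Java AdminClient, so each partition gets one Leader and two Followers. The broker address and topic name are assumptions.)

    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;
    import java.util.Collections;
    import java.util.Properties;

    public class CreateTopicExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker01:9092"); // hypothetical broker

            try (Admin admin = Admin.create(props)) {
                // 3 partitions, each replicated to 3 brokers (one Leader + two Followers)
                NewTopic topicA = new NewTopic("topicA", 3, (short) 3);
                admin.createTopics(Collections.singleton(topicA)).all().get();
            }
        }
    }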


  9. Rack Awareness
    9
    (Diagram: Broker01, Broker02, and Broker03 run in AZ01, AZ02, and AZ03 with broker.rack = “AZ01” / “AZ02” / “AZ03”; each partition’s Leader and Followers are spread across the AZs.)
    ● Rack Awareness enables spreading replicas across different locations.
    ● Kafka provides robustness against broker, rack, DC, and AZ failures.


  10. Producer
    ● Producer sends messages to multiple partitions based on a partitioning strategy.
    ● The key in a message is used to decide the partition.
    (Diagram: the Producer passes each message through the Partitioner, which assigns it to one of partitions 1-4.)
    partition = hash(key) % numPartitions
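    (A minimal sketch, not from the slides: a Java producer sending a keyed message; with the default partitioner, records with the same key land on the same partition. Broker address, topic name, key, and value are assumptions.)

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;
    import java.util.Properties;

    public class KeyedProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker01:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // The key ("user-42") is hashed by the Partitioner to pick a partition of "topicA".
                producer.send(new ProducerRecord<>("topicA", "user-42", "some event"));
            }
        }
    }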


  11. Producer - acks -
    11
    (Diagram: the Producer writes to the Leader of Topic A - Partition 1 on Broker01; Followers on Broker02 and Broker03 fetch from the Leader; min.insync.replicas = 2.)
    ● Kafka provides a parameter that decides when an acknowledgment is returned to the Producer (acks)
    ○ acks = 0
    ■ Producer will not wait for any acknowledgment from Brokers
    ○ acks = 1
    ■ Producer will wait until the leader partition writes the record to its local log
    ○ acks = all(-1)
    ■ Producer will wait until the leader partition writes the record to its local log and enough follower partitions have synced up to meet min.insync.replicas
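    (A minimal sketch of the acks setting on the producer side; it assumes the topic is configured with min.insync.replicas=2 as in the slide, so an acks=all send succeeds only after the Leader and at least one Follower have the record.)

    import org.apache.kafka.clients.producer.ProducerConfig;
    import java.util.Properties;

    public class AcksConfigExample {
        static Properties producerProps() {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker01:9092"); // hypothetical broker
            // acks=0: don't wait, acks=1: Leader's local log only, acks=all: Leader + enough in-sync Followers
            props.put(ProducerConfig.ACKS_CONFIG, "all");
            return props;
        }
    }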


  12. Consuming From Kafka - Grouped Consumers
    (Diagram: three consumers in a group, each assigned partitions and tracking positions such as offset=7, offset=5, offset=3, offset=8.)
    ● Consumer group is a set of consumers which cooperate to consume data from topics.
    ● When the group membership changes, assigned partitions are re-assigned among the members.
    ● After fetching messages, a consumer commits its offset to track its position.
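    (A minimal sketch, not from the slides: a consumer that joins a group, polls, and commits its offsets explicitly. Broker address, group id, and topic name are assumptions.)

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    public class GroupedConsumerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker01:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");        // consumers sharing this id form one group
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // commit explicitly below
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singleton("topicA"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("partition=%d offset=%d value=%s%n",
                                record.partition(), record.offset(), record.value());
                    }
                    consumer.commitSync(); // record the position so the group can resume from here
                }
            }
        }
    }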


  13. Cluster Design Pattern


  14. Points to Consider
    14
    ● What are the RTO/RPO requirements?
    ● What are producer requirements?
    ○ Produce into a single DC? or dual produce?
    ● What are consumer requirements?
    ○ Does order matter to consumers?
    ○ Can consumers handle duplicates?
    ● Failover Requirements
    ○ No Auto Client Failover
    ○ No Offset Preservation
    ● Is asynchronous replication acceptable, or is synchronous needed?
    ● Data governance?


  15. Cluster Design Patterns
    Single Cluster Pattern
    ○ one cluster stretched over one or multiple locations.
    Multi Cluster Pattern
    ○ multiple clusters located in separate locations
    15


  16. Cluster Design Pattern
    Single Cluster Pattern


  17. Single Cluster Pattern
    ● Single Cluster Pattern is
    ○ one cluster stretched over one or multiple locations.
    ○ based on synchronous replication
    ● The variations of this pattern are
    ○ One Location Cluster
    ○ Stretched Cluster
    ■ 2 DCs
    ■ 2.5 DCs
    ■ 3 DCs
    17


  18. One Location Cluster
    18
    (Diagram: one cluster of three Brokers and a three-node ZooKeeper quorum in a single DC; a Producer writes to and a Consumer reads from a Topic.)
    Description
    This is a one-cluster, one-DC layout and the simplest among the typical cluster patterns.
    This clustering pattern can handle a node failure without any data loss but cannot continue operations through a DC failure.
    RPO = 0, and RTO is the time needed to promote a Follower partition to Leader.
    Note
    Fundamental cluster layout.


  19. Stretched Cluster
    19
    (Diagrams: three layouts: 2 DCs with a hierarchical ZooKeeper quorum, 2.5 DCs with a third DC running only ZooKeeper, and 3 DCs with Brokers and ZooKeeper in every DC.)
    ● Stretched Cluster is one big cluster stretched over multiple DCs.
    ● You can set up a robust cluster across DCs and handle failures more easily with more DC locations.
    ● Messages are synchronously replicated over multiple locations.
    ● Low network latency between all DCs is required.


  20. Stretched Cluster - 2 DCs -
    20
    (Diagram: Brokers in DC#1 and DC#2 form one cluster; each DC runs a local ZooKeeper group joined in a hierarchical quorum; Producers and Consumers in both DCs read and write the same Topic.)
    Description
    This is one cluster stretched over 2 DCs.
    Kafka Brokers are clustered across the 2 DCs and ZooKeeper runs as a hierarchical quorum. Kafka Brokers connect to the local ZooKeeper group.
    When a DC fails, RTO > 0 because you need to remove the failed ZooKeeper group. RPO depends on the min.insync.replicas setting.
    Note
    Need to select a failover strategy (Consistency vs Availability).


  21. Zookeeper Hierarchical Quorum
    21
    (Diagram: zk1-zk3 form group1 in DC #1 and zk4-zk6 form group2 in DC #2.)
    ● Hierarchical Quorum is a quorum of quorums, configured as follows:
    server.1=zk1:2888:3888
    server.2=zk2:2888:3888
    server.3=zk3:2888:3888
    server.4=zk4:2888:3888
    server.5=zk5:2888:3888
    server.6=zk6:2888:3888
    group.1=1:2:3
    group.2=4:5:6


  22. When DC Failure
    22
    (Diagram: DC #1 and DC #2 each run Brokers and a ZooKeeper group; after DC#1 fails, a Follower partition in DC#2 is promoted to Leader and the Producer fails over to it.)
    Initial settings: acks=all, min.insync.replicas=3, with min.insync.replicas > (replication-factor / 2).
    Failover
    1. DC#1 failure occurred.
    2. A new Leader Partition can’t be elected.
    3. Remove the zk servers and group of the failed DC from the configuration and restart zk:
    #server.1=zk1:2888:3888
    #server.2=zk2:2888:3888
    #server.3=zk3:2888:3888
    server.4=zk4:2888:3888
    server.5=zk5:2888:3888
    server.6=zk6:2888:3888
    #group.1=1:2:3
    #group.2=4:5:6
    4. A new Leader Partition is elected on DC#2.
    5. Change min.insync.replicas from 3 to 2.
    6. The Producer can send messages to the new Leader Partition.
    Failback
    1. Restore the Zookeeper hierarchical quorum.
    2. Restore min.insync.replicas.
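    (A hedged sketch of failover step 5: lowering a topic's min.insync.replicas from 3 to 2 with the Java AdminClient so acks=all producers can continue with only the surviving DC. Broker address and topic name are assumptions.)

    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;
    import java.util.Collection;
    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;

    public class LowerMinIsrExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "dc2-broker01:9092"); // surviving DC

            try (Admin admin = Admin.create(props)) {
                ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "topicA");
                AlterConfigOp op = new AlterConfigOp(
                        new ConfigEntry("min.insync.replicas", "2"), AlterConfigOp.OpType.SET);
                Map<ConfigResource, Collection<AlterConfigOp>> configs =
                        Collections.singletonMap(topic, Collections.singletonList(op));
                admin.incrementalAlterConfigs(configs).all().get();
            }
        }
    }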


  23. Stretched Cluster - 2.5 DCs -
    23
    (Diagram: Brokers and ZooKeeper in DC#1 and DC#2, plus a single ZooKeeper node in DC#3; the quorum spans the 3 DCs while Producers and Consumers in DC#1 and DC#2 read and write the Topic.)
    Description
    This is one cluster stretched over 2 DCs, plus 1 DC running only a single Zookeeper node.
    Zookeeper can maintain quorum across the 3 DCs.
    RPO & RTO are 0 when the DC running only the single Zookeeper node fails.
    Note
    In terms of RPO, the Consistency vs Availability consideration still exists when a DC running Kafka brokers fails.


  24. Stretched Cluster - 3 DCs -
    24
    (Diagram: Brokers and a ZooKeeper node in each of DC#1, DC#2, and DC#3 form one cluster and one quorum; Producers and Consumers in every DC read and write the Topic.)
    Description
    This is one cluster stretched over 3 DCs.
    RTO & RPO are 0 when one DC fails. This pattern is the simplest and most robust among all patterns.
    Note
    This pattern is very common in public cloud (using multiple AZs).


  25. Cluster Design Pattern
    Multi Cluster Pattern


  26. Multi Cluster Pattern
    ● Multi Cluster Pattern is
    ○ Multiple clusters located in separate locations
    ○ Asynchronous replication between clusters
    ● The variations of this pattern are
    ○ Active - Passive
    ○ Active - Active
    ○ Aggregation
    26


  27. Active - Passive
    27
    (Diagram: the DC#1 cluster’s Topic is mirrored by MM2 (DC#1 → DC#2) into the DC#2 cluster; the Producer writes in DC#1 while Consumers can read in both DCs.)
    Description
    The primary cluster (Active) mirrors data to a standby cluster (Passive) using MM2.
    When the active site fails, you need to move Producer and Consumer applications to the standby site.
    RTO depends on how warmed up the standby site is. When it comes to RPO, data loss might happen because MM2 asynchronously copies data from the active site to the standby site.
    Note
    MM2 is running on the destination site.
    Applications are independently able to consume mirrored data on the standby site.
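    (A hedged sketch of an MM2 configuration, in the connect-mirror-maker.properties style, matching this Active-Passive layout and run with connect-mirror-maker.sh on the destination site; cluster aliases and bootstrap servers are assumptions.)

    # Clusters to replicate between; the source alias becomes the topic prefix on the destination.
    clusters = DC1, DC2
    DC1.bootstrap.servers = dc1-broker01:9092,dc1-broker02:9092
    DC2.bootstrap.servers = dc2-broker01:9092,dc2-broker02:9092

    # One-way mirroring from the active site (DC1) to the standby site (DC2).
    DC1->DC2.enabled = true
    DC1->DC2.topics = .*
    DC2->DC1.enabled = false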


  28. Active - Active
    28
    (Diagram: MM2 mirrors DC#1 → DC#2 and DC#2 → DC#1; each cluster holds its local Topic plus the mirrored DC#1.Topic / DC#2.Topic, with Producers and Consumers on both sides.)
    Description
    Two clusters with bidirectional mirroring.
    Messages produced at both clusters are mirrored to each other.
    RPO might be lower than in the Active-Passive pattern because both sites are hot. But you need to decide which site is active when a problem happens.
    Note
    Because MM2 creates destination topics with a prefix, consumer applications might have to specify topics by a prefix pattern like *.Topic.
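    (A minimal sketch of the Note above: a consumer subscribes with a regex so it reads both the local topic and the MM2-prefixed mirror of it. The topic name and prefix are assumptions.)

    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import java.util.regex.Pattern;

    public class PrefixAwareSubscribe {
        static void subscribeLocalAndMirrored(KafkaConsumer<String, String> consumer) {
            // Matches "Topic" as well as mirrored copies such as "DC#1.Topic"
            consumer.subscribe(Pattern.compile("(.*\\.)?Topic"));
        }
    }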


  29. Aggregation
    (Diagram: MM2 mirrors the DC#1 cluster’s DC#1.Topic and the DC#2 cluster’s DC#2.Topic into a central cluster in the Center DC, where a Consumer reads both.)
    Description
    This pattern aggregates messages from multiple clusters into another cluster.
    It allows messages generated across multiple clusters to be analyzed centrally.
    Note
    Hybrid & multi-cloud architectures are also similar to this pattern.


  30. How to handle Consumer Offsets?
    30
    ● In a multi-cluster layout, how can consumers resume from where they left off
    on the source cluster?
    ● MM2 supports syncing committed offsets via the RemoteClusterUtils API.
    ○ Transferring Commit Offset with MirrorMaker 2
    ● MirrorCheckpointConnector tracks offsets for consumer groups, and consumers can resume
    consuming on the destination cluster by using the RemoteClusterUtils API.
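    (A hedged sketch of the approach above: RemoteClusterUtils.translateOffsets, from the connect-mirror-client module, looks up the checkpointed offsets on the destination cluster and the consumer seeks to them. Cluster alias, group id, and bootstrap servers are assumptions.)

    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.connect.mirror.RemoteClusterUtils;
    import java.time.Duration;
    import java.util.HashMap;
    import java.util.Map;

    public class ResumeOnDestination {
        static void resume(KafkaConsumer<String, String> consumer) throws Exception {
            Map<String, Object> props = new HashMap<>();
            props.put("bootstrap.servers", "dc2-broker01:9092"); // destination cluster

            // Offsets emitted by MirrorCheckpointConnector for group "my-group" mirrored from cluster alias "DC1"
            Map<TopicPartition, OffsetAndMetadata> offsets =
                    RemoteClusterUtils.translateOffsets(props, "DC1", "my-group", Duration.ofSeconds(30));

            consumer.assign(offsets.keySet());
            offsets.forEach((tp, oam) -> consumer.seek(tp, oam.offset()));
        }
    }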


  31. How to continue other components?
    31
    Kafka Streams & ksqlDB
    ● Kafka Streams & ksqlDB use internal topics to manage their state.
    ● Avoid mirroring these internal topics to another cluster because consistency can be corrupted
    when launching applications on the destination site.
    Kafka Connect
    ● Kafka Connect has to maintain consistency among all related settings and states (source system,
    destination system, Kafka Connect settings, and running Connector settings).
    Approach
    ● The single cluster pattern is much safer for these components.
    ● Running these components as standby on the destination site is another approach (state information
    keeps being updated on the destination site, which reduces RTO).


  32. Summary
    32
    ● Cluster layout patterns are categorized into two main groups.
    ○ Single Cluster Pattern
    ○ Multi Cluster Pattern
    ● Single Cluster Pattern is
    ○ One cluster stretched over one or multiple locations
    ○ Synchronous replication
    ○ Minimizes RTO & RPO
    ○ Simpler & easier to operate and maintain
    ● Multi Cluster Pattern is
    ○ Multiple clusters located in separate locations
    ○ Asynchronous replication between clusters
    ○ Preserving committed offsets needs care if required
    ○ Suited to hybrid environments


  33. Thank you for your attention
