
Replication Bottleneck of a Topic Receiving Huge Traffic


Wilson Li

April 09, 2019


Transcript

1. Who am I
   • Name: Li Yan Kit, Wilson
   • Company: LINE Corporation
   • Work: Development Engineer, provides the company-wide Kafka service
2. Replication
   • Kafka replicates messages to multiple brokers; the default replication factor is 3
   • Leader – the master replica
     • Handles client read (consume) and write (produce) requests
     • Keeps track of the followers
     • Acknowledges produces after making sure the followers have caught up
   • Follower – a standby replica
     • Reads from the leader and stores a copy
3. Illustration
   • For simplicity, we consider the case with 3 replicas and ack = -1 (all ISR)
   [Diagram: leader and two followers, each with a log containing msg1 and msg2; the high watermark (HW), the user, and the produce purgatory]
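For reference, the broker-side setting ack = -1 (wait for all in-sync replicas) corresponds to acks=all on the producer. Below is a minimal producer sketch with that setting; the bootstrap address and topic name are placeholders, not values from the deck.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class AcksAllProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");  // placeholder address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // acks=all is equivalent to -1: the leader replies only after all in-sync replicas have the record
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The send() future completes once the leader has acknowledged according to the acks setting
            producer.send(new ProducerRecord<>("huge-traffic-topic", "key", "msg3"));
        }
    }
}
```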
4. 1. Leader Receives Produce Req.
   • Appends the message to its local log
   • Adds a DelayedProduce operation to the produce purgatory
     • with the new offset
     • with a timeout
   [Diagram: (1) user sends Produce(msg3); (2) leader appends msg3 to its log; (3) msg3 enters the purgatory]
5. 2. Leader Receives Follower Fetch Req.
   • The follower sends a fetch request to the leader with its log end offset
   • The leader replies with the new message
   • The leader updates its record of the follower's offset
   [Diagram: (1) follower sends Fetch(off3); (2) leader returns msg3]
6. 3. Leader Updates the High Watermark
   • When all replicas have fetched the new message
   • Completes the DelayedProduce operations in the produce purgatory with offsets smaller than the high watermark
   • Replies to the client that the produce is completed
   [Diagram: (1) all three logs now contain msg1–msg3 and the HW advances; (2) the purgatory entry completes and "OK, msg3" is returned to the user]
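The three steps above can be condensed into a toy model: local append, a purgatory entry per produce, follower offsets, and a high watermark that completes waiting produces. This is only a sketch of the idea with made-up class and method names; it is not Kafka's broker code.

```java
import java.util.*;

/** Toy model of the produce / follower-fetch / high-watermark flow on the leader (not Kafka's real code). */
class ToyLeader {
    private final List<String> log = new ArrayList<>();
    private final Map<String, Long> followerOffsets = new HashMap<>();   // follower id -> log end offset
    private final Map<Long, Runnable> purgatory = new TreeMap<>();       // required offset -> ack callback

    ToyLeader(Collection<String> followers) {
        for (String f : followers) followerOffsets.put(f, 0L);
    }

    /** Step 1: append locally and park the acknowledgement in the purgatory. */
    void onProduce(String msg, Runnable ackToClient) {
        log.add(msg);
        purgatory.put((long) log.size(), ackToClient);   // complete once all followers reach this offset
    }

    /** Step 2: record the follower's log end offset and hand it the messages it is missing. */
    List<String> onFollowerFetch(String followerId, long logEndOffset) {
        followerOffsets.put(followerId, logEndOffset);
        maybeAdvanceHighWatermark();
        return new ArrayList<>(log.subList((int) logEndOffset, log.size()));
    }

    /** Step 3: the high watermark is the lowest offset replicated everywhere; complete waiting produces. */
    private void maybeAdvanceHighWatermark() {
        long highWatermark = followerOffsets.values().stream().mapToLong(Long::longValue).min().orElse(0L);
        purgatory.entrySet().removeIf(e -> {
            if (e.getKey() <= highWatermark) { e.getValue().run(); return true; }
            return false;
        });
    }
}
```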
7. Assign Partitions to Brokers
   • For a topic with n partitions
     • Pick a random broker ID (N); assign the leader of partition i to broker N + i
     • Pick a random broker ID (M); assign the followers of partition i to brokers M + i and M + i + 1
     • Avoid repeating the same (leader, follower) pattern
   • (Broker indices wrap around modulo the number of brokers; a sketch follows the table below.)

   Partition   Leader   Followers
   0           5        2, 3
   1           0        3, 4
   2           1        4, 5
   3           2        5, 0
   4           3        0, 1
   5           4        1, 2
   6           5        3, 4
   7           0        4, 5
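A sketch of the assignment described on this slide, with an extra wrap-around shift standing in for the "avoid the same (leader, follower) pattern" rule. With N = 5, M = 2, 6 brokers and 8 partitions it reproduces the table above; this is a simplified illustration, not Kafka's actual assignment code.

```java
import java.util.*;

/** Simplified sketch of the round-robin replica assignment on slide 7 (not Kafka's exact code).
 *  leaderStart / followerStart play the roles of the random broker IDs N and M. */
class ReplicaAssignmentSketch {
    static Map<Integer, List<Integer>> assign(int numPartitions, int numBrokers,
                                              int leaderStart, int followerStart) {
        Map<Integer, List<Integer>> assignment = new LinkedHashMap<>();
        for (int p = 0; p < numPartitions; p++) {
            int shift = p / numBrokers;                       // extra shift per wrap-around, to vary the pattern
            int leader = (leaderStart + p) % numBrokers;
            int follower1 = (followerStart + p + shift) % numBrokers;
            int follower2 = (followerStart + p + shift + 1) % numBrokers;
            assignment.put(p, List.of(leader, follower1, follower2));
        }
        return assignment;
    }

    public static void main(String[] args) {
        // N = 5, M = 2 with 6 brokers and 8 partitions reproduces the table above
        assign(8, 6, 5, 2).forEach((p, replicas) ->
                System.out.println("partition " + p + " -> leader " + replicas.get(0)
                        + ", followers " + replicas.subList(1, 3)));
    }
}
```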
8. Workload Distribution
   • Each broker is the leader of 1–2 partitions
   • Each broker is a follower of 2–3 partitions
   • Evenly distributed within the topic
     • Round robin
     • Randomized, so the load is also evenly distributed across many topics
   (The partition / leader / followers table from slide 7 is shown again.)
9. Replica Fetcher Thread
   • Kafka spawns threads to perform the fetching in the background
     • Sends fetch requests to the leader
     • Receives new messages
     • Stores the new messages to the log
   [Diagram: a follower repeatedly sends Fetch(off3), Fetch(off4), Fetch(off5), Fetch(off6) to the leader]
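Conceptually, each replica fetcher runs a loop like the toy sketch below, built against the ToyLeader model shown earlier. Kafka's real ReplicaFetcherThread is far more involved (batching, throttling, error handling); this only illustrates the fetch-and-append cycle.

```java
/** Toy follower-side fetch loop against the ToyLeader sketch above (conceptual only). */
class ToyReplicaFetcher implements Runnable {
    private final ToyLeader leader;
    private final String followerId;
    private final java.util.List<String> localLog = new java.util.ArrayList<>();

    ToyReplicaFetcher(ToyLeader leader, String followerId) {
        this.leader = leader;
        this.followerId = followerId;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            // Ask the leader for everything after our current log end offset, then append what comes back.
            // A real fetcher would block or back off instead of spinning.
            java.util.List<String> newMessages = leader.onFollowerFetch(followerId, localLog.size());
            localLog.addAll(newMessages);
        }
    }
}
```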
10. num.replica.fetchers
   • Controls the number of fetcher threads toward one leader
     • Each fetcher is responsible for fetching a subset of partitions from that one broker
   • Total number of fetcher threads per broker:
     (no. of brokers – 1) × num.replica.fetchers
   [Diagram: Broker 2's fetchers send Fetch(p0) … Fetch(p5) to Broker 1]
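A quick arithmetic check of the thread count implied by this rule; the cluster size and the num.replica.fetchers value are example numbers only.

```java
public class FetcherThreadCount {
    public static void main(String[] args) {
        int numBrokers = 6;            // example cluster size
        int numReplicaFetchers = 3;    // example value of the num.replica.fetchers broker config
        // Each broker runs num.replica.fetchers threads toward every other broker it follows
        int totalFetcherThreads = (numBrokers - 1) * numReplicaFetchers;
        System.out.println("fetcher threads per broker = " + totalFetcherThreads);  // 15
    }
}
```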
11. Assign Partitions to Fetchers
   • For a topic with n partitions
     • Compute hash(topic) modulo the number of fetchers = K
     • Assign partition i to fetcher K + i (modulo the number of fetchers; a sketch follows the table below)

   Partition   Fetcher
   0           1
   1           2
   2           0
   3           1
   4           2
   5           0
   6           1
   7           2
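A sketch of this mapping. Kafka derives K from a hash of the topic name; here K is passed in directly so that K = 1 with 3 fetchers reproduces the table above. A simplified illustration, not the exact fetcher-manager code.

```java
import java.util.stream.IntStream;

/** Sketch of the partition -> fetcher mapping described above (simplified; not Kafka's exact code). */
public class FetcherAssignmentSketch {
    /** K is hash(topic) % numFetchers; partition i goes to fetcher (K + i) % numFetchers. */
    static int fetcherId(int k, int partition, int numFetchers) {
        return (k + partition) % numFetchers;
    }

    public static void main(String[] args) {
        int numFetchers = 3;
        int k = 1;  // in the table above the topic hash happens to give K = 1
        IntStream.range(0, 8).forEach(p ->
                System.out.println("partition " + p + " -> fetcher " + fetcherId(k, p, numFetchers)));
    }
}
```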
12. Workload Distribution
   • Each fetcher is fetching 2–3 partitions
   • Evenly distributed within the topic
     • Round robin
     • Randomized, so the load is also evenly distributed across many topics
   (The partition / fetcher table from slide 11 is shown again.)
13. Partitions – Brokers + Fetchers – Brokers
   (The partition / fetcher table from slide 11 and the partition / leader / followers table from slide 7 are shown side by side.)
14. Partitions – Brokers + Fetchers – Brokers

   Partition   Leader   Followers   Fetcher
   0           5        2, 3        1, 1
   1           0        3, 4        2, 2
   6           5        3, 4        1, 1
   7           0        4, 5        1, 1
15. Partitions – Brokers + Fetchers – Brokers

   Partition   Leader   Followers   Fetcher
   0           5        2, 3        1, 1
   1           0        3, 4        2, 2
   6           5        3, 4        1, 1
   7           0        4, 5        2, 2
16. Workload Distribution
   • Fetcher 1 on broker 3 is responsible for 2 partitions, but fetcher 0 is idle
   • This breaks the load balancing
   • This happens because both assignments are round robin and their moduli are multiples of each other

   Broker   Fetcher   Number of Partitions
   3        0         0
            1         2
            2         1
   4        0         1
            1         0
            2         2
   5        0         1
            1         1
            2         1
17. Workload Distribution
   • Fetcher 1 on broker 3 is responsible for 2 partitions, but fetcher 0 is idle
   • This breaks the load balancing
   • This happens because both assignments are round robin and their moduli are multiples of each other
   • We have 3 fetchers and want partitions assigned to all of them, but it only assigns them to 1 fetcher
   (Same table as slide 16; a simulation of the effect is sketched below.)
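The skew can be reproduced by combining the two sketches above (the slide-7 broker assignment and the slide-11 fetcher assignment) and counting how many partitions each fetcher handles on each follower broker. With the example numbers from these slides, broker 3 ends up with fetcher 1 carrying 2 partitions while fetcher 0 stays idle, as stated above. Illustrative only, not Kafka's code.

```java
import java.util.*;

/** Counts, per follower broker, how many partitions each replica fetcher ends up fetching,
 *  using the simplified assignments from the earlier sketches. */
public class FetcherLoadSketch {
    public static void main(String[] args) {
        int numPartitions = 8, numBrokers = 6, numFetchers = 3;
        int followerStart = 2, k = 1;                            // same example values as the tables above

        // follower broker -> (fetcher id -> number of partitions it fetches)
        Map<Integer, Map<Integer, Integer>> load = new TreeMap<>();
        for (int p = 0; p < numPartitions; p++) {
            int shift = p / numBrokers;                          // wrap-around shift from the slide-7 sketch
            int fetcher = (k + p) % numFetchers;                 // slide-11 mapping
            for (int offset : new int[]{0, 1}) {                 // the two followers of partition p
                int follower = (followerStart + p + shift + offset) % numBrokers;
                load.computeIfAbsent(follower, b -> new TreeMap<>()).merge(fetcher, 1, Integer::sum);
            }
        }
        // For broker 3 this prints {1=2, 2=1}: fetcher 1 carries 2 partitions while fetcher 0 is idle
        load.forEach((broker, perFetcher) ->
                System.out.println("broker " + broker + " -> " + perFetcher));
    }
}
```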
18. Testing the theory
   • Large topic (# of partitions > # of brokers): 40 partitions, 25 brokers
   • Multiples (# of brokers is a multiple of # of fetchers)
   [Chart: labels 1–10]
19. Testing the theory
   (Same slide as 18.)
20. Testing the theory
   • Large topic (# of partitions > # of brokers): 40 partitions, 25 brokers
   • Multiples (# of brokers is a multiple of # of fetchers): 25 (5 × 5) brokers, 5 fetchers
   [Chart: labels 1–10]
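The same kind of check can be run for this test setup (40 partitions, 25 brokers, 5 fetchers), again using the simplified round-robin formulas: because 5 divides 25, all partitions led by a given broker map to the same fetcher id, whereas a fetcher count with no common factor spreads them across different fetchers. This is a sketch under those assumptions, not the measured result from the slides.

```java
import java.util.*;

/** Checks which fetcher ids serve each leader's partitions under the simplified round-robin formulas. */
public class LeaderFetcherCollapse {
    static Map<Integer, Set<Integer>> fetchersPerLeader(int partitions, int brokers, int fetchers) {
        int n = 0, k = 0;   // starting broker / fetcher offsets; arbitrary for this check
        Map<Integer, Set<Integer>> result = new TreeMap<>();
        for (int p = 0; p < partitions; p++) {
            int leader = (n + p) % brokers;
            int fetcher = (k + p) % fetchers;
            result.computeIfAbsent(leader, l -> new TreeSet<>()).add(fetcher);
        }
        return result;
    }

    public static void main(String[] args) {
        // 5 divides 25: both partitions of any leader map to the same fetcher id on its followers
        System.out.println(fetchersPerLeader(40, 25, 5));
        // 4 shares no factor with 25: the two partitions of a leader use different fetchers
        System.out.println(fetchersPerLeader(40, 25, 4));
    }
}
```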
21. Lesson Learned
   • Round robin is not always perfect
     • As in multi-level load balancing, it is preferable to use a different hashing function at each level
   • Avoid setting the number of fetchers to a factor of the number of brokers
     • Nor to a multiple of the number of brokers
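A hypothetical helper that encodes the last lesson; the class name and the rule of thumb are illustrative only and not part of Kafka.

```java
/** Hypothetical sanity check for picking num.replica.fetchers, following the lesson above:
 *  avoid values that are a factor or a multiple of the broker count. */
public class FetcherCountCheck {
    static boolean looksSafe(int numBrokers, int numFetchers) {
        boolean factor = numBrokers % numFetchers == 0;     // e.g. 25 brokers with 5 fetchers
        boolean multiple = numFetchers % numBrokers == 0;   // e.g. 5 brokers with 10 fetchers
        return !factor && !multiple;
    }

    public static void main(String[] args) {
        System.out.println(looksSafe(25, 5));  // false: 5 divides 25, fetcher load can collapse
        System.out.println(looksSafe(25, 4));  // true
    }
}
```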