
Replication Bottleneck of topic receiving huge traffic


Wilson Li

April 09, 2019

Transcript

  1. Replication Bottleneck of a Topic Receiving Huge Traffic
     Li Yan Kit, Wilson, LINE Corporation
  2. Self Introduction

  3. Who am I
     Name: Li Yan Kit, Wilson
     Company: LINE Corporation
     Work: Development Engineer, providing the company-wide Kafka service
  4. Around 2019-01-01 00:00

  5. Happy New Year!

  6. Traffic Increased, Kafka Latency Spiked

  7. Replicas were out of sync
     A few partitions of the big topic now had only 2 replicas
  8. No problems in CPU, network, or disk
     CPU: <30%  Network: <50%  Disk: <10%
  9. Traffic increased, replica lost. Let's check Kafka's replication

  10. How Kafka Replication Works

  11. Replication
      Kafka replicates messages to multiple brokers; the default replication factor is 3
      Leader (master replica):
        Handles clients' read (consume) and write (produce) requests
        Keeps track of the followers
        Acknowledges a produce only after making sure the followers have caught up
      Follower (standby replica):
        Reads from the leader and stores a copy
  12. Illustration
      For simplicity, we consider the case with 3 replicas and ack = -1 (all ISR)
      [Diagram: leader log (msg1, msg2), two follower logs (msg1, msg2), HW, user, purgatory]
  13. 1. Leader receives a Produce request
      Appends the message to its local log
      Adds a DelayedProduce operation to the produce purgatory, with the new offset and a timeout
      [Diagram: user sends Produce(msg3); leader log becomes msg1, msg2, msg3; msg3 enters purgatory]
  14. 2. Leader receives a follower Fetch request
      The follower sends a fetch request to the leader with its log end offset
      The leader replies with the new messages
      The leader updates its record of the follower's offset
      [Diagram: follower sends Fetch(off3); leader replies with msg3]
  15. 3. Leader updates the high watermark
      When all replicas have fetched the new message, the leader advances the high watermark
      It completes the DelayedProduce operations in the produce purgatory with offsets smaller than the high watermark
      It replies to the client that the produce has completed
      [Diagram: all three logs hold msg1, msg2, msg3; HW advances; user receives OK, msg3]
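The three steps on slides 13-15 can be sketched as a toy simulation. This is illustration only, not Kafka's actual code; the class, method names, and offset bookkeeping here are invented for clarity.

```python
# Toy model of the acks = -1 produce flow: the leader appends, parks a
# DelayedProduce in "purgatory", and acknowledges only once every follower
# has fetched past the message, which advances the high watermark (HW).

class Leader:
    def __init__(self, follower_ids, log):
        self.log = list(log)                     # leader's local log
        self.follower_offsets = {f: len(log) for f in follower_ids}
        self.high_watermark = len(log)
        self.purgatory = {}                      # required offset -> message
        self.acks = []                           # replies sent to the client

    def produce(self, msg):
        # Step 1 (slide 13): append locally, park a DelayedProduce with
        # the new offset; the client is not acknowledged yet.
        self.log.append(msg)
        self.purgatory[len(self.log)] = msg

    def handle_fetch(self, follower_id, fetch_offset):
        # Step 2 (slide 14): record the follower's log end offset and
        # reply with any messages it does not have yet.
        self.follower_offsets[follower_id] = fetch_offset
        self._maybe_advance_hw()
        return self.log[fetch_offset:]

    def _maybe_advance_hw(self):
        # Step 3 (slide 15): HW = minimum offset all replicas have;
        # complete every parked DelayedProduce at or below it.
        self.high_watermark = min(self.follower_offsets.values())
        for off in sorted(self.purgatory):
            if off <= self.high_watermark:
                self.acks.append(f"OK, {self.purgatory.pop(off)}")

leader = Leader(["f1", "f2"], ["msg1", "msg2"])
leader.produce("msg3")            # parked in purgatory, no ack yet
reply = leader.handle_fetch("f1", 2)   # f1 fetches msg3
leader.handle_fetch("f1", 3)      # f1 is caught up
leader.handle_fetch("f2", 2)      # f2 fetches msg3
leader.handle_fetch("f2", 3)      # both caught up -> HW advances -> ack
```

The point of the purgatory is visible in the trace: the produce completes only on the final fetch, when the slowest follower has confirmed the new offset.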
  16. How Partitions Are Assigned to Brokers

  17. Assign partitions to brokers
      For a topic with n partitions:
        Pick a random broker ID N; assign the leader of partition i to broker N + i
        Pick a random broker ID M; assign the followers of partition i to brokers M + i and M + i + 1
        (all modulo the number of brokers, avoiding repeats of the same (leader, follower) pattern)

      Partition | Leader | Followers
      0         | 5      | 2, 3
      1         | 0      | 3, 4
      2         | 1      | 4, 5
      3         | 2      | 5, 0
      4         | 3      | 0, 1
      5         | 4      | 1, 2
      6         | 5      | 3, 4
      7         | 0      | 4, 5
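The placement above can be sketched as follows. This is a simplification in the spirit of Kafka's `AdminUtils.assignReplicasToBrokers`; the start index 5 and initial shift 2 are assumed here so the output reproduces the slide's table (the real code picks them randomly).

```python
def assign_replicas(n_partitions, n_brokers, replication=3,
                    start_index=5, start_shift=2):
    """Round-robin leaders; followers use a shift that grows each full
    cycle over the brokers, to avoid repeating (leader, follower) patterns."""
    assignment = {}
    shift = start_shift
    for p in range(n_partitions):
        if p > 0 and p % n_brokers == 0:
            shift += 1                        # new pattern after each cycle
        leader = (start_index + p) % n_brokers
        followers = [(leader + 1 + (shift + j) % (n_brokers - 1)) % n_brokers
                     for j in range(replication - 1)]
        assignment[p] = (leader, followers)
    return assignment

table = assign_replicas(8, 6)
# table[0] -> (5, [2, 3]) and table[7] -> (0, [4, 5]), matching the slide
```

Note the shift change at partition 6: that is why partitions 6 and 7 get followers 3, 4 and 4, 5 instead of repeating the pattern of partitions 0 and 1.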
  18. Workload distribution
      Each broker is the leader of 1~2 partitions
      Each broker is a follower of 2~3 partitions
      Evenly distributed within a topic (round robin)
      Randomized so that load is evenly distributed across many topics

      Partition | Leader | Followers
      0         | 5      | 2, 3
      1         | 0      | 3, 4
      2         | 1      | 4, 5
      3         | 2      | 5, 0
      4         | 3      | 0, 1
      5         | 4      | 1, 2
      6         | 5      | 3, 4
      7         | 0      | 4, 5
  19. Replica Fetcher Thread (Fetcher)

  20. Replica fetcher thread
      Kafka spawns threads to perform the fetching in the background
      Each fetcher sends fetch requests to the leader, receives new messages, and stores them to the log
      [Diagram: follower repeatedly sends Fetch(off3), Fetch(off4), Fetch(off5), Fetch(off6) to the leader]
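The fetch loop on this slide can be sketched as below. This is only the illustrative shape of the loop, not Kafka's actual `ReplicaFetcherThread`:

```python
def run_fetcher(leader_log, follower_log, rounds=4):
    """Repeatedly fetch from the leader at the follower's log end offset
    and append whatever the leader returns."""
    for _ in range(rounds):
        fetch_offset = len(follower_log)           # follower's log end offset
        new_messages = leader_log[fetch_offset:]   # leader's reply
        follower_log.extend(new_messages)          # store the copy locally

leader_log = ["msg1", "msg2", "msg3", "msg4"]
follower_log = ["msg1", "msg2"]
run_fetcher(leader_log, follower_log)
# follower_log is now a full copy of leader_log
```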
  21. Throughput of a single fetcher
      Topic with 1 partition, continuously producing: ~1 Gbps
  22. How Partitions Are Assigned to Fetchers

  23. num.replica.fetchers
      Controls the number of fetcher threads replicating from one source broker
      Each fetcher is responsible for fetching partitions from one broker
      Total number of fetchers = (number of brokers - 1) * num.replica.fetchers
      [Diagram: Broker2's fetchers send Fetch(p0) ... Fetch(p5) to Broker1]
  24. Assign partitions to fetchers
      For a topic with n partitions:
        Compute K = hash(topic) modulo the number of fetchers
        Assign partition i to fetcher K + i (modulo the number of fetchers)

      Partition | Fetcher
      0         | 1
      1         | 2
      2         | 0
      3         | 1
      4         | 2
      5         | 0
      6         | 1
      7         | 2
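A sketch of this mapping; in Kafka the fetcher id is derived roughly as a hash of the topic name plus the partition id, modulo the number of fetchers. K = 1 is assumed here so the result matches the slide's table.

```python
NUM_FETCHERS = 3
K = 1   # hash(topic) % NUM_FETCHERS, assumed for this example

def fetcher_for(partition):
    """Fetcher id for a partition: round robin starting from K."""
    return (K + partition) % NUM_FETCHERS

mapping = {p: fetcher_for(p) for p in range(8)}
# {0: 1, 1: 2, 2: 0, 3: 1, 4: 2, 5: 0, 6: 1, 7: 2}, as on the slide
```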
  25. Workload distribution
      Each fetcher fetches 2~3 partitions
      Evenly distributed within a topic (round robin)
      Randomized so that load is evenly distributed across many topics

      Partition | Fetcher
      0         | 1
      1         | 2
      2         | 0
      3         | 1
      4         | 2
      5         | 0
      6         | 1
      7         | 2
  26. Collision of the Two Assignments

  27. Partitions - Brokers + Fetchers - Brokers

      Partition | Fetcher
      0         | 1
      1         | 2
      2         | 0
      3         | 1
      4         | 2
      5         | 0
      6         | 1
      7         | 2

      Partition | Leader | Followers
      0         | 5      | 2, 3
      1         | 0      | 3, 4
      2         | 1      | 4, 5
      3         | 2      | 5, 0
      4         | 3      | 0, 1
      5         | 4      | 1, 2
      6         | 5      | 3, 4
      7         | 0      | 4, 5
  28. Partitions - Brokers + Fetchers - Brokers

      Partition | Leader | Followers | Fetchers
      0         | 5      | 2, 3      | 1, 1
      1         | 0      | 3, 4      | 2, 2
      6         | 5      | 3, 4      | 1, 1
      7         | 0      | 4, 5      | 1, 1
  29. Partitions - Brokers + Fetchers - Brokers

      Partition | Leader | Followers | Fetchers
      0         | 5      | 2, 3      | 1, 1
      1         | 0      | 3, 4      | 2, 2
      6         | 5      | 3, 4      | 1, 1
      7         | 0      | 4, 5      | 2, 2
  30. Workload distribution
      Fetcher 1 on broker 3 is responsible for 2 partitions, but fetcher 0 is idle
      This breaks the load balancing
      This happens because both assignments are round robin and the moduli are multiples of each other

      Broker | Fetcher | Number of partitions
      3      | 0       | 0
      3      | 1       | 2
      3      | 2       | 1
      4      | 0       | 1
      4      | 1       | 0
      4      | 2       | 2
      5      | 0       | 1
      5      | 1       | 1
      5      | 2       | 1
  31. Workload distribution
      Fetcher 1 on broker 3 is responsible for 2 partitions, but fetcher 0 is idle
      This breaks the load balancing
      This happens because both assignments are round robin and the moduli are multiples of each other

      Broker | Fetcher | Number of partitions
      3      | 0       | 0
      3      | 1       | 2
      3      | 2       | 1
      4      | 0       | 1
      4      | 1       | 0
      4      | 2       | 2
      5      | 0       | 1
      5      | 1       | 1
      5      | 2       | 1

      We have 3 fetchers and want to assign partitions to all of them, but the partitions end up on only 1 fetcher
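The collision can be demonstrated with the two formulas from slides 17 and 24. Partitions led by the same broker are spaced n_brokers apart in the round robin, and the fetcher id is (K + partition) mod n_fetchers, so whenever n_fetchers divides n_brokers all of those partitions land on the same fetcher. K = 1 and start index 5 are assumed to match the earlier tables.

```python
def fetchers_used(n_partitions, n_brokers, n_fetchers, K=1,
                  leader=5, start_index=5):
    """Set of fetcher ids that handle the partitions led by `leader`."""
    return {(K + p) % n_fetchers
            for p in range(n_partitions)
            if (start_index + p) % n_brokers == leader}

# 6 brokers, 3 fetchers (3 divides 6): partitions 0 and 6 share leader 5
# and both map to fetcher 1, so fetchers 0 and 2 stay idle for that leader.
print(fetchers_used(8, 6, 3))

# 5 fetchers (not a factor of 6): the same two partitions spread out.
print(fetchers_used(8, 6, 5))
```

This is exactly the slide's situation: three fetchers exist, but every partition replicated from broker 5 hashes to fetcher 1.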
  32. Verification

  33. Testing the theory
      Large topic (# of partitions > # of brokers): 40 partitions, 25 brokers
      Multiples (# of brokers is a multiple of # of fetchers)
      [Chart]
  34. Testing the theory
      Large topic (# of partitions > # of brokers): 40 partitions, 25 brokers
      Multiples (# of brokers is a multiple of # of fetchers)
      [Chart]
  35. Testing the theory
      Large topic (# of partitions > # of brokers): 40 partitions, 25 brokers
      Multiples (# of brokers is a multiple of # of fetchers): 25 (5 x 5) brokers, 5 fetchers
      [Chart]
  36. Summary

  37. Lessons learned
      Round robin is not always perfect
      When load balancing at multiple levels, prefer a different hashing function at each level
      Avoid setting the number of fetchers to a factor of the number of brokers, or to a multiple of it
  38. Q & A