LINE's messaging server architecture and engineering underlying large scale and highly reliable service / LINE Campus Talk in UC Berkeley by Yuto Kawamura

19.02.2019 Campus Talk in UC Berkeley
Presented by Yuto Kawamura

LINE Developers

February 19, 2019
Transcript

  1. About me
    • Yuto Kawamura
    • Senior software engineer
    • A LINE server developer
    • Apache Kafka contributor
    • Joined April 2015
  2. Requirements for the LINE server system
    • Fast delivery
      • Messages must be delivered in near real time
    • Reliable delivery
      • No message loss
    • Do the above for 5 billion messages/day from 164 million users!
  3. Messaging System Architecture
    [Diagram: LINE Apps; LEGY (JP, DE, SG); Thrift RPC/HTTP; talk-server; Distributed Data Store; Distributed async task processing]
  4. LEGY
    [Diagram repeated from slide 3: LINE Apps; LEGY (JP, DE, SG); Thrift RPC/HTTP; talk-server; Distributed Data Store; Distributed async task processing]
  5. LEGY
    • API Gateway/Reverse Proxy
    • Deployed to many data centers all over the world
    • Developed from scratch at LINE for the messaging service
    • Written in Erlang, which is widely used in telecom systems
      • Zero-latency hot code swapping without closing client connections
      • Durability thanks to Erlang processes and message passing
  6. talk-server
    [Diagram repeated from slide 3: LINE Apps; LEGY (JP, DE, SG); Thrift RPC/HTTP; talk-server; Distributed Data Store; Distributed async task processing]
  7. talk-server
    • Java-based web application server
    • Implements most of the business logic
      • Message delivery
      • Friendship management
      • User configuration
    • Java 8 + Spring + Thrift RPC
  8. Data store
    [Diagram repeated from slide 3: LINE Apps; LEGY (JP, DE, SG); Thrift RPC/HTTP; talk-server; Distributed Data Store; Distributed async task processing]
  9. Hybrid data store
    • Redis: in-memory DB
      • Fast, but volatile
      • Home-brew clustering
    • HBase: distributed key-value store based on Hadoop (HDFS)
      • Slower than Redis, but persistent
    • Cascading failure handling for consistency
      • Async write from background task processor
      • Data correction batch
    [Diagram: talk-server performs a dual write; store labels: Primary/Backup, Cache/Primary]
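A minimal sketch of the dual-write idea on this slide, using two in-memory dicts in place of Redis and HBase. The class name `HybridStore`, its methods, and the read-fallback behavior are assumptions made for illustration; the slide does not describe how talk-server actually implements this.

```python
from collections import deque


class HybridStore:
    """Toy dual-write store: a fast volatile dict plus a slower persistent dict."""

    def __init__(self):
        self.fast = {}          # stands in for Redis (in-memory, volatile)
        self.persistent = {}    # stands in for HBase (persistent)
        self.pending = deque()  # writes waiting for a background task processor

    def put(self, key, value):
        # Write to the fast store right away; defer the persistent write.
        self.fast[key] = value
        self.pending.append((key, value))

    def flush(self):
        # What a background task processor might do: apply queued writes in order.
        applied = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.persistent[key] = value
            applied += 1
        return applied

    def get(self, key):
        # Prefer the fast store; fall back to the persistent copy and re-warm it.
        if key in self.fast:
            return self.fast[key]
        value = self.persistent.get(key)
        if value is not None:
            self.fast[key] = value
        return value

    def lose_fast_store(self):
        # Simulate the volatile store losing its contents.
        self.fast.clear()
```

After `put` and `flush`, a call to `lose_fast_store()` followed by `get` still returns the value from the persistent copy, which is the property the "fast, but volatile" and "slower, but persistent" pairing is meant to provide.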
  10. How realtime message delivery works
    [Diagram: Alice, Bob, LEGY, LEGY, talk-server, talk-server, Storage]
    1. Find nearest LEGY
    2. sendMessage(“Bob”, “Hello!”)
    3. Proxy request
    4. Write to storage
    5. Notify message arrival
    X. fetchOps()
    6. Proxy request
    7. Read message
    8. Return fetchOps() with message
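The flow above can be sketched as a single-process toy in which `fetch_ops` waits for a message to arrive. This is only an illustration: the `TalkServer` class, the snake_case method names, and the use of a per-user `queue.Queue` as the arrival notification are assumptions, and LEGY proxying (steps 1, 3, and 6) is left out.

```python
import queue


class TalkServer:
    """Toy stand-in for talk-server plus storage, in one process."""

    def __init__(self):
        self.storage = {}   # message id -> (sender, recipient, text)
        self.inbox = {}     # recipient -> queue of message ids (arrival notices)
        self.next_id = 1

    def _inbox_for(self, user):
        return self.inbox.setdefault(user, queue.Queue())

    def send_message(self, sender, recipient, text):
        message_id = self.next_id
        self.next_id += 1
        self.storage[message_id] = (sender, recipient, text)  # step 4: write to storage
        self._inbox_for(recipient).put(message_id)            # step 5: notify arrival
        return message_id

    def fetch_ops(self, user, timeout=0.1):
        inbox = self._inbox_for(user)
        try:
            first = inbox.get(timeout=timeout)  # wait until something arrives
        except queue.Empty:
            return []                            # nothing arrived within the timeout
        ids = [first]
        while True:
            try:
                ids.append(inbox.get_nowait())   # drain anything else already queued
            except queue.Empty:
                break
        return [self.storage[i] for i in ids]    # steps 7-8: read and return messages
```

For example, after `server.send_message("Alice", "Bob", "Hello!")`, a call to `server.fetch_ops("Bob")` returns a one-element list containing that message, and a second call returns an empty list once the timeout expires.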
  11. Communication between internal systems
    • Communication for querying and transactional updates (needs a response from the peer)
    • Communication for data synchronization and update notification
    [Diagram: talk-server, Auth, Analytics, Another Service; HTTP/REST/RPC]
  12. Apache Kafka
    • Distributed streaming platform
    • (In a narrow sense) a distributed persistent message queue that supports the Pub-Sub model
    • Built-in load distribution
    • Built-in fail-over on both server (broker) and client
  13. Pub-Sub
    [Diagram: Brokers with Topics; Consumer GroupA and Consumer GroupB, each made up of several Consumers, each group receiving Records[A, B, C…]]
    • Multiple consumer “groups” can independently consume a single topic
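To make "independently consume" concrete, here is a toy in-memory topic that keeps a separate read position for each group. It is not the Kafka client API: the `Topic` class and its `publish` and `poll` methods are hypothetical, and partitions and brokers are ignored.

```python
class Topic:
    """Toy append-only log with one read offset per consumer group."""

    def __init__(self):
        self.log = []       # records in publish order
        self.offsets = {}   # group name -> index of the next record to read

    def publish(self, record):
        self.log.append(record)

    def poll(self, group, max_records=10):
        # Each group advances only its own offset, so groups never affect each other.
        start = self.offsets.get(group, 0)
        batch = self.log[start:start + max_records]
        self.offsets[group] = start + len(batch)
        return batch
```

With records A, B, C published, `poll("GroupA", max_records=2)` returns `["A", "B"]`, while a later `poll("GroupB")` still returns all of `["A", "B", "C"]`, because GroupB's offset has not moved.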
  14. So what’s our engineering here?
    • Develop facilities/client libraries for easier use
    • Consulting (architecture, usage, requirements) for services that want to use Kafka
    • Reliability engineering for Kafka clusters
    • Troubleshooting when performance violates Service Level Objectives (SLO)
    • Patching Kafka to fix bugs/performance issues
    • Enhancements for achieving higher availability/performance
  15. Detecting problem
    • 50x ~ 100x slower 99th percentile response time
      • Normal: ~20ms
      • Observed: 50ms ~ 200ms
  16. Reading code
    • The network IO thread multiplexes and handles IO with multiple client sockets asynchronously
    • It’s supposed to never block awaiting IO completion
  17. Guessing
    • When the network IO thread is busy, it means either:
      1. Really busy doing lots of work: many requests/responses to read/write
      2. Blocked by some operation (which should not happen in an event loop in general)
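A small worked example of why case 2 matters: when one thread serves every socket in turn, a single slow operation delays every request queued behind it. The function below is a simplified model rather than Kafka code; it assumes all requests are ready at time 0 and are handled strictly in order.

```python
def completion_times(handling_ms):
    """Return when each request finishes if one thread handles them in order.

    All requests are assumed to be ready at time 0, so each completion time
    is the sum of the handling costs of that request and every earlier one.
    """
    finished = []
    clock = 0
    for cost in handling_ms:
        clock += cost
        finished.append(clock)
    return finished
```

`completion_times([1, 1, 1, 1])` gives `[1, 2, 3, 4]`, whereas `completion_times([1, 100, 1, 1])` gives `[1, 101, 102, 103]`: one 100 ms stall pushes every later request past 100 ms even though each of those requests needs only 1 ms of work.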
  18. Reading code (again)
    • Kafka uses the sendfile(2) system call to transfer topic data to clients
    • A sendfile(2) call might involve reading from disk internally
    • sendfile is a system call for copying on-disk data directly to a client socket (zero-copy transfer) for efficient data transfer
  19. Guessing (again)
    • What if a sendfile(2) call that involves a disk read blocks the network IO event loop for a long time?
    • Let’s check the call duration of sendfile(2) system calls
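As a rough user-space illustration of "check the call duration", the sketch below times one file-to-socket transfer with Python's `socket.sendfile`, which uses `os.sendfile` where the platform supports it and falls back to plain `send` otherwise. This is not how the talk measured it (the next slide uses SystemTap inside the kernel); the helper name `timed_sendfile` and the small local socket pair are assumptions made to keep the example self-contained.

```python
import socket
import tempfile
import time


def timed_sendfile(payload):
    """Send payload from a temp file to a socket and time the transfer call."""
    reader, writer = socket.socketpair()
    with reader, writer, tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        f.seek(0)  # start from the beginning even if the plain-send fallback is used
        start = time.perf_counter()
        sent = writer.sendfile(f)  # zero-copy via os.sendfile where available
        elapsed = time.perf_counter() - start
        received = b""
        while len(received) < sent:
            chunk = reader.recv(65536)
            if not chunk:
                break
            received += chunk
    return sent, elapsed, received
```

Keep the payload small (a few kilobytes) when trying this: nothing reads from the socket until the transfer call returns, so a payload larger than the socket buffer would block. A freshly written temporary file is normally already in the page cache, so the measured time here stays short; the slow case the slide is asking about is a transfer whose data has to be read from disk first.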
  20. Inspecting
    • Using SystemTap
    • SystemTap: dynamic tracing tool
      • Can be used to probe Linux kernel behavior by writing a few lines of script, with a low-overhead guarantee
  21. Solving problem fundamentally
    • Continually making fundamental improvements (!= workarounds) makes the system ultimately reliable
    • Understanding a problem and fixing it fundamentally requires understanding all the stacks involved in running the software
      • Application code, e.g., Kafka
      • Language runtime, e.g., JVM
      • Operating system, e.g., Linux
  22. Unique challenges lead you to
    • Chances to encounter new problems that have never been addressed
    • If you can solve them and feed the results back to the community, you will receive a lot of prizes and satisfaction!
  23. So is LINE platform mature?
    • Definitely NO
    • LINE platform needs to be revised
      • More flexibility for more business integration
      • Higher availability
      • Multi iDC deployment
      • Make more “platforms” as shared building blocks for services
  24. Thanks for listening
    • More detailed explanation about Kafka performance issue troubleshooting:
      • LINE DEVELOPER DAY: https://youtu.be/6xBqTUUe0Sg
      • Kafka Summit SF 2018: https://www.confluent.io/kafka-summit-sf18/kafka-multi-tenancy
    • Our Kafka upstream contribution examples:
      • https://issues.apache.org/jira/browse/KAFKA-7504
      • https://issues.apache.org/jira/browse/KAFKA-4614