Improve Processing Performance for PayPay Cashback

PayPay Corporation.

October 27, 2021

Transcript

  1. Improve Processing Performance for PayPay Cashback
     How we 4x the throughput using Akka Stream
     XIAO Yang, Oct 2021
  2. Self Introduction
     - Name: 肖杨 (XIAO Yang)
     - From: Chengdu, Sichuan, China
     - PayPay: since October 2019, Tech Lead in the CLM team
     - Interests: functional programming, distributed systems, Akka (a concurrency toolset for the JVM, written in Scala)
  3. ToC
     - What Cashback is in PayPay
     - How Cashback is given in real time
     - The performance issue that happened
     - How we improved the performance
  4. Akka Stream/Alpakka
     - Akka Stream
       - Stream processing library
       - Code reads the same as a flow chart
       - Compositional building blocks
       - Back-pressure support
     - Alpakka Kafka
       - Kafka connector backed by Akka Stream
       - Fine-tuned Kafka consumer/producer for high performance
     - Figure from: Reactive stream processing using Akka streams
  5. What happened
     - Cashback could not be shown for some transactions once the traffic rate exceeded 250/s
     - The cashback process could not support higher traffic
     - The time from a transaction being made until cashback was granted increased
     - Some partitions stopped
     - Consumers for different topics affected each other
  6. Step 0: Identify the Bottleneck
     - Visualization
       - Lag of each stage
       - Throughput of each stage
  7. Dashboard
     - Monitor each stage
       - Processing lag
       - Throughput
     - External dependencies
       - Response time
       - Throughput
     - Resource usage
       - CPU/memory
       - DB performance
  8. Step 1: Optimize the Process (asynchronous operator in Akka Stream)
     - mapAsync: accepts a Future-returning function and a parallelism value
     - Concurrent processing
       - Up to n (parallelism) elements at a time
       - Can use a separate thread pool
       - Does not block the caller thread
     - In-order processing
       - Order is kept when committing Kafka messages
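The two properties listed for mapAsync (bounded concurrency and in-order results) can be illustrated outside Akka. The following is a minimal Python asyncio analogue, not Akka's API: `map_async`, `evaluate`, and the parallelism of 4 are hypothetical names and values chosen for illustration. It keeps at most `parallelism` calls in flight and always emits the oldest result first, so a slow element holds back later ones, which is what makes in-order offset commits possible.

```python
import asyncio
import random
from collections import deque


async def map_async(items, parallelism, fn):
    """Rough analogue of an ordered, bounded-parallelism async map.

    At most `parallelism` calls to `fn` are in flight, and results are
    yielded in input order regardless of which call finishes first.
    """
    in_flight = deque()
    for item in items:
        if len(in_flight) == parallelism:
            # Window is full: wait for the OLDEST element before admitting more.
            yield await in_flight.popleft()
        in_flight.append(asyncio.ensure_future(fn(item)))
    while in_flight:
        yield await in_flight.popleft()


async def demo():
    active = 0
    peak = 0

    async def evaluate(n):
        # Hypothetical stand-in for a non-blocking API call with variable latency.
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(random.uniform(0.001, 0.01))
        active -= 1
        return n * 10

    results = [r async for r in map_async(range(20), 4, evaluate)]
    return results, peak


results, peak = asyncio.run(demo())
print(results[:5], "peak in-flight:", peak)
```

Calls complete in arbitrary order, but `results` follows the input order and the number of concurrent calls never exceeds the configured parallelism.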
  9. Step 1: Optimize the Process (original configuration: threads are blocked)
     - High parallelism
       - 120 Futures will be created at the same time
     - Default executor
       - Java ForkJoinPool with a parallelism of 8

     | Stage                        | Type                    | Parallelism |
     | Save Incoming Event          | DB Write (Blocking)     | 10          |
     | Cashback Evaluation          | API Call (Non-blocking) | 50          |
     | Update Event + Save Cashback | DB Write (Blocking)     | 10          |
     | SQS Enqueue                  | API Call (Non-blocking) | 50          |
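The problem on this slide is blocking DB writes sharing a small default pool with everything else. One common remedy is to confine blocking calls to their own fixed-size pool so they cannot occupy the threads that non-blocking work relies on. The sketch below shows that idea as a Python analogue, assuming hypothetical names (`DB_POOL`, `save_event_blocking`, `evaluate_cashback`) and a pool size of 2; it is not the team's Akka dispatcher configuration.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fixed-size pool reserved for blocking DB writes.
DB_POOL = ThreadPoolExecutor(max_workers=2, thread_name_prefix="db-write")


def save_event_blocking(event_id):
    """Stand-in for a blocking DB write; returns the worker thread's name."""
    time.sleep(0.005)
    return threading.current_thread().name


async def evaluate_cashback(event_id):
    """Stand-in for a non-blocking API call; it stays on the event loop."""
    await asyncio.sleep(0.005)
    return event_id * 2


async def process(event_id):
    loop = asyncio.get_running_loop()
    # Blocking work is handed to the dedicated pool, never run on the loop thread.
    worker = await loop.run_in_executor(DB_POOL, save_event_blocking, event_id)
    cashback = await evaluate_cashback(event_id)
    return worker, cashback


async def main():
    return await asyncio.gather(*(process(i) for i in range(10)))


outcomes = asyncio.run(main())
DB_POOL.shutdown()
print(sorted({worker for worker, _ in outcomes}))
```

However many events are in flight, at most two threads are ever blocked on the simulated DB write, and the non-blocking calls are never queued behind them.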
  10. Step 1: Optimize the Process (new configuration: less blocking)
     - For blocking processes
       - Reduce parallelism
       - Separate fixed-size thread pool
     - For non-blocking processes
       - Keep enough parallelism
       - Use the default executor (no context switch)

     Theoretical max throughput = 1000 / latency (ms) * parallelism * 30

     | Stage                        | Type                    | Latency       | Parallelism | Theoretical Max Throughput |
     | Save Incoming Event          | DB Write (Blocking)     | 20ms ~ 30ms   | 2           | 2000 ~ 3000 TPS            |
     | Cashback Evaluation          | API Call (Non-blocking) | 100ms ~ 300ms | 20          | 2000 ~ 6000 TPS            |
     | Update Event + Save Cashback | DB Write (Blocking)     | 30ms ~ 40ms   | 2           | 1500 ~ 2000 TPS            |
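The throughput column follows directly from the formula on the slide, and a short calculation reproduces it. In this sketch the function and variable names are hypothetical, and the constant 30 is taken from the slide as-is: the slide does not say what it represents, so treating it as a count of parallel stream instances (for example partitions) is only an assumption.

```python
def max_throughput_tps(latency_ms, parallelism, multiplier=30):
    """Slide formula: 1000 / latency * parallelism * 30.

    `multiplier` is the unexplained factor of 30 from the slide
    (assumed here to be the number of parallel stream instances).
    """
    return 1000 * parallelism * multiplier / latency_ms


# (min latency ms, max latency ms, parallelism) as listed in the table.
stages = {
    "Save Incoming Event": (20, 30, 2),
    "Cashback Evaluation": (100, 300, 20),
    "Update Event + Save Cashback": (30, 40, 2),
}

ranges = {}
for name, (fastest_ms, slowest_ms, parallelism) in stages.items():
    low = max_throughput_tps(slowest_ms, parallelism)   # worst-case latency
    high = max_throughput_tps(fastest_ms, parallelism)  # best-case latency
    ranges[name] = (low, high)
    print(f"{name}: {low:.0f} ~ {high:.0f} TPS")

# The slowest stage bounds the whole pipeline.
bottleneck = min(ranges, key=lambda name: ranges[name][0])
print("Lowest worst-case ceiling:", bottleneck, ranges[bottleneck][0])
```

Under these numbers the second DB write has the lowest worst-case ceiling (1500 TPS), which suggests it is the stage most likely to limit the pipeline.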
  11. Interim Result
     - Fully handles 700 TPS in the performance test
       - The whole process finishes within 2s
     - Up to 1200 TPS for the forward stream
       - Shows the cashback result only
       - 4x+ vs. 250 TPS
  12. Step 2: Remove Bottleneck (before)
     - Save and update Event in a short interval
     - SQS operation took time
     - Update Cashback twice in a short interval
     - Read after write by ID
  13. Step 2: Remove Bottleneck (after)
     - Write once
     - Update once
     - No need to read
     - Retry in-progress events got lost
  14. Result
     - Fully handles 2000+ TPS in the performance test
       - The whole process finishes within 2s
     - 2500+ TPS for the forward stream process
       - Shows the cashback result only
     - 8x+ vs. 250 TPS
     - And now
       - Handles daily traffic from a 40 million user base with no lag
  15. Summary
     - Akka Stream/Alpakka Kafka helps
       - Easy to control and tune the flow
       - Takes care of back-pressure, in-order processing, and global error handling
     - Be careful about blocking operations
       - The asynchronous operator in Akka Stream again helps a lot
     - Reduce unnecessary operations before detailed tuning
     - As user traffic continues to grow, there will always be new challenges