
LINE LIVE: Features & optimizations to support artists & fans

LINE DevDay 2020

November 26, 2020

Transcript

  1. None
  2. Agenda › Overview of LINE LIVE › 2 Challenges in 2020 › New Features › LINE Face2Face › LINE LIVE-VIEWING › Load Testing for Performance Visualization
  3. LINE LIVE › Video streaming and hosting service › Streams from celebrities, companies and other users › Chat communication
  4. Traffic Characteristics › Spike access caused by popular broadcasts (chart: RPS jumps from about 3K to 32K between 18:50 and 18:58)
  5. LINE LIVE Architecture (diagram): the Host streams to Media Servers over RTMP; Media Servers upload HLS files to Object Storage; the CDN caches them and the LINE LIVE Player fetches .m3u8 and .ts; the player also talks to the API Server (JSON API) and the Chat Server (WebSocket); LINE Talk Server, Billing and CMS sit alongside
  6. LINE LIVE Architecture (same diagram as slide 5)
  7. LINE LIVE Architecture (same diagram as slide 5)
  8. LINE LIVE Architecture (same diagram as slide 5)
  9. LINE LIVE Architecture (same diagram as slide 5, with the LINE App shown as the client)
  10. LINE LIVE Architecture (same diagram, zooming into the API service: API Servers, Redis, MySQL, Kafka, Consumer Servers)
  11. History of LINE LIVE › 2015 Dec.: service release › 2016: gift item sending, broadcasts from end users › 2017: Twitter login, design renewal › 2018: collaboration broadcast › 2019: paid broadcast, subscription channel › 2020: direct messaging, LINE LIVE-VIEWING, Face2Face
  12. 2 Challenges in the COVID-19 Situation: New Features Development, Performance Improvement

  13. New Features and Behind the Scenes: LINE Face2Face, LINE LIVE-VIEWING

  14. LINE Face2Face › An online “akushu-kai” (handshake event) › 1-on-1 video communication with celebrities › Pre-distributed codes are needed to use it
  15. Overview of Face2Face (diagram: fans line up in a queue and are popped one by one for timed slots with the idol, 15–45 s in the example) › Fans can talk to their idol during their allotted time › Fans wait in the queue › The video call is implemented using RTMP
  16. Video Call via RTMP › RTMP (Real Time Messaging Protocol) › Runs on TCP › Uses 2 RTMP streams (diagram: the Media Servers give RTMP URLs to both sides; the Idol upstreams on the Idol URL and plays back the Fan URL, while the Fan upstreams on the Fan URL and plays back the Idol URL)
  17. Video Call via RTMP (same diagram as slide 16)
  18. Video Call via RTMP (same diagram as slide 16)
  19. Backend of Face2Face › Flow of Face2Face: 1. enter the Fan Queue (after token and other auth), 2. pop a fan from the queue, 3. RTMP resource allocation & sharing, 4. start Face2Face & finish
  20. Backend of Face2Face › Flow step 1: enter the Fan Queue (after token and other auth) — diagram with Fan, Idol, Chat Servers, Batch Servers, API Servers, Media Servers and DB: 1. the fan submits tickets, 2. the tickets are verified, 3. the fan is pushed to the “Fan Queue”
  21. Backend of Face2Face › Flow step 2: pop a fan from the queue — 4. pop a fan from the Fan Queue
  22. Backend of Face2Face › Flow step 3: RTMP resource allocation & sharing — 5. acquire RTMP streams for the Idol and the Fan, 6. save fan and stream info, 7. notify the collaboration start and give the Idol’s RTMP stream URL
  23. Backend of Face2Face › Flow step 4: start Face2Face & finish — 8. stop Face2Face, 9. re-pop a fan from the Fan Queue; loop until the queue size is 0
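The flow on slides 20–23 maps naturally onto a queue. Below is a minimal sketch, assuming a Redis list as the Fan Queue and Node.js with ioredis; the talk does not name the backend language, data store or any of these functions, so every identifier here is hypothetical.

```javascript
// Minimal sketch of the Face2Face flow (steps 1-9 above), assuming a Redis
// list as the Fan Queue. All identifiers are illustrative, not LINE's code.
const Redis = require("ioredis");
const redis = new Redis();

const QUEUE_KEY = "face2face:queue"; // hypothetical key for the Fan Queue

// Slide 20: after ticket verification, push the fan into the queue.
async function enterFanQueue(fanId) {
  await redis.rpush(QUEUE_KEY, fanId);
}

// Slides 21-23: pop a fan, allocate two RTMP streams, persist the pairing,
// notify both sides, and repeat until the queue is empty.
async function runFace2Face(idolId) {
  while (true) {
    const fanId = await redis.lpop(QUEUE_KEY); // 4. pop a fan from the Fan Queue
    if (fanId === null) break;                 // loop until queue size is 0

    const streams = await allocateRtmpStreams(idolId, fanId); // 5. acquire streams
    await saveSessionInfo(idolId, fanId, streams);            // 6. save fan/stream info
    await notifyCollaborationStart(idolId, fanId, streams);   // 7. share RTMP URLs
    await waitUntilFinished(idolId, fanId);                   // 8. stop Face2Face
    // 9. the next loop iteration re-pops a fan from the Fan Queue
  }
}

// Hypothetical stand-ins for the media-server, DB and chat-server calls,
// which are not described in the talk.
async function allocateRtmpStreams(idolId, fanId) {
  return {
    idol: `rtmp://media.example.com/live/${idolId}`,
    fan: `rtmp://media.example.com/live/${fanId}`,
  };
}
async function saveSessionInfo() { /* e.g. INSERT into MySQL */ }
async function notifyCollaborationStart() { /* e.g. push over the Chat Server */ }
async function waitUntilFinished() { /* e.g. await a timer or a finish event */ }
```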
  24. Development of Face2Face › Utilized an existing feature, “Collaboration Broadcast” › “Task-force style” development › Developers joined planning more actively than usual
  25. LINE LIVE-VIEWING › Online ticket-based streaming › Users can watch a broadcast by pre-purchasing an online ticket
  26. Overview of LINE LIVE-VIEWING (diagram: 1. the user buys the ticket on the ticket sales site, 2. the API Servers authorize the user against MySQL, 3. the LINE LIVE App requests HLS files, 4. permission is checked) › The LINE user account is authorized by buying a ticket › Users can watch the video via LINE, the LINE LIVE App and LINE LIVE Web
  27. Traffic Spike Problem in LINE LIVE-VIEWING › On the ticket sales site, traffic spikes right after sales start because the number of tickets is limited › In broadcast playback, traffic spikes right after the broadcast starts
  28. High Volume Traffic Handling on the Ticket Site › Caching with Redis, an in-memory key-value data store › MySQL is rarely touched for reads during the user’s purchase flow (Home, Detail) › The approach referred to ISUCON8
  29. Cache for “Home” › A batch fetches the live list from MySQL every minute › The API reads it from the cache, stored as a “Sorted List”
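A rough sketch of that pattern: the slide only says the cache is a “Sorted List”, so the Redis sorted set, key name and SQL below are assumptions, written in Node.js with ioredis (and a mysql2-style client) for illustration.

```javascript
// Sketch of the "Home" cache: a batch refreshes the live list from MySQL
// once per minute; the API reads only from Redis on the hot path.
const Redis = require("ioredis");
const redis = new Redis();

const HOME_KEY = "ticket:home:lives"; // hypothetical cache key

// Batch side: run every minute (e.g. from a cron-like scheduler).
async function refreshHomeCache(mysql) {
  const [rows] = await mysql.query(
    "SELECT id, started_at FROM broadcasts WHERE status = 'live'"
  );
  const pipeline = redis.pipeline();
  pipeline.del(HOME_KEY);
  for (const row of rows) {
    // Score by start time so the Home screen can read a sorted list directly.
    pipeline.zadd(HOME_KEY, row.started_at, String(row.id));
  }
  await pipeline.exec();
}

// API side: the purchase-flow read path never touches MySQL.
async function getHomeBroadcastIds() {
  return redis.zrevrange(HOME_KEY, 0, 49); // newest 50 entries
}
```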
  30. High Volume Traffic Handling in Broadcast Playback › Caching and async writes › All authorized user IDs are cached on Redis before the broadcast finishes; the API checks permission there and leaves a footprint of the user when serving the broadcast URL › Ticket status is updated after the broadcast finishes: a batch fetches the footprints and updates the status › Kafka is not used because it would create a huge number of Kafka jobs
  31. High Volume Traffic Handling in Broadcast Playback (same slide as 30)
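A hedged sketch of the playback-side pattern described on slides 30–31, again using Node.js with ioredis; the Redis set layout, key names and SQL are assumptions for illustration.

```javascript
// "Check permission + leave a footprint" on the playback path; the ticket
// status write is deferred to a batch after the broadcast finishes.
const Redis = require("ioredis");
const redis = new Redis();

const authorizedKey = (broadcastId) => `lv:${broadcastId}:authorized`; // pre-loaded user IDs
const footprintKey  = (broadcastId) => `lv:${broadcastId}:footprints`; // who actually watched

// API side: serve the broadcast URL without writing to MySQL.
async function getBroadcastUrl(broadcastId, userId) {
  const allowed = await redis.sismember(authorizedKey(broadcastId), userId); // check permission
  if (!allowed) throw new Error("valid ticket required");
  await redis.sadd(footprintKey(broadcastId), userId); // leave footprint of the user
  return `https://cdn.example.com/live/${broadcastId}/playlist.m3u8`; // placeholder URL
}

// Batch side: after the broadcast finishes, fetch the footprints and update status.
async function updateTicketStatuses(broadcastId, mysql) {
  const userIds = await redis.smembers(footprintKey(broadcastId));
  for (const userId of userIds) {
    await mysql.query(
      "UPDATE tickets SET status = 'used' WHERE broadcast_id = ? AND user_id = ?",
      [broadcastId, userId]
    );
  }
}
```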
  32. Performance Testing of the Ticket Purchase Flow › Performed a ticket-purchase spike test › Scenario-based test: /home => /detail => /reserve => /purchase (GET /home, GET /detail, POST /reserve, …) › One API server can handle over 200 purchases per second
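The actual scenario script isn't shown in the talk; the sketch below expresses the /home => /detail => /reserve => /purchase flow as a k6 script (k6 is the tool introduced later on slide 51), with the base URL, IDs and payloads invented for illustration.

```javascript
// k6 sketch of the ticket-purchase spike scenario:
// /home => /detail => /reserve => /purchase.
import http from "k6/http";
import { check, sleep } from "k6";

export const options = {
  vus: 200,        // virtual users; tune to the spike being reproduced
  duration: "1m",
};

const BASE_URL = "https://ticket.example.com"; // placeholder host

export default function () {
  check(http.get(`${BASE_URL}/home`), { "home 200": (r) => r.status === 200 });
  check(http.get(`${BASE_URL}/detail/1`), { "detail 200": (r) => r.status === 200 });

  const reserve = http.post(`${BASE_URL}/reserve`, { broadcastId: "1" });
  check(reserve, { "reserve 200": (r) => r.status === 200 });

  const purchase = http.post(`${BASE_URL}/purchase`, { reservationId: "dummy" });
  check(purchase, { "purchase 200": (r) => r.status === 200 });

  sleep(1); // think time between iterations
}
```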
  33. Load Testing for Performance Visualization

  34. Phase Change of LINE LIVE › New feature LINE LIVE-VIEWING with its “pre-purchase” model › Traffic increase from service growth and the COVID-19 situation › A big incident caused by technical debt and a lack of performance measurement › Result: increasing demand for high reliability
  35. Incident in early 2020 › 1 hour of service downtime › Caused by a MySQL connection handling problem combined with a Redis network bandwidth limitation: the API server took a connection from the MySQL connection pool and did not return it until the API response was returned, while cache reads were blocked by the bandwidth limit › Not only normal testing but also load testing is needed
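The incident itself isn't shown as code; the sketch below only illustrates the failure mode the slide describes, assuming a mysql2-style connection pool and ioredis in Node.js.

```javascript
// Illustration (not LINE's code) of the slide-35 failure mode: a pooled MySQL
// connection is held while waiting on a bandwidth-limited Redis read, so slow
// cache reads keep connections checked out and eventually exhaust the pool.
async function getBroadcastProblematic(pool, redis, id) {
  const conn = await pool.getConnection();              // connection taken early
  try {
    const cached = await redis.get(`broadcast:${id}`);  // slow when the Redis link is saturated
    if (cached) return JSON.parse(cached);              // connection was held for nothing
    const [rows] = await conn.query("SELECT * FROM broadcasts WHERE id = ?", [id]);
    return rows[0];
  } finally {
    conn.release(); // only returned after the whole response is built
  }
}

// One possible mitigation: don't hold a connection across the cache read.
async function getBroadcastSafer(pool, redis, id) {
  const cached = await redis.get(`broadcast:${id}`);
  if (cached) return JSON.parse(cached);                // no MySQL connection needed at all
  const [rows] = await pool.query(                      // checked out and returned per query
    "SELECT * FROM broadcasts WHERE id = ?", [id]
  );
  return rows[0];
}
```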
  36. Why Load Testing › Goal: make the system handle large-scale broadcasts stably › Process to achieve the goal: apply a large workload → analyze results → find defects → fix defects
  37. Why Load Testing › Goal: make the system handle large-scale broadcasts stably › Daily access: insufficient load › Irregular spike access: long intervals between spikes › Artificial spikes by load testing: sufficient load with no interval
  38. Why Load Testing (same slide as 37)
  39. Why Load Testing (same slide as 37)
  40. Requirements for Load Testing › Repeatability: easy for everyone to re-execute tests under both the same and different conditions › Edit-ability: easy to edit, create and share scenarios › Understandability: easy to understand the result right after the test execution
  41. Requirements for Load Testing (same slide as 40)
  42. Requirements for Load Testing (same slide as 40)
  43. Requirements for Load Testing (same slide as 40)
  44. “.stress” for Load Testing (Repeatability) › Engineers can execute tests via Slack › Engineers can specify “load variables” such as the number of simultaneous viewers
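The bot's interface isn't shown in the talk; the sketch below is one way a “.stress” style command could pass a load variable such as the simultaneous-viewer count into a k6 run. Only k6's --env / __ENV mechanism is real k6 functionality; the command syntax and handler are assumptions.

```javascript
// Hypothetical handler for ".stress <scenarioId> viewers=30000" from Slack.
const { execFile } = require("node:child_process");

function handleStressCommand(text) {
  // e.g. text = "liveviewing-playback viewers=30000"
  const [scenarioId, ...pairs] = text.trim().split(/\s+/);
  const envArgs = pairs.flatMap((pair) => ["--env", pair.toUpperCase()]);

  // Run the synced scenario file with the requested load variables.
  execFile("k6", ["run", ...envArgs, `scenarios/${scenarioId}.js`], (err, stdout) => {
    if (err) console.error(err);
    else console.log(stdout); // the real bot would post this back to Slack
  });
}

// Inside the scenario file the variable would be read as, for example:
//   const viewers = Number(__ENV.VIEWERS || 1000);
```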
  45. Result Notification (Understandability) › The result is sent to Slack › It consists of 3 types of information
  46. Result Notification: Binary Result (Understandability) › A summary of the result is sent to Slack: Test Passed or Test Failed
  47. What is ‘Passed’? (Understandability) › Defined as: every HTTP status code is 200, and the 99th percentile of API response time is lower than a threshold
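That pass definition maps naturally onto k6 thresholds; the snippet below is a sketch with an assumed 500 ms limit and a placeholder endpoint, since the real threshold value and APIs aren't given in the talk.

```javascript
// "Passed" expressed as k6 thresholds: no failed requests (approximating
// "all HTTP status codes are 200") and p(99) latency under a threshold.
import http from "k6/http";

export const options = {
  thresholds: {
    http_req_failed: ["rate==0"],     // every request must succeed
    http_req_duration: ["p(99)<500"], // 99th percentile under an assumed 500 ms
  },
};

export default function () {
  http.get("https://api.example.com/broadcasts/1"); // placeholder endpoint
}
```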
  48. Result Notification: Client-side Metrics (Understandability) › Client-side metrics are reported: RPS and the response time of each API
  49. Result Notification: Server-side Metrics (Understandability) › Server-side metrics are reported as a Grafana dashboard: load average, CPU usage, active MySQL connection count, etc.
  50. Scenario Updating (Edit-ability) › Just push to the scenario repository; the scenarios are synced

  51. Open Source Load Testing Engine: k6 › CLI-based open source load testing tool › Scenarios are scripted in JavaScript ES2015/ES6 › https://k6.io/
  52. Load Testing Architecture (diagram): the Slack Bot API receives “.stress” and a scenario ID → the Load Test Manager syncs scenario files from GitHub and drives the Load Test Nodes → the nodes load the Target Servers (with exporters) → metrics flow to the datasource and Alert Manager; dashboard URLs are detected and their images are sent back to Slack
  53. Load Testing Architecture (same diagram as slide 52)
  54. Load Testing Architecture (same diagram as slide 52)
  55. Load Testing Architecture (same diagram as slide 52)
  56. 2 Types of Load Testing › Single API test: to visualize a single API’s performance; not every API, only frequently called ones › Scenario-based test: to visualize system performance against a particular scenario, e.g. a LIVE-VIEWING broadcast playback spike from app users (Broadcast Detail API, Authentication API, Channel API, Broadcast Detail API, …)
  57. Test Environment › Real environment: >200 servers; load test environment: 10% of real › 10% of the servers of each component › 100% would be too costly, yet a single instance can’t place enough stress on MySQL and Redis
  58. Example of performance degradation detection › N+1 query executions in a frequently called API (slide shows the load test result and the detected defect as pseudo code)
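The slide's own pseudo code isn't reproduced in the transcript; the sketch below shows the general shape of an N+1 defect and the single-query fix, with table and column names invented.

```javascript
// N+1: one query for the list, then one extra query per row.
async function listLiveBroadcastsNPlusOne(db) {
  const [broadcasts] = await db.query(
    "SELECT id, channel_id FROM broadcasts WHERE status = 'live'"
  );
  for (const b of broadcasts) {
    const [[channel]] = await db.query(
      "SELECT name FROM channels WHERE id = ?", [b.channel_id]
    );
    b.channelName = channel.name; // N extra round trips under spike load
  }
  return broadcasts;
}

// Fix: fetch everything with a single joined query.
async function listLiveBroadcastsJoined(db) {
  const [rows] = await db.query(
    `SELECT b.id, c.name AS channel_name
       FROM broadcasts b
       JOIN channels c ON c.id = b.channel_id
      WHERE b.status = 'live'`
  );
  return rows;
}
```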
  59. Example of performance degradation detection › The MySQL connection handling problem, again
  60. Summary › New features: LINE Face2Face and LINE LIVE-VIEWING › Load testing in LINE LIVE › Repeatability, Understandability and Edit-ability
  61. Thank you!