
LINE LIVE: Features & optimizations to support artists & fans

LINE DevDay 2020

November 26, 2020
Transcript

  1. Agenda › Overview of LINE LIVE › 2 Challenges in

    2020 › New Features › LINE Face2Face › LINE LIVE-VIEWING › Load Testing for Performance Visualization
  2. › Video streaming and hosting service › Streams from celebrities,

    companies and other users › Chat communication LINE LIVE
  3. Object Storage Media Servers Host CDN LINE Talk Server Billing

    CMS API Server LINE LIVE Architecture RTMP Chat Server JSON API Upload HLS files Cache WebSocket Fetch .m3u8 and .ts LINE LIVE Player
  4. Object Storage Media Servers Host CDN LINE Talk Server LINE

    LIVE Player Billing CMS API Server LINE LIVE Architecture Chat Server JSON API Upload HLS files Cache WebSocket Fetch .m3u8 and .ts RTMP
  6. Object Storage Media Servers Host CDN LINE Talk Server Billing

    CMS API Server LINE LIVE Architecture Chat Server Upload HLS files Cache Fetch .m3u8 and .ts JSON API WebSocket LINE LIVE Player RTMP
  7. Object Storage Media Servers Host CDN LINE Talk Server Billing

    CMS API Server LINE LIVE Architecture Chat Server JSON API Upload HLS files Cache LINE App WebSocket Fetch .m3u8 and .ts RTMP
  8. Object Storage Media Servers Host CDN LINE Talk Server Billing

    CMS LINE LIVE Architecture RTMP Chat Server Upload HLS files Cache WebSocket Fetch .m3u8 and .ts LINE LIVE Player API Service API Servers Redis MySQL Kafka Consumer Servers JSON API
  9. History of LINE LIVE › 2015 Dec. › Service release › 2016 › Gift item sending › Broadcasts from end users › 2017 › Twitter login › Design renewal › 2018 › Collaboration broadcast › 2019 › Paid broadcast › Subscription channel › 2020 › Direct messaging › LINE LIVE-VIEWING › Face2Face
  10. LINE Face2Face › An online “akushu-kai” (handshake event) › One-on-one video communication with celebrities › Pre-distributed codes are needed to use it.
  11. Overview of Face2Face › Fans wait in the queue (e.g. 15s / 30s / 45s / 15s time slots) › Fans can talk to their idols during their own time slot › The next fan is popped from the queue › The video call is implemented using RTMP
  12. Video Call via RTMP › RTMP (Real Time Messaging Protocol) › Runs on TCP › Uses 2 RTMP streams › The media servers give RTMP URLs to both sides › The idol upstreams with the RTMP URL for the idol and plays back with the RTMP URL for the fan › The fan upstreams with the RTMP URL for the fan and plays back with the RTMP URL for the idol
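The two-stream setup above can be sketched in a few lines. This is an illustrative Python stand-in, not LINE's actual allocation code; the function name, the URL layout, and the host are all hypothetical. The key point it shows is the symmetry: each side publishes to its own stream and plays back the other's.

```python
# Hypothetical sketch of the two-stream RTMP allocation for a 1:1 call:
# each participant upstreams to their own URL and plays back the other
# participant's stream. URL layout and names are illustrative only.

def allocate_call_streams(media_host, call_id):
    """Return per-participant upstream/playback RTMP URLs for a 1:1 call."""
    idol_stream = f"rtmp://{media_host}/live/{call_id}-idol"
    fan_stream = f"rtmp://{media_host}/live/{call_id}-fan"
    return {
        "idol": {"upstream": idol_stream, "playback": fan_stream},
        "fan": {"upstream": fan_stream, "playback": idol_stream},
    }

urls = allocate_call_streams("media.example.com", "f2f-001")
# The idol plays back exactly the stream the fan publishes, and vice versa.
assert urls["idol"]["playback"] == urls["fan"]["upstream"]
assert urls["fan"]["playback"] == urls["idol"]["upstream"]
```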
  15. Backend of Face2Face › Flow of Face2Face › 1. Enter the Fan Queue (after token and other auth) › 2. Pop a fan from the queue › 3. RTMP resource allocation & sharing › 4. Start Face2Face & finish
  16. Backend of Face2Face › Flow step 1: Enter the Fan Queue (after token and other auth) › The fan submits tickets to the API servers › The API servers verify the tickets › The fan is pushed to the “Fan Queue”
  17. Backend of Face2Face › Flow step 2: Pop a fan from the queue › A fan is popped from the “Fan Queue”
  18. Backend of Face2Face › Flow step 3: RTMP resource allocation & sharing › Acquire RTMP streams for the idol and the fan from the media servers › Save the fan and stream info to the DB › Notify the collaboration start and give the RTMP stream URL of the idol
  19. Backend of Face2Face › Flow step 4: Start Face2Face & finish › Stop Face2Face › Re-pop a fan from the “Fan Queue” › Loop until the queue size is 0
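The queue loop above can be summarized in a short simulation. This is a hedged sketch with hypothetical names, using a plain in-memory deque where the real service uses its backend queue; it only shows the control flow of popping fans one by one until the queue is empty.

```python
from collections import deque

# Illustrative simulation of the Face2Face flow: fans are pushed onto a
# queue, and the session loop pops one fan at a time, runs the call, and
# repeats until the queue size is 0. All names are hypothetical.

def run_face2face_session(fan_ids):
    fan_queue = deque(fan_ids)      # fans pushed to the "Fan Queue"
    served = []
    while fan_queue:                # loop until queue size is 0
        fan = fan_queue.popleft()   # pop (or re-pop) a fan from the queue
        # ... allocate RTMP streams, save info, notify collaboration start,
        # run the call, then stop Face2Face for this fan ...
        served.append(fan)
    return served

# Fans are served strictly in the order they queued.
assert run_face2face_session(["fan-a", "fan-b", "fan-c"]) == ["fan-a", "fan-b", "fan-c"]
```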
  20. Development of Face2Face › Utilizing an existing feature: “Collaboration Broadcast” › “Task-force style” development › Developers joined planning more actively than usual
  21. LINE LIVE-VIEWING › Online ticket-based streaming › Users can watch a broadcast by pre-purchasing an online ticket
  22. Overview of LINE LIVE-VIEWING › 1. Buy the ticket on the ticket sales site › 2. Authorize the user (API servers / MySQL) › 3. Request HLS files from the LINE LIVE App › 4. Check permission › The LINE user account is authorized by buying a ticket › Users can watch the video via LINE, the LINE LIVE App and LINE LIVE Web
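The four-step flow above boils down to a simple invariant: HLS files are served only to accounts marked as authorized by a ticket purchase. Below is a minimal sketch of that check, with a Python set standing in for the MySQL-backed record; all names are hypothetical.

```python
# Sketch of the LIVE-VIEWING authorization flow: buying a ticket marks the
# LINE account as authorized, and the HLS request handler checks that mark
# before serving the playlist. Storage and names are hypothetical.

authorized_users = set()  # stands in for the MySQL-backed ticket record

def buy_ticket(user_id):
    # Steps 1-2: a purchase on the ticket sales site authorizes the user.
    authorized_users.add(user_id)

def request_hls(user_id):
    # Steps 3-4: check permission before returning the playlist path.
    if user_id not in authorized_users:
        raise PermissionError("no valid ticket")
    return "/hls/broadcast.m3u8"

buy_ticket("user-1")
assert request_hls("user-1") == "/hls/broadcast.m3u8"
```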
  23. Traffic Spike Problem in LINE LIVE-VIEWING › In the ticket sales site › A traffic spike comes right after sales start because the number of tickets is limited › In broadcast playback › A traffic spike comes right after the broadcast starts
  24. High Volume Traffic Handling in the Ticket Site › Caching with Redis › In-memory key-value data store › Rarely touch MySQL for reads during users’ purchase flow (Home, Detail) › Referred to ISUCON8
  25. Cache for “Home” › Batch: fetch the live list from MySQL every minute › Cache it as a “Sorted List” served by the API
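The "Home" cache above decouples the expensive MySQL query from the hot path: a batch job refreshes a pre-sorted list roughly once a minute, and the API serves it straight from the cache. A sorted Python list stands in here for the Redis sorted-list structure; names are hypothetical.

```python
# Sketch of the "Home" cache: a periodic batch job reads the live list from
# MySQL, sorts it, and stores it; the API handler serves the pre-sorted
# list without touching MySQL. Names are hypothetical.

cache = {}

def refresh_home_cache(rows):
    """Batch job (run roughly once a minute): rows are (score, live_id)."""
    cache["home"] = [live_id for score, live_id in sorted(rows, reverse=True)]

def get_home():
    """API handler: serve the home list straight from the cache."""
    return cache["home"]

refresh_home_cache([(10, "live-a"), (99, "live-b"), (50, "live-c")])
assert get_home() == ["live-b", "live-c", "live-a"]
```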
  26. High Volume Traffic Handling in Broadcast Playback › Caching and async writes › Cache all authorized user IDs in Redis before the broadcast finishes › Update ticket statuses after the broadcast finishes › API: get the broadcast URL, check permission, leave a footprint of the user › Batch: fetch footprints and update statuses after the broadcast finishes › Kafka is not used because it would create a large number of Kafka jobs
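The write deferral above can be sketched as follows. During the broadcast, only a cheap cache write ("footprint") happens on the hot path; ticket statuses in MySQL are updated by a batch job afterwards. A set and a dict stand in for Redis and MySQL; names are hypothetical.

```python
# Sketch of caching + async write: footprints go to a cache during the
# broadcast, and the MySQL ticket statuses are updated by a batch job only
# after the broadcast finishes. All names are hypothetical.

footprints = set()                                    # stands in for Redis
ticket_status = {"user-1": "unused", "user-2": "unused"}  # stands in for MySQL

def watch_broadcast(user_id):
    footprints.add(user_id)   # cheap cache write; no MySQL on the hot path

def finish_broadcast():
    # Batch job: fetch the footprints and update statuses afterwards.
    for user_id in footprints:
        ticket_status[user_id] = "used"

watch_broadcast("user-1")
assert ticket_status["user-1"] == "unused"  # not yet written to MySQL
finish_broadcast()
assert ticket_status["user-1"] == "used"
```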
  28. Performance Testing of the Ticket Purchase Flow › Performed a ticket purchase spike test › Scenario-based test: /home (GET) => /detail (GET) => /reserve (POST) => /purchase › Can handle over 200 purchases per second per server
  29. Phase Change of LINE LIVE › New feature: LINE LIVE-VIEWING › “Pre-purchase” model › Traffic increase › Service growth and the COVID-19 situation › Big incident › Caused by technical debt and a lack of performance measurement › Increasing demand for high reliability
  30. Incident in Early 2020 › 1 hour of service downtime › Caused by a MySQL connection handling problem plus a Redis network bandwidth limitation › API servers took a connection from the MySQL connection pool and did not return it until the API response was returned, while being blocked on cache reads by the network bandwidth limitation › Not only normal testing, but also load testing is needed.
  31. Why Load Testing › Goal: make the system handle large-scale broadcasts stably › Process to achieve the goal: apply a large workload → analyze results → find defects → fix defects
  32. Why Load Testing › Goal: make the system handle large-scale broadcasts stably › Daily access: insufficient load › Irregular spike access: sufficient load, but at long intervals › Artificial spike by load testing: sufficient load, with no interval
  35. Requirements for Load Testing › Repeatability - easy for everyone to re-execute tests under both the same and different conditions › Edit-ability - easy to edit, create and share scenarios › Understandability - easy to understand the result right after the test execution
  39. “.stress” for Load Testing › Repeatability › Engineers can execute tests via Slack › Engineers can specify “load variables”, such as the number of simultaneous viewers
  40. Result Notification › Understandability › The result is notified to Slack › It consists of 3 types of info
  41. Result Notification: Binary Result › Understandability › A summary of the result is notified to Slack › Test Passed / Test Failed
  42. What is “Passed”? › Understandability › Defined as: › All HTTP status codes are 200 › The 99th percentile of API response time is lower than the threshold
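The two pass criteria above are easy to express as a predicate. This sketch is illustrative; in particular, the nearest-rank percentile method used here is an assumption, not necessarily what the actual tooling computes.

```python
# Sketch of the "Passed" criteria: a run passes only when every HTTP status
# is 200 AND the 99th percentile of response times is below the threshold.
# The percentile method (nearest-rank) is an assumption.

def p99(samples):
    ordered = sorted(samples)
    index = max(0, int(len(ordered) * 0.99) - 1)  # nearest-rank 99th pct
    return ordered[index]

def passed(statuses, response_times_ms, threshold_ms):
    all_ok = all(status == 200 for status in statuses)
    return all_ok and p99(response_times_ms) < threshold_ms

fast = [10] * 100
slow_tail = [10] * 98 + [500, 500]      # 2 slow samples out of 100
assert passed([200] * 100, fast, threshold_ms=100) is True
assert passed([200] * 100, slow_tail, threshold_ms=100) is False  # p99 too high
assert passed([200] * 99 + [503], fast, threshold_ms=100) is False  # non-200
```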
  43. › Client-side metrics are notified › RPS › Response time

    of each API Result Notification: Client-side Metrics Understandability
  44. Result Notification: Server-side Metrics › Understandability › Server-side metrics are notified as a Grafana dashboard › Loadavg › CPU usage › Active MySQL connection count › etc.
  45. Open Source Load Testing Engine: k6 › CLI-based open-source load testing tool › Scenarios are scripted in JavaScript ES2015/ES6 › https://k6.io/
  46. Load Testing Architecture › Slack Bot API (“.stress” and scenario ID) › Load Test Manager › GitHub (sync scenario files) › Load Test Nodes › Target servers with exporters › Alert Manager › Detect dashboard URLs and send images › Datasource
  50. 2 Types of Load Testing › Single API test › To visualize a single API’s performance › Not all APIs, but frequently called ones (e.g. the Broadcast Detail API) › Scenario-based test › To visualize system performance against a particular scenario (Authentication API → Channel API → Broadcast Detail API → …) › e.g. a LIVE-VIEWING broadcast playback spike from app users
  51. Test Environment › Real env: >200 servers › Load test env: 10% of real › 10% of the servers of each component › 100% would be too costly › 1 instance can’t place enough stress on MySQL and Redis
  52. Example of Performance Degradation Detection › N+1 query executions in a frequently called API › Load test result › Detected defect (pseudo code)
  53. Summary › New Features: LINE Face2Face and LINE LIVE-VIEWING ›

    Load Testing in LINE LIVE › Repeatability, Understandability and Edit-ability