
Slide 2

Agenda
› Overview of LINE LIVE
› 2 Challenges in 2020
› New Features
  › LINE Face2Face
  › LINE LIVE-VIEWING
› Load Testing for Performance Visualization

Slide 3

LINE LIVE
› Video streaming and hosting service
› Streams from celebrities, companies and other users
› Chat communication

Slide 4

Traffic Characteristics
› Spike access driven by popular broadcasts
› [Chart: RPS jumps from 3K at 18:50 to 32K at 18:58; y-axis up to 400,000]

Slide 5

LINE LIVE Architecture
[Diagram] The host streams to the Media Servers over RTMP; the Media Servers upload HLS files to Object Storage; the CDN caches them, and the LINE LIVE Player fetches the .m3u8 and .ts files. The Player also talks to the API Server (JSON API) and the Chat Server (WebSocket). The backend integrates with the LINE Talk Server, Billing, and a CMS.


Slide 10

LINE LIVE Architecture (API detail)
[Diagram] Same flow as above; the API Service consists of API Servers backed by Redis and MySQL, with Kafka feeding Consumer Servers.

Slide 11

History of LINE LIVE
› Dec. 2015: Service release
› 2016: Gift item sending; broadcasts from end users
› 2017: Twitter login; design renewal
› 2018: Collaboration broadcast
› 2019: Paid broadcasts; subscription channels
› 2020: Direct messaging; LINE LIVE-VIEWING; Face2Face

Slide 12

2 Challenges in the COVID-19 Situation
› New feature development
› Performance improvement

Slide 13

New Features and Behind the Scenes
› LINE Face2Face
› LINE LIVE-VIEWING

Slide 14

LINE Face2Face
› An online "akushu-kai" (handshake event)
› 1-on-1 video communication with celebrities
› Pre-distributed codes are required to join

Slide 15

Overview of Face2Face
› Fans wait in a queue and are popped one at a time
› Each fan can talk to the idol during their own time slot (diagram timeline: 15s / 30s / 45s)
› The video call is implemented with RTMP
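
The pop-queue model above can be sketched as a simple FIFO queue with fixed talk slots. This is a minimal illustration, not the production code; the class and method names are hypothetical, and the 15-second slot length is taken from the slide's timeline.

```javascript
// Minimal sketch of the Face2Face fan queue (names are hypothetical).
// Fans wait in FIFO order; the host pops one fan at a time, and each
// fan gets a fixed talk slot (15 seconds in the slide's timeline).
class FanQueue {
  constructor(slotSeconds) {
    this.slotSeconds = slotSeconds;
    this.fans = [];
  }
  enqueue(fanId) {
    this.fans.push(fanId);
  }
  // Pop the next fan to talk with the idol.
  pop() {
    return this.fans.shift();
  }
  // Estimated wait for a queued fan: position × slot length.
  estimatedWaitSeconds(fanId) {
    const i = this.fans.indexOf(fanId);
    return i < 0 ? 0 : i * this.slotSeconds;
  }
  get size() { return this.fans.length; }
}

const q = new FanQueue(15);
['fanA', 'fanB', 'fanC'].forEach((f) => q.enqueue(f));
console.log(q.estimatedWaitSeconds('fanC')); // 30 — two slots ahead of fanC
console.log(q.pop());                        // fanA talks first
```

The FIFO discipline is what gives each fan a predictable "own time" window.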

Slide 16

Video Call via RTMP
› RTMP (Real-Time Messaging Protocol), runs over TCP
› Uses 2 RTMP streams
[Diagram] The Media Servers give RTMP URLs to both sides: the idol upstreams with the idol URL and plays back the fan URL, while the fan upstreams with the fan URL and plays back the idol URL.


Slide 19

Backend of Face2Face
Flow of Face2Face:
1. Enter the fan queue (after token and other auth)
2. Pop a fan from the queue
3. RTMP resource allocation & sharing
4. Start Face2Face & finish

Slide 20

Backend of Face2Face: 1. Enter the Fan Queue
[Diagram: Fan, Idol, Chat Servers, Batch Servers, API Servers, Media Servers, DB]
1. The fan submits tickets
2. The tickets are verified
3. The fan is pushed to the "Fan Queue"

Slide 21

Backend of Face2Face: 2. Pop a Fan from the Queue
4. A fan is popped from the Fan Queue

Slide 22

Backend of Face2Face: 3. RTMP Resource Allocation & Sharing
5. Acquire RTMP streams for the idol and the fan
6. Save the fan and stream info
7. Notify that the collaboration starts and hand over the idol's RTMP stream URL

Slide 23

Backend of Face2Face: 4. Start Face2Face & Finish
8. Stop Face2Face
9. Re-pop a fan from the Fan Queue
› Loop until the queue size is 0
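
Putting the numbered steps together, the backend loop might be sketched as below. This is a hedged reconstruction: the mediaServer/chatServer/db objects are hypothetical stand-ins for the real Media Server, Chat Server and DB components, not their actual APIs.

```javascript
// Sketch of the Face2Face backend loop from the slides (interfaces are
// illustrative stand-ins, not the real component APIs).
async function runFace2Face(queue, mediaServer, chatServer, db) {
  const served = [];
  // "Loop until queue size is 0"
  while (queue.length > 0) {
    const fan = queue.shift();                          // pop / re-pop a fan
    const streams = await mediaServer.allocate(fan);    // acquire RTMP streams
    await db.saveSession(fan, streams);                 // save fan and stream info
    await chatServer.notifyStart(fan, streams.idolUrl); // notify collaboration start
    served.push(fan);
    await chatServer.notifyStop(fan);                   // stop Face2Face
  }
  return served;
}

// Demo with in-memory stubs.
const mediaServer = {
  allocate: async (fan) => ({
    idolUrl: `rtmp://media/idol-${fan}`,
    fanUrl: `rtmp://media/fan-${fan}`,
  }),
};
const chatServer = { notifyStart: async () => {}, notifyStop: async () => {} };
const db = { saveSession: async () => {} };

runFace2Face(['fan1', 'fan2'], mediaServer, chatServer, db)
  .then((served) => console.log(served)); // [ 'fan1', 'fan2' ]
```

Each iteration covers slide steps 4-9 for exactly one fan, which matches the one-fan-at-a-time slot model.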

Slide 24

Development of Face2Face
› Utilized an existing feature: "Collaboration Broadcast"
› "Task-force style" development: developers joined planning more actively than usual

Slide 25

LINE LIVE-VIEWING
› Online ticket-based streaming
› Users can watch a broadcast by pre-purchasing an online ticket

Slide 26

Overview of LINE LIVE-VIEWING
[Diagram: Ticket sales site, API Servers, MySQL, LINE LIVE App]
1. The user buys a ticket on the ticket sales site
2. The API servers authorize the user (MySQL)
3. The app requests HLS files
4. The API servers check the permission
› The LINE user account is authorized by buying a ticket
› Users can watch the video via LINE, the LINE LIVE app and LINE LIVE Web

Slide 27

Traffic Spike Problem in LINE LIVE-VIEWING
› Ticket sales site: traffic spikes right after sales start because the number of tickets is limited
› Broadcast playback: traffic spikes right after the broadcast starts

Slide 28

High-Volume Traffic Handling on the Ticket Site
› Caching with Redis, an in-memory key-value data store
› Reads during the purchase flow ("Home", "Detail") rarely touch MySQL
› Approach modeled on ISUCON8
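
A minimal sketch of the cache-aside pattern described here, assuming a read-through cache in front of MySQL (a Map stands in for Redis, and the names are illustrative):

```javascript
// Cache-aside sketch: reads in the purchase flow are served from the
// cache, and MySQL is only touched on a cache miss.
class EventCache {
  constructor(fetchFromMySQL) {
    this.redis = new Map();          // stand-in for Redis
    this.fetchFromMySQL = fetchFromMySQL;
    this.dbReads = 0;                // counts fall-throughs to MySQL
  }
  getEvent(id) {
    if (this.redis.has(id)) return this.redis.get(id); // hit: no DB access
    this.dbReads += 1;
    const row = this.fetchFromMySQL(id);               // miss: read once
    this.redis.set(id, row);
    return row;
  }
}

const cache = new EventCache((id) => ({ id, title: 'event-' + id }));
cache.getEvent(1);
cache.getEvent(1);
cache.getEvent(1);
console.log(cache.dbReads); // 1 — repeated reads never reach MySQL
```

During a sales-start spike, almost all requests hit warm keys, so MySQL sees roughly one read per key rather than one per request.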

Slide 29

Cache for "Home"
› A batch job fetches the live broadcast list from MySQL once per minute
› The API serves it from a cache kept as a sorted list
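
The "Home" cache can be sketched as below: the batch side rebuilds a sorted list about once a minute, and the API side only ever reads the cache. The viewer-count sort key is an assumption for illustration; the real sort criterion is not stated on the slide.

```javascript
// "Home" cache sketch: a batch job refreshes a sorted list of live
// broadcasts; the API reads only the cached list, never MySQL.
const homeCache = { sortedLives: [] };

// Batch side: fetch from MySQL (stubbed) and rebuild the sorted list.
function refreshHomeCache(fetchLivesFromMySQL) {
  const lives = fetchLivesFromMySQL();
  homeCache.sortedLives = [...lives].sort((a, b) => b.viewers - a.viewers);
}

// API side: serve straight from the cache.
function getHome(limit) {
  return homeCache.sortedLives.slice(0, limit);
}

refreshHomeCache(() => [
  { id: 'a', viewers: 120 },
  { id: 'b', viewers: 45000 },
  { id: 'c', viewers: 3100 },
]);
console.log(getHome(2).map((l) => l.id)); // [ 'b', 'c' ]
```

The cache can be at most one refresh interval stale, which is an acceptable trade-off for a home feed.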

Slide 30

High-Volume Traffic Handling in Broadcast Playback
› Caching and async writes
› On "get broadcast URL", the API checks the permission and leaves a footprint of the user: all authorized user IDs are cached in Redis until the broadcast finishes
› After the broadcast finishes, a batch job fetches the footprints and updates the ticket status
› Kafka is not used here because it would create a huge number of Kafka jobs
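
The footprint pattern above can be sketched as follows; a Set stands in for the Redis set of user IDs, and the function names and 'USED' status are illustrative assumptions.

```javascript
// Footprint sketch: the playback hot path only writes the user ID into
// a set; MySQL ticket statuses are updated in one batch afterwards.
const footprints = new Set();   // stand-in for a Redis set of user IDs

// Hot path: called on every playback request, no MySQL write.
function recordFootprint(userId) {
  footprints.add(userId);
}

// Cold path: run by a batch job once the broadcast has finished.
function flushTicketStatuses(updateTicketInMySQL) {
  let updated = 0;
  for (const userId of footprints) {
    updateTicketInMySQL(userId, 'USED'); // one write per unique viewer
    updated += 1;
  }
  footprints.clear();
  return updated;
}

recordFootprint('u1');
recordFootprint('u2');
recordFootprint('u1');                      // duplicates collapse in the set
console.log(flushTicketStatuses(() => {})); // 2
```

Deferring the writes keeps the playback spike entirely off MySQL, without needing a per-request Kafka job.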


Slide 32

Performance Testing of the Ticket Purchase Flow
› Performed a ticket-purchase spike test
› Scenario-based test: /home => /detail => /reserve => /purchase
› Can handle over 200 purchases per second per server

Slide 33

Load Testing for Performance Visualization

Slide 34

Phase Change of LINE LIVE
› New feature: LINE LIVE-VIEWING and its "pre-purchase" model
› Traffic increase: service growth and the COVID-19 situation
› A big incident: caused by technical debt and a lack of performance measurement
Increasing demand for high reliability

Slide 35

Incident in Early 2020
› 1 hour of service downtime
› Caused by a MySQL connection handling problem combined with a Redis network bandwidth limitation
› The API server held a pooled MySQL connection until the API response was returned, and was blocked getting the cache because of the bandwidth limit
Not only normal testing, but also load testing is needed.
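
The failure mode can be illustrated with a toy reconstruction (the pool and cache objects are hypothetical stand-ins, not the real components): when a connection stays checked out across a blocked cache call, the pool drains.

```javascript
// Toy reconstruction of the incident's failure mode.
function makePool(size) {
  let inUse = 0;
  return {
    acquire() {
      if (inUse >= size) throw new Error('pool exhausted');
      inUse += 1;
    },
    release() { inUse -= 1; },
    get inUse() { return inUse; },
  };
}

// Anti-pattern: hold the connection across the (blocked) cache call.
async function handleBad(pool, slowCacheGet) {
  pool.acquire();
  await slowCacheGet();   // connection stays checked out while we wait
  pool.release();
}

// Fix: finish the DB work and release before talking to the cache.
async function handleGood(pool, slowCacheGet) {
  pool.acquire();
  pool.release();         // DB work done, give the connection back
  await slowCacheGet();
}

const never = () => new Promise(() => {}); // a cache call that blocks forever
const pool = makePool(2);
handleBad(pool, never);
handleBad(pool, never);
console.log(pool.inUse); // 2 — the pool is pinned; a third request would fail
```

With the fix, a slow Redis call delays responses but no longer exhausts the MySQL pool.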

Slide 36

Why Load Testing
› Goal: make the system handle large-scale broadcasts stably
› Process to achieve the goal: apply a large workload, analyze the results, find defects, fix defects

Slide 37

Why Load Testing
› Goal: make the system handle large-scale broadcasts stably
› Daily access: insufficient load
› Irregular spike access: long intervals between spikes
› Artificial spikes by load testing: sufficient load, on demand with no interval


Slide 40

Requirements for Load Testing
› Repeatability: easy for everyone to re-execute tests, under both the same and different conditions
› Edit-ability: easy to edit, create and share scenarios
› Understandability: easy to understand the result right after the test execution


Slide 44

Repeatability: ".stress" for Load Testing
› Engineers can execute tests via Slack
› Engineers can specify "load variables", e.g. the number of simultaneous viewers

Slide 45

Understandability: Result Notification
› The result is sent to Slack
› It consists of 3 types of info

Slide 46

Understandability: Result Notification (Binary Result)
› A summary of the result is sent to Slack: "Test Passed" or "Test Failed"

Slide 47

Understandability: What Counts as "Passed"?
› All HTTP status codes are 200
› The 99th percentile of API response time is lower than a threshold
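
The pass/fail rule can be sketched as a small function over collected samples. The nearest-rank percentile method and the sample shape are assumptions for illustration; the slide only states the two criteria.

```javascript
// Sketch of the "Passed" rule: every response is 200 and the p99
// latency stays under a threshold (nearest-rank percentile assumed).
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // nearest-rank method
  return sorted[Math.max(0, rank - 1)];
}

function isPassed(samples, p99ThresholdMs) {
  const allOk = samples.every((s) => s.status === 200);
  const p99 = percentile(samples.map((s) => s.ms), 99);
  return allOk && p99 <= p99ThresholdMs;
}

const samples = Array.from({ length: 100 }, (_, i) => ({ status: 200, ms: i + 1 }));
console.log(isPassed(samples, 100));                               // true: p99 = 99ms
console.log(isPassed([...samples, { status: 500, ms: 10 }], 100)); // false: one non-200
```

A binary verdict like this is what makes the Slack notification readable at a glance.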

Slide 48

Understandability: Result Notification (Client-side Metrics)
› Client-side metrics are included: RPS and the response time of each API

Slide 49

Understandability: Result Notification (Server-side Metrics)
› Server-side metrics are shared as a Grafana dashboard: load average, CPU usage, active MySQL connection count, etc.

Slide 50

Edit-ability: Scenario Updating
› Just push to the scenario repository; scenarios are synced automatically

Slide 51

Open-Source Load Testing Engine: k6
› CLI-based open-source load testing tool
› Scenarios are scripted in JavaScript (ES2015/ES6)
https://k6.io/
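
As a hedged illustration, a minimal k6 scenario in this style could look as follows; the endpoint URL, VU count, duration and threshold values are assumed placeholders, not the production settings. It runs under the k6 CLI (`k6 run scenario.js`), not plain Node.js.

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  vus: 100,            // "load variable": simultaneous virtual users
  duration: '1m',
  thresholds: {
    http_req_failed: ['rate==0'],      // every request must succeed
    http_req_duration: ['p(99)<500'],  // p99 latency under 500 ms (assumed)
  },
};

export default function () {
  const res = http.get('https://example.test/home'); // placeholder URL
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1);
}
```

The `thresholds` block mirrors the two "Passed" criteria (all 200s, p99 under a limit), so k6 itself can report the binary pass/fail result.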

Slide 52

Load Testing Architecture
[Diagram] A Slack bot receives ".stress" with a scenario ID and calls the Load Test Manager API; the manager syncs scenario files from GitHub and drives the Load Test Nodes against the target servers (which run exporters); an Alert Manager detects dashboard URLs and sends images back to Slack; metrics come from the datasource.


Slide 56

2 Types of Load Testing
› Single-API tests: visualize a single API's performance; not every API, only frequently called ones (e.g. Broadcast Detail API, Authentication API, Channel API)
› Scenario-based tests: visualize system performance against a particular scenario, e.g. a LIVE-VIEWING broadcast playback spike from app users

Slide 57

Test Environment
› The load test environment uses 10% of the servers of each component (the real environment has >200 servers)
› A 100% replica would be too costly
› A single instance can't put enough stress on MySQL and Redis

Slide 58

Example of Performance Degradation Detection
› N+1 query executions in a frequently called API, found from the load test result (defect shown as pseudo code)
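
The slide shows the defect only as pseudo code, so here is a hedged sketch of the N+1 pattern and its batched fix; the table and column names are illustrative, and query counting is simulated with a stub.

```javascript
// N+1 sketch: one query per broadcast for its channel, versus a single
// IN (...) query for all channels at once.
function fetchChannelsNPlusOne(broadcasts, db) {
  // N queries, one per broadcast — this is the defect
  return broadcasts.map((b) =>
    db.query('SELECT * FROM channel WHERE id = ?', [b.channelId]));
}

function fetchChannelsBatched(broadcasts, db) {
  // a single IN (...) query for all channels
  const ids = broadcasts.map((b) => b.channelId);
  return db.query('SELECT * FROM channel WHERE id IN (?)', [ids]);
}

// Stub DB that just counts queries.
const db = {
  queries: 0,
  query(sql, params) { this.queries += 1; return params; },
};
const broadcasts = Array.from({ length: 50 }, (_, i) => ({ channelId: i }));

fetchChannelsNPlusOne(broadcasts, db);
console.log(db.queries); // 50 — one query per broadcast
db.queries = 0;
fetchChannelsBatched(broadcasts, db);
console.log(db.queries); // 1
```

Under a load test the N+1 version shows up as query volume scaling with list size, which is exactly the degradation the test surfaced.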

Slide 59

Example of Performance Degradation Detection
› The MySQL connection handling problem was detected AGAIN

Slide 60

Summary
› New features: LINE Face2Face and LINE LIVE-VIEWING
› Load testing in LINE LIVE: repeatability, understandability and edit-ability

Slide 61

Thank you!