last N: Relevance-Based Selectivity for Forwarding Video in Multimedia Conferences

Presented by Boris Grozev, jitsi.org at NOSSDAV 2015.

Varun Singh

March 20, 2015

Transcript

  1. Last N: Relevance-Based Selectivity for Forwarding Video in Multimedia Conferences
    Boris Grozev, Jitsi.org, University of Strasbourg
    Lyubomir Marinov, Jitsi.org
    Varun Singh, Aalto University, Finland
    Emil Ivov, Jitsi.org
    20/03/15

  2. SCALING VIDEO CONFERENCES
    •  Goals:
    – Bigger conferences
    – More on one server
    •  What kind of conferences:
    – Centralized
    – WebRTC
    – RTP
    – Meetings, presentations
    – Dynamic, interactive

  3. [Diagram: endpoints A, B and C connected to Jitsi Videobridge, contrasting forwarding (SFU) with mixing (MCU)]

  4. JITSI VIDEOBRIDGE
    [Diagram: endpoints A, B and C connected to Jitsi Videobridge acting as an SFU]
    •  WebRTC-compatible video router
    •  ICE; DTLS-SRTP; SRTP; SCTP
    •  RTCP termination
    –  RR, REMB
    –  SR
    •  SRTP termination
    –  (Relatively) expensive

  5. PROBLEMS (1): ON THE CLIENTS
    For a conference of 100:
    1.  Bandwidth (down): 99 streams * 1 Mbps
    2.  CPU usage: live decoding of 99 streams
    3.  User interface: no space for 99 video elements

  6. PROBLEMS (2): ON THE BRIDGE
    For a conference of 100:
    1.  Downstream bandwidth
    •  Proportional to K
    •  100 endpoints = 100 Mbps
    2.  Upstream bandwidth
    •  Proportional to K²
    •  100 endpoints = 9900 streams = 9.9 Gbps
    3.  CPU
    •  Proportional to the total bitrate
    •  Proportional to K²
    [Diagram: endpoints A, B and C connected to Jitsi Videobridge]
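The figures on this slide follow from simple counting. The sketch below assumes, as the slides do, that every endpoint sends one video stream of 1 Mbps and that the bridge forwards each stream to every other endpoint. The function name and the dictionary keys are illustrative, not part of Jitsi Videobridge.

```python
def full_forwarding_load(k, stream_mbps=1.0):
    """Bridge load when every stream is forwarded to every other endpoint."""
    streams_in = k              # the bridge receives one stream per endpoint
    streams_out = k * (k - 1)   # each endpoint receives everyone else's stream
    return {
        "streams_in": streams_in,
        "streams_out": streams_out,
        "mbps_in": streams_in * stream_mbps,
        "mbps_out": streams_out * stream_mbps,
    }


load = full_forwarding_load(100)
print(load["streams_out"], load["mbps_out"])  # 9900 9900.0
```

For K = 100 this gives 100 Mbps into the bridge and 9900 Mbps (9.9 Gbps) out of it, matching the slide.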

  7. LAST N
    [Diagram: Jitsi Videobridge forwarding video with N = 2]
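The slide itself shows only a diagram, so the following is a hedged sketch of one way the selection could work, assuming that Last N means forwarding video only from the N endpoints that were most recently the dominant speaker. The class and method names are hypothetical and do not come from Jitsi Videobridge.

```python
from typing import List


class LastNSelector:
    """Orders endpoints by how recently each was the dominant speaker."""

    def __init__(self, endpoints: List[str], n: int) -> None:
        self.n = n
        self.order = list(endpoints)  # most recent dominant speaker first

    def on_dominant_speaker(self, endpoint: str) -> None:
        # Move the new dominant speaker to the front of the list.
        if endpoint in self.order:
            self.order.remove(endpoint)
        self.order.insert(0, endpoint)

    def forwarded_to(self, receiver: str) -> List[str]:
        # A receiver never gets its own video, so skip it before cutting to N.
        others = [e for e in self.order if e != receiver]
        return others[: self.n]


selector = LastNSelector(["A", "B", "C", "D"], n=2)
selector.on_dominant_speaker("C")
selector.on_dominant_speaker("D")
print(selector.forwarded_to("A"))  # ['D', 'C']
print(selector.forwarded_to("D"))  # ['C', 'A']
```

Keeping a single ordered list shared by all receivers means a speaker change costs one list update, independent of the conference size.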

  8. LAST N: EFFECT
    K = 100; N = 5
    Clients:
    Receive/decode/render 5 streams (was 99)
    SFU:
    Downstream: still 100 streams (100 Mbps)
    Upstream: K * N = 500 Mbps (was 9.9 Gbps)
    N is constant:
    Linear with K (was K²)
    [Diagram: endpoints B and C connected to Jitsi Videobridge]
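The same counting can be repeated with Last N enabled. This is a minimal sketch under the slide's assumptions (1 Mbps per stream, every endpoint still sending); the function name and the cap at K - 1 streams per receiver are illustrative additions.

```python
def last_n_load(k, n, stream_mbps=1.0):
    """Bridge load when each endpoint receives at most N forwarded streams."""
    per_receiver = min(n, k - 1)      # nobody receives more than K - 1 streams
    streams_out = k * per_receiver    # grows linearly in K for a fixed N
    return {
        "streams_in": k,              # without pausing, everyone still sends
        "streams_out": streams_out,
        "mbps_in": k * stream_mbps,
        "mbps_out": streams_out * stream_mbps,
    }


load = last_n_load(100, 5)
print(load["streams_out"], load["mbps_out"])  # 500 500.0
```

With K = 100 and N = 5 the bridge sends 500 Mbps instead of 9.9 Gbps, while the incoming 100 Mbps is unchanged.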

  9. LAST N: PAUSING
    Clients:
    No encoding, no upstream
    SFU:
    Downstream: N+1 instead of K
    [Diagram: Jitsi Videobridge with paused endpoints]
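One plausible reading of the N+1 figure: the N most recent speakers are shown to everyone else, and each of those N speakers, who does not receive its own video, is shown one further endpoint. The sketch below assumes a list ordered by most recent dominant speaker; the function names are hypothetical.

```python
from typing import List


def senders_needed(speaker_order: List[str], n: int) -> List[str]:
    """Endpoints whose video at least one receiver still gets under Last N."""
    return speaker_order[: n + 1]


def pausable(speaker_order: List[str], n: int) -> List[str]:
    """Endpoints that nobody receives and that could stop encoding."""
    return speaker_order[n + 1:]


order = ["ep%d" % i for i in range(100)]  # most recent speaker first
active = senders_needed(order, 5)
paused = pausable(order, 5)
print(len(active), len(paused))  # 6 94
```

For K = 100 and N = 5 the bridge would receive 6 streams instead of 100.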

  10. DOMINANT SPEAKER IDENTIFICATION
    •  Requirement for Last N
    •  The naïve approach doesn’t work
    – Different microphones / configuration
    – Different sound levels in the environment
    – Coughs

  11. DOMINANT SPEAKER IDENTIFICATION
    •  State of the art: I. Volfin and I. Cohen, 2013 [1]
    •  Maintains a dominant speaker (DS)
    –  Others compete
    –  Detects changes to the DS
    •  Computes scores over intervals of different lengths
    –  Short, medium, long
    –  Thresholds at each interval
    •  Works with audio in the frequency domain
    –  Requires decoding
    [1] I. Volfin and I. Cohen, "Dominant speaker identification for multipoint videoconferencing," Computer Speech and Language, vol. 27, no. 4, June 2013.
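The competition model can be illustrated with a deliberately simplified sketch. It is not the scoring of Volfin and Cohen: it assumes a per-frame active/inactive flag, 20 ms frames, windows of 1, 5 and 50 frames (the 20 ms, 100 ms and 1000 ms intervals named in this deck), and made-up margins. All names and constants are hypothetical.

```python
from collections import deque
from typing import Deque

# Window lengths in 20 ms frames: short, medium, long (assumed frame size).
WINDOWS = (1, 5, 50)
# Hypothetical margins; the real thresholds are not given on the slides.
MARGINS = (0.5, 0.4, 0.3)


class Activity:
    """Recent per-frame speech activity of one endpoint (illustrative only)."""

    def __init__(self) -> None:
        self.frames: Deque[bool] = deque(maxlen=max(WINDOWS))

    def add(self, active: bool) -> None:
        self.frames.append(active)

    def score(self, window: int) -> float:
        # Fraction of active frames among the most recent `window` frames.
        recent = list(self.frames)[-window:]
        return sum(recent) / window


def challenger_wins(dominant: Activity, challenger: Activity) -> bool:
    # The challenger must beat the dominant speaker on every interval.
    return all(
        challenger.score(w) - dominant.score(w) >= m
        for w, m in zip(WINDOWS, MARGINS)
    )


dominant, talker, cougher = Activity(), Activity(), Activity()
for i in range(50):
    dominant.add(False)    # the current dominant speaker has gone quiet
    talker.add(True)       # sustained speech
    cougher.add(i == 49)   # a single loud frame at the very end

print(challenger_wins(dominant, talker))   # True
print(challenger_wins(dominant, cougher))  # False
```

Requiring a win on all three intervals is what keeps a short burst such as a cough from taking over: it scores well on the short window but not on the medium and long ones.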

  12. DOMINANT SPEAKER IDENTIFICATION
    •  RFC 6464: Client-to-Mixer Audio Level Indication
    – RTP header extension
    – 7 bits that indicate the level of the audio in an RTP packet
    •  Adapt Volfin and Cohen 2013
    – Same competition model
    – Same intervals
    •  Short (20ms), medium (100ms) and long (1000ms)
    – Same division in sub-bands
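Reading the audio level does not require decoding the audio. In RFC 6464 the extension payload is a single byte: the top bit is a voice activity flag and the low 7 bits hold the level in -dBov, so 0 is the loudest value and 127 the quietest. The sketch below decodes that byte only; locating the extension inside the RTP header is left out, and the function name is an assumption.

```python
from typing import Tuple


def parse_audio_level(ext_byte: int) -> Tuple[bool, int]:
    """Split an RFC 6464 payload byte into (voice_activity, level_in_dBov)."""
    if not 0 <= ext_byte <= 0xFF:
        raise ValueError("expected a single byte")
    voice_activity = bool(ext_byte & 0x80)  # V flag: most significant bit
    level = ext_byte & 0x7F                 # 7-bit magnitude, 0..127
    return voice_activity, -level           # carried as -dBov on the wire


print(parse_audio_level(0x9E))  # (True, -30)
print(parse_audio_level(127))   # (False, -127)
```

Because the level travels in the RTP header rather than in the encrypted payload, a bridge can use it as input to speaker identification without running an audio decoder.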

  13. TESTBED
    [Diagram labels: Jitsi Videobridge; Jitsi Hammer; K; RECV ONLY; quad-core Xeon, 3.7 GHz]

  14. [Plot: CPU usage (%) vs. bitrate (Mbps) for K = 10, 15, 20, 25, 29, 33]
    Data points (bitrate, CPU usage):
    (47.6 Mbps, 3.1%)
    (110.3 Mbps, 5.1%)
    (199.4 Mbps, 8.0%)
    (314.7 Mbps, 11.7%)
    (425.5 Mbps, 15.7%)
    (550.4 Mbps, 20.3%)
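The six labelled points can be checked against the claim that CPU usage is proportional to the total bitrate. The sketch below fits a least-squares line through them in plain Python. The fit is an illustration added here, not an analysis from the talk, and the function name is hypothetical.

```python
# (bitrate in Mbps, CPU usage in %) as labelled on the plot.
points = [
    (47.6, 3.1),
    (110.3, 5.1),
    (199.4, 8.0),
    (314.7, 11.7),
    (425.5, 15.7),
    (550.4, 20.3),
]


def fit_line(data):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(points)
print("%.4f %% CPU per Mbps, intercept %.2f %%" % (slope, intercept))
```

The slope comes out at a few hundredths of a percent of CPU per Mbps, consistent with a roughly linear relationship over the measured range.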

  15. [Plot: outbound bitrate (Mbps) vs. number of endpoints (K); series: n=3, n=5, n=8, n=-1]

  16. Pausing
    [Plot: bitrate (Mbps) vs. Last N; series: video not paused, video paused]

  17. Pausing
    [Plot: CPU usage (%) vs. Last N; series: video not paused, video paused]

  18. CONCLUSION
    •  Conferences with forwarding use a lot of bandwidth
    •  The bottleneck at the server is either the CPU or the network
    –  In most cases the network
    •  Cutting down the number of forwarded streams to a constant works as expected
    •  Dominant speaker identification (DSI) can be used to maintain interactivity in the conference
    •  DSI can be performed without decoding audio
    •  Future work
    –  Adaptive Last N

  19. THANK YOU!