Slide 1

Why do broadcasters suddenly care about IP – it’s not why you think!
Kieran Kunhya – [email protected] @openbroadcastsy

Slide 2

Who am I, who are we?
• I work on FFmpeg, x264 and others…
• A lot of the professional video code in OSS probably has my fingerprints on it.
• At $job, Open Broadcast Systems builds software for broadcasters, mainly around point-to-point video encoding/decoding for news/sport etc...
• Not to be confused with:

Slide 3

What I will talk about
• Moving live broadcast video production (long before it reaches the home) to IP.
• Lots of material is available about standards bodies, working groups, industry alliances, roadmaps, procedures.
• This talk is about engineering, not bureaucracy.
• The crossroads of high-performance video programming and high-performance network programming.

Slide 4

Live broadcast production processes (1)
• Processes in black boxes, e.g. routing, graphics, switching, mixing, recording, monitoring, playout, subtitling, standards conversion etc…
• Infrastructure as complex as delivery, if not more so.

Slide 5

Live broadcast production processes (2)
• Heavily hardware (FPGA/DSP) centric.
• Fixed-function, black-box products.
• Low-latency processes in the studio.
• “Video lines” of latency – on the order of 10–100 µs.
• Uncompressed video – high data rates, many Gbps.
• Legacy usage of satellite, fibre, SDI, ASI.
• Includes premium live web video!
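As a rough back-of-the-envelope sketch of the data rates and line timings involved (assuming 1080p50, 4:2:2, 10-bit, active picture only; exact figures depend on the format):

    #include <stdio.h>

    int main(void)
    {
        /* Illustrative figures: 1080p50, 4:2:2, 10-bit, active picture only. */
        const double width = 1920, height = 1080, fps = 50.0;
        const double samples_per_pixel = 2.0;  /* 4:2:2 = 1 luma + 0.5 Cb + 0.5 Cr */
        const double bits_per_sample = 10.0;

        double bits_per_frame = width * height * samples_per_pixel * bits_per_sample;
        double gbps = bits_per_frame * fps / 1e9;
        double line_us = 1e6 / (fps * height);   /* per active line, ignoring blanking */

        printf("Active video rate: ~%.2f Gbps\n", gbps);  /* ~2.07 Gbps, hence 3G-SDI */
        printf("One video line:    ~%.1f us\n", line_us); /* ~18.5 us */
        return 0;
    }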

Slide 6

There’s more to IP video than just web video!
• A push to move these processes to IP.
• Allows for lower costs and innovation in live broadcasting.
• Broadcasters have been building facilities that quickly or immediately become obsolete (but remain usable day-to-day).
• More detail later…

Slide 7

Not just changing the cables!
• Hardware vendors aiming to just change the cables:
• Put a converter before/after the product. Done.
• Invent the internet and use it to make more phone calls and send more faxes!
• New vendors: move to a software-based architecture, allow scalability, reduce costs through economies of scale (tl;dr move fast and break things).

Slide 8

The two cultures
• Software
  • Asynchronous processing
  • Wide timings (ms)
• Hardware
  • Deterministic processes
  • Precise timings (ns, µs)

Slide 9

Video contribution
• Getting content from a remote place to one or more central places, often a studio or aggregation centre.
• Most often performed in the compressed domain.
• A microcosm of the uncompressed environment, but simpler to understand.

Slide 10

Traditional video contribution
• Satellite: unidirectional, single-feed.
  • Expensive but very reliable.
• Fibre
  • Often using legacy/proprietary telco protocols (DTM).
  • Optical networking.
• IP relatively new, often via MPLS.

Slide 11

Video contribution protocols (1)
• UDP was meant for (professional) video!
  • Allows multicast in a closed network.
  • Low latency (no waiting for retransmits).
  • No TCP throughput issues – can use your own rate control, retransmits, FEC, whatever.
• TCP web video is a bizarre anachronism.
  • The QUIC protocol is finally getting it…
  • Even more relevant with cellular, Wi-Fi etc.
  • Will BBR save TCP?
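As a minimal sketch of how such a feed is commonly carried (assuming the usual convention of seven 188-byte MPEG-TS packets per UDP datagram; the multicast address and port here are placeholders):

    #include <arpa/inet.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #define TS_PACKET_SIZE  188
    #define TS_PER_DATAGRAM 7          /* 7 * 188 = 1316 bytes, fits a 1500-byte MTU */

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        /* Placeholder multicast group/port for the contribution feed. */
        struct sockaddr_in dst = { .sin_family = AF_INET, .sin_port = htons(5000) };
        inet_pton(AF_INET, "239.1.1.1", &dst.sin_addr);

        /* Keep the multicast scope inside the closed network. */
        unsigned char ttl = 16;
        setsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof ttl);

        unsigned char datagram[TS_PER_DATAGRAM * TS_PACKET_SIZE];
        memset(datagram, 0xFF, sizeof datagram);   /* stand-in for real TS packets */

        /* A real encoder paces these sends against the stream's mux rate. */
        sendto(fd, datagram, sizeof datagram, 0, (struct sockaddr *)&dst, sizeof dst);

        close(fd);
        return 0;
    }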

Slide 12

Video contribution protocols (2)
• MPEG-TS is the go-to container in the professional world.
• Allows exact signalling of the VBV for defined buffering and latency.
• Timing model relatively precise – not single-frame like WebRTC, or $finger_in_air like RTMP/HLS/DASH.
• PCR (clock reference) used to resync and resample audio to the facility clock. Can use the audio clock instead (hacky).
• Easier to carry legacy stuff.
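For illustration, a minimal sketch of pulling the PCR out of a TS packet's adaptation field (the 33-bit base ticks at 90 kHz and the 9-bit extension at 27 MHz, so the combined value is in 27 MHz units):

    #include <stdint.h>

    /* Return the PCR in 27 MHz units, or -1 if this 188-byte packet has no PCR. */
    static int64_t ts_packet_pcr(const uint8_t *pkt)
    {
        if (pkt[0] != 0x47)
            return -1;                        /* missing sync byte */
        if (!(pkt[3] & 0x20) || pkt[4] < 7)
            return -1;                        /* no adaptation field, or too short */
        if (!(pkt[5] & 0x10))
            return -1;                        /* PCR_flag not set */

        const uint8_t *p = &pkt[6];
        uint64_t base = ((uint64_t)p[0] << 25) | ((uint64_t)p[1] << 17) |
                        ((uint64_t)p[2] << 9)  | ((uint64_t)p[3] << 1)  |
                        (p[4] >> 7);                            /* 33 bits, 90 kHz */
        uint64_t ext  = ((uint64_t)(p[4] & 0x01) << 8) | p[5];  /*  9 bits, 27 MHz */

        return (int64_t)(base * 300 + ext);
    }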

Slide 13

Video contribution protocols (3)
• Latency
  • ~300 ms glass-to-glass is straightforward in software.
  • Hardware can get down to ~20 ms with JPEG 2000 or VC-2, encoding before the entire frame has arrived.
  • Work in FFmpeg on sub-frame latency.
• Many protocols/services for unmanaged IP transport:

Slide 14

Using unmanaged IP for contribution (1)
• SMPTE 2022-1/2 FEC
  • XOR-based matrix (adds 2 × matrix latency).
  • Basic, but wide support (albeit many broken implementations).
(Diagram: row FEC and column FEC across the packet matrix)
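A toy sketch of the idea behind the column FEC (not the exact SMPTE 2022-1 packet format): the FEC packet for a column is the XOR of all payloads in that column, so any single missing packet can be rebuilt from the survivors plus the FEC packet.

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define PAYLOAD 1316   /* e.g. 7 TS packets per RTP payload */

    /* XOR the payloads of one column of `rows` packets into `fec`. */
    static void fec_column_encode(const uint8_t pkts[][PAYLOAD], size_t rows,
                                  uint8_t fec[PAYLOAD])
    {
        memset(fec, 0, PAYLOAD);
        for (size_t i = 0; i < rows; i++)
            for (size_t j = 0; j < PAYLOAD; j++)
                fec[j] ^= pkts[i][j];
    }

    /* Rebuild the one lost packet `lost`: XORing the survivors with the FEC
     * packet cancels everything except the missing payload. */
    static void fec_column_recover(const uint8_t pkts[][PAYLOAD], size_t rows,
                                   size_t lost, const uint8_t fec[PAYLOAD],
                                   uint8_t out[PAYLOAD])
    {
        memcpy(out, fec, PAYLOAD);
        for (size_t i = 0; i < rows; i++)
            if (i != lost)
                for (size_t j = 0; j < PAYLOAD; j++)
                    out[j] ^= pkts[i][j];
    }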

Slide 15

Using unmanaged IP for contribution (2)
• Retransmits (aka ARQ)
  • Receiver requests the sender to transmit a copy of the lost packet.
  • Affected by round-trip latency.
  • Negative acknowledgement.
(Diagram: sender and receiver exchanging packets and NACKs)
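A minimal sketch of the receive side of a NACK-based ARQ scheme, assuming RTP-style 16-bit sequence numbers; nack_send() is a placeholder for whatever request message the protocol uses, and a real implementation would also tolerate reordering and bound its waiting against the round-trip time:

    #include <stdbool.h>
    #include <stdint.h>

    /* Placeholder: ask the sender to retransmit sequence number `seq`. */
    void nack_send(uint16_t seq);

    static uint16_t expected_seq;
    static bool     have_first;

    /* Call once per received packet, in arrival order. */
    static void arq_on_packet(uint16_t seq)
    {
        if (!have_first) {
            have_first   = true;
            expected_seq = (uint16_t)(seq + 1);
            return;
        }

        /* Request every sequence number we skipped (16-bit arithmetic wraps). */
        while (expected_seq != seq) {
            nack_send(expected_seq);
            expected_seq++;
        }
        expected_seq = (uint16_t)(seq + 1);
    }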

Slide 16

Using unmanaged IP for contribution (3)
• Dual-pathing (SMPTE 2022-7)
• Hitless switching – the same stream is sent down two paths and merged seamlessly at the receiver.
(Diagram: Path 1 and Path 2 merged into a single output)
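A toy sketch of the receive-side merge (hedged: a real SMPTE 2022-7 receiver aligns the two paths within a defined buffer window; here a small seen-bitmap simply delivers the first copy of each RTP sequence number to arrive on either path):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define WINDOW 1024   /* how many recent sequence numbers we remember */

    static bool seen[WINDOW];

    /* Placeholder for handing a packet to the rest of the pipeline. */
    void output_packet(const uint8_t *pkt, size_t len);

    /* Called for packets from either path; emits each sequence number once. */
    static void hitless_receive(uint16_t seq, const uint8_t *pkt, size_t len)
    {
        size_t slot = seq % WINDOW;

        if (seen[slot])
            return;                    /* already delivered from the other path */

        seen[slot] = true;
        output_packet(pkt, len);

        /* Clear a slot half a window ahead so it can be reused after the
         * 16-bit sequence counter wraps (assumes path skew < WINDOW/2 packets). */
        seen[(slot + WINDOW / 2) % WINDOW] = false;
    }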

Slide 17

Example: NFL European Feed
• Using software-based infrastructure for encoding, delivery and decoding.
• Transported using multiple generic IP connections.
• Delivered the NFL European Feed (inc. Super Bowl) at 40 Mbit/s.

Slide 18

Remote (at-home) production – IP network, not the cloud!
• Send all content back to base, and produce remotely.
• Saves lots of $$$ on hotels, planes etc.
• The groundwork for personalised live events.
• Can’t do that with traditional single “world-feed” broadcasts.

Slide 19

The live production environment
• Largely SDI (coax) based
  • Unidirectional, Gbps video.
  • Latency on the order of video lines.
  • Single video signal down each cable.
• Traditional SDI crosspoint routers
  • Maximum 1152×1152.
  • Limited cable lengths (esp. UHD).
  • Routers getting full – so much content, quad-link UHD etc…

Slide 20

From SDI to SFPs
• Economies of scale in the networking world.
• 10 Gbps ubiquitous, 100 Gbps affordable.
• UHD-upgradable, much larger than SDI routers.
• Some false starts: media-specific network switches…

Slide 21

Going SDI-over-IP in software
• Map the SDI datastream directly to IP packets (SMPTE 2022-6).
• Audio packets (on the left of the diagram).
• NTSC fields (separate!).
• Old blanking intervals.
• Analogue pulses.
• Hipster container format?

Slide 22

Software SDI challenges
Where to start…
• CRC not software-centric (10-bit data, 25-bit polynomial).
• A pixel can span a packet…
• Very tedious to build a frame correctly; lots of legacy.
• Difficult to verify – the tools are all hardware-based.
• (and lots of other implementation details)

Slide 23

Pixel formats
Only the YUV 4:2:2 domain (as an example)!
• Planar 10-bit – main working format
• Planar 8-bit – preview quality
• UYVY 10-bit (16-bit aligned) – SDI datastream
• Apple v210 – hardware
• Contiguous 10-bit – 2022-6/RFC4175 packing
Tricky to work with in software.
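For a feel of why these packings are awkward, a plain-C sketch of unpacking v210 (three 10-bit components per little-endian 32-bit word, six pixels per 16 bytes) into 10-bit-in-16-bit planes; the hand-written SIMD on the next slide does the same job far faster:

    #include <stdint.h>

    /* Unpack one group of 6 pixels (16 bytes of v210) into planar Y/Cb/Cr.
     * Component order across the four words is Cb Y Cr, Y Cb Y, Cr Y Cb, Y Cr Y. */
    static void v210_unpack_group(const uint8_t *src,
                                  uint16_t y[6], uint16_t cb[3], uint16_t cr[3])
    {
        uint32_t w[4];
        for (int i = 0; i < 4; i++)
            w[i] = (uint32_t)src[4*i]           | (uint32_t)src[4*i + 1] << 8 |
                   (uint32_t)src[4*i + 2] << 16 | (uint32_t)src[4*i + 3] << 24;

        cb[0] =  w[0]        & 0x3FF; y[0]  = (w[0] >> 10) & 0x3FF; cr[0] = (w[0] >> 20) & 0x3FF;
        y[1]  =  w[1]        & 0x3FF; cb[1] = (w[1] >> 10) & 0x3FF; y[2]  = (w[1] >> 20) & 0x3FF;
        cr[1] =  w[2]        & 0x3FF; y[3]  = (w[2] >> 10) & 0x3FF; cb[2] = (w[2] >> 20) & 0x3FF;
        y[4]  =  w[3]        & 0x3FF; cr[2] = (w[3] >> 10) & 0x3FF; y[5]  = (w[3] >> 20) & 0x3FF;
    }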

Slide 24

Pixel formats
• Handwritten (no intrinsics!) SIMD for every mapping (and others).
• 5-15x speed improvements compared to C.
• Do it once, make it fast once and for all (until a new CPU…).
• A generic conversion library is a difficult problem:
  • Intermediate pixel format(s) always a compromise.
  • Add special cases until you’ve done them all!

Slide 25

No content

Slide 26

Kernel Bypass (1)
Bypass the operating system – DPDK, Netmap, Registered I/O… and others.
Revisiting network I/O APIs: the netmap framework (Rizzo, Communications of the ACM, March 2012, Vol. 55, No. 3)
• Ethernet speeds growing much faster than CPU clock speeds.
• Costly OS overheads: dynamically allocated buffers, copying, system calls.
• Userspace refcounting costly.

Slide 27

Kernel Bypass (2)
Revisiting network I/O APIs: the netmap framework (Rizzo, Communications of the ACM, March 2012, Vol. 55, No. 3)
• Netmap uses simple, statically allocated packet buffers and descriptors.
• A simple, lower-level interface to NIC ring buffers.
• Lets the user batch many packets per system call.
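A minimal receive-loop sketch using netmap's helper API (the interface name "netmap:eth0" and the handle_packet() callback are placeholders; one poll() makes a whole batch of frames available, which are then read in place):

    #define NETMAP_WITH_LIBS
    #include <net/netmap_user.h>
    #include <poll.h>

    /* Placeholder for the rest of the pipeline (depacketise, convert, ...). */
    void handle_packet(const unsigned char *buf, unsigned int len);

    int main(void)
    {
        /* Attach to the NIC's rings; the kernel stack no longer sees eth0. */
        struct nm_desc *d = nm_open("netmap:eth0", NULL, 0, NULL);
        if (!d)
            return 1;

        struct pollfd pfd = { .fd = d->fd, .events = POLLIN };

        for (;;) {
            poll(&pfd, 1, -1);                  /* one syscall per batch of packets */

            struct nm_pkthdr hdr;
            unsigned char *buf;
            while ((buf = nm_nextpkt(d, &hdr)) != NULL)
                handle_packet(buf, hdr.len);    /* zero-copy: frame read in place */
        }

        nm_close(d);
        return 0;
    }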

Slide 28

Kernel Bypass (3)
Zero-copy SIMD straight into DMA’d network card memory.
50-100x speed improvements over naïve implementations!
https://www.youtube.com/watch?v=yLL8wl8YUwA
No network stack – you are the network stack
• Craft Ethernet, IP, UDP headers yourself.
• No ARP, hardcoded MAC.
• Handle most of this in userspace, no separation.
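A minimal sketch of crafting the Ethernet/IPv4/UDP headers by hand before handing the frame to the NIC (the MAC and IP addresses are hard-coded placeholders; the UDP checksum is left at zero, which IPv4 permits):

    #include <arpa/inet.h>
    #include <stdint.h>
    #include <string.h>

    /* RFC 1071-style ones'-complement checksum over the IPv4 header. */
    static uint16_t ip_checksum(const void *data, size_t len)
    {
        const uint16_t *p = data;
        uint32_t sum = 0;
        for (; len > 1; len -= 2)
            sum += *p++;
        if (len)
            sum += *(const uint8_t *)p;
        while (sum >> 16)
            sum = (sum & 0xFFFF) + (sum >> 16);
        return (uint16_t)~sum;
    }

    /* Build an Ethernet + IPv4 + UDP frame around `payload`; returns its length. */
    static size_t build_frame(uint8_t *frame, const uint8_t *payload, size_t plen)
    {
        static const uint8_t dst_mac[6] = { 0x01,0x00,0x5e,0x01,0x01,0x01 }; /* multicast MAC */
        static const uint8_t src_mac[6] = { 0x02,0x00,0x00,0x00,0x00,0x01 }; /* placeholder  */

        uint8_t *eth = frame, *ip = frame + 14, *udp = ip + 20;

        /* Ethernet header */
        memcpy(eth, dst_mac, 6);
        memcpy(eth + 6, src_mac, 6);
        eth[12] = 0x08; eth[13] = 0x00;                   /* EtherType: IPv4 */

        /* IPv4 header, no options */
        memset(ip, 0, 20);
        ip[0] = 0x45;                                     /* version 4, IHL 5 */
        uint16_t tot = htons((uint16_t)(20 + 8 + plen));
        memcpy(ip + 2, &tot, 2);
        ip[8] = 16;                                       /* TTL */
        ip[9] = 17;                                       /* protocol: UDP */
        uint32_t saddr = htonl(0x0A000001);               /* 10.0.0.1  (placeholder) */
        uint32_t daddr = htonl(0xEF010101);               /* 239.1.1.1 (placeholder) */
        memcpy(ip + 12, &saddr, 4);
        memcpy(ip + 16, &daddr, 4);
        uint16_t csum = ip_checksum(ip, 20);
        memcpy(ip + 10, &csum, 2);

        /* UDP header */
        uint16_t port = htons(5000), ulen = htons((uint16_t)(8 + plen));
        memcpy(udp,     &port, 2);                        /* source port      */
        memcpy(udp + 2, &port, 2);                        /* destination port */
        memcpy(udp + 4, &ulen, 2);
        memset(udp + 6, 0, 2);                            /* checksum 0 = unused on IPv4 */

        memcpy(udp + 8, payload, plen);
        return 14 + 20 + 8 + plen;
    }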

Slide 29

Packet spacing
Tiny buffers in hardware
• Not specified; in practice ~100 µs, about 15 KB.
• Measure packet arrivals to nanoseconds, understand NIC distribution and behaviour.
• Careful NIC programming, measurement and selection.
(Histogram: packet delta (ns))
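A minimal sketch of measuring packet inter-arrival deltas with kernel receive timestamps (SO_TIMESTAMPNS on an ordinary UDP socket; a kernel-bypass receiver would instead timestamp inside its poll loop or use NIC hardware timestamps):

    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <time.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        int on = 1;
        setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPNS, &on, sizeof on);

        struct sockaddr_in addr = { .sin_family = AF_INET,
                                    .sin_port = htons(5000),
                                    .sin_addr.s_addr = htonl(INADDR_ANY) };
        bind(fd, (struct sockaddr *)&addr, sizeof addr);

        struct timespec prev = { 0 };

        for (;;) {
            char data[2048], ctrl[512];
            struct iovec iov = { .iov_base = data, .iov_len = sizeof data };
            struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                                  .msg_control = ctrl, .msg_controllen = sizeof ctrl };

            if (recvmsg(fd, &msg, 0) < 0)
                break;

            /* Pull the kernel receive timestamp out of the control messages. */
            for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
                if (c->cmsg_level != SOL_SOCKET || c->cmsg_type != SO_TIMESTAMPNS)
                    continue;

                struct timespec ts;
                memcpy(&ts, CMSG_DATA(c), sizeof ts);
                if (prev.tv_sec || prev.tv_nsec)
                    printf("packet delta: %ld ns\n",
                           (long)((ts.tv_sec - prev.tv_sec) * 1000000000L +
                                  (ts.tv_nsec - prev.tv_nsec)));
                prev = ts;
            }
        }
        return 0;
    }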

Slide 30

Sudden outbreak of common sense?
Recent proposals to use RFC4175
• Software-centric, no pixels spanning packets.
• No more transmitting legacy blanking regions.
• No more CRC!

Slide 31

Nope…
Most recent timing model:
• A 4-packet buffer is specified!
• Packet gaps for blanking (yes, they remove the blanking intervals and then effectively put them back!)
• Tight for software implementations.

Slide 32

The code
v210-specific SIMD merged into FFmpeg:
http://obe.tv/about-us/obe-blog/item/21-faster-professional-10-bit-video-conversions
Broadcast-specific stuff being merged into Upipe:
http://github.com/cmassiot/upipe and http://github.com/kierank/upipe

Slide 33

Conclusion
• The broadcast industry is going through a production transformation as big as the move from analogue to digital (independent of any delivery changes).
• It is possible to use standard servers to deliver precisely timed, uncompressed video signals.
• Changes in the way live content is produced will allow for new viewer experiences!

Slide 34

Thanks
• Colleagues Rafaël Carré, James Darnley and (formerly) Rostislav Pehlivanov.
• The Netmap team for discussions.
• The BBC R&D IP Studio team for discussions and for laying the groundwork in this field, especially Jonathan Rosser and Stuart Grace.