Slide 1

Slide 1 text

Catching Up on Trends in Audio App Development
@[email protected]
The slides for this session are also available at https://speakerdeck.com/atsushieno

Slide 2

Slide 2 text

What's this? Audio development is fun and hard...! We rarely see this kind of technical session talk, so I'm giving it a try!

Slide 3

Slide 3 text

What can be "audio" topics?
Not limited to these, and I don't follow everything...!
● Music creation tools (DAWs, synths, effects...)
● Technology (MIDI, 3D audio, musicology and analysis, voice synthesis...)
● DSP, algorithms, mathematics and tools (FFT, FIR/IIR, MATLAB...)
● Programming tips for audio (C++ standards, realtime safety...)
● Audio plugin development (VST / AudioUnit / LV2 / CLAP, ARA, JUCE, ...)
with "on the Web", "using AI", "in Rust" ...

Slide 4

Slide 4 text

What are the hot topics in recent years?
Not limited to these, but I cannot cover everything, especially in a 30-minute talk...!
● realtime processing in C++ (we've been discussing it forever!)
● AI utilization
● WebView in audio apps (including plugins)
● Changes in plugin development trends: JUCE, CLAP, choc, ...
● (moderate) MIDI2 adoption

Slide 5

Slide 5 text

Audio and C++: RT safety
C++ is still dominant, and audio is driving part of C++ innovation.
Why C++?
● It is one of the very few realtime-safe languages
● There are plenty of existing quality libraries (JUCE, choc, etc.)

Slide 6

Slide 6 text

Audio and C++: RT safety
RT safety is hard to achieve, because you have to follow these principles:
● no allocations (and no garbage collection)
● no locks (even std::try_lock is a no-go)
● no time-unbounded operations (such as system calls, JIT compilation, ...)
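A minimal sketch of what these rules can look like in a processing callback. `GainProcessor` and its method names are hypothetical (they come from no plugin API); the point is only that memory is allocated up front in `prepare()`, and `process()` touches nothing but preallocated memory and one atomic.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical processor, for illustration only.
class GainProcessor {
    // Assumption: float atomics are lock-free on the target platform.
    static_assert(std::atomic<float>::is_always_lock_free,
                  "a locking atomic would break the no-locks rule");
public:
    // Called on a non-realtime thread before streaming starts:
    // allocating here is fine.
    void prepare(std::size_t maxFrames) { scratch_.assign(maxFrames, 0.0f); }

    // Called from the UI (or any non-realtime) thread; no lock is taken.
    void setGain(float gain) noexcept {
        gain_.store(gain, std::memory_order_relaxed);
    }

    // Called on the audio thread: no allocation, no locks, no system calls.
    // Rejects an oversized block instead of growing the buffer.
    bool process(const float* in, float* out, std::size_t frames) noexcept {
        assert(in != nullptr && out != nullptr);
        if (frames > scratch_.size()) return false;
        const float gain = gain_.load(std::memory_order_relaxed);
        for (std::size_t i = 0; i < frames; ++i) scratch_[i] = in[i] * gain;
        for (std::size_t i = 0; i < frames; ++i) out[i] = scratch_[i];
        return true;
    }

private:
    std::vector<float> scratch_;     // sized once in prepare()
    std::atomic<float> gain_{1.0f};  // shared with non-realtime threads
};
```

The scratch buffer is not needed for a plain gain; it stands in for any working memory a real DSP routine would otherwise be tempted to allocate per block.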

Slide 7

Slide 7 text

Audio and C++: audio threads
We consolidate DSP and I/O work into the audio thread, which runs without any of the above (allocations, locks, time-unbounded operations). Then we need concurrency between the audio thread and other threads.
[Diagram: DAW Code and Plugin Code. Labels: audio thread, audio device I/O, file I/O worker, network worker, UI thread, plugin DSP, plugin processing, Plugin UI, DAW UI]

Slide 8

Slide 8 text

Realtime safety: following the rules
● Real-time audio programming 101: time waits for nothing
● [ADC19] Real-time 101 - part I: Investigating the real-time problem space (and part II)
Atomics and lock-free FIFOs
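As a rough illustration of "atomics and lock-free FIFOs", here is a minimal single-producer / single-consumer ring buffer. It is a sketch, not code from the referenced talks; the name `SpscFifo` is hypothetical, and it assumes exactly one thread calls `push()` and exactly one other thread calls `pop()`.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical single-producer / single-consumer FIFO.
// One slot is always left empty, so it holds at most Capacity - 1 items.
template <typename T, std::size_t Capacity>
class SpscFifo {
    static_assert(Capacity >= 2, "one slot is kept empty to tell full from empty");
    static_assert(std::is_trivially_copyable_v<T>,
                  "copying T must not allocate or throw");
public:
    // Producer side (e.g. the UI thread). Returns false when full.
    bool push(const T& value) noexcept {
        const std::size_t w = write_.load(std::memory_order_relaxed);
        const std::size_t next = (w + 1) % Capacity;
        if (next == read_.load(std::memory_order_acquire)) return false;
        slots_[w] = value;
        // release: the slot write becomes visible before the new index.
        write_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer side (e.g. the audio thread). Returns false when empty.
    bool pop(T& out) noexcept {
        const std::size_t r = read_.load(std::memory_order_relaxed);
        if (r == write_.load(std::memory_order_acquire)) return false;
        out = slots_[r];
        read_.store((r + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> write_{0};  // modified by the producer only
    std::atomic<std::size_t> read_{0};   // modified by the consumer only
};
```

Neither side ever waits for the other: a full or empty queue is reported to the caller, which decides what to do (drop the message, retry on the next block, ...).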

Slide 9

Slide 9 text

Realtime safety: challenging the rules
● What if we could allocate memory in RT threads?
  ➡ std::pmr efforts [ADC22]
● Concurrency needs atomics, but they are not available for complex objects. But what if we could use std::shared_ptr in atomics?
  ➡ atomic shared pointers [CPPCON23]
Related: deferred reclamation, hazard pointer (C++26?)
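To make the std::pmr idea concrete, here is a small sketch using the standard `std::pmr::monotonic_buffer_resource`: a container draws its storage from a preallocated arena instead of the global heap. The function name `sumOfSquares` and the arena-passing convention are assumptions for illustration and are not taken from the [ADC22] talk.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Hypothetical helper: builds a temporary vector whose storage comes from
// a caller-provided, preallocated arena rather than from the heap.
inline float sumOfSquares(const float* in, std::size_t n,
                          std::byte* arena, std::size_t arenaSize) {
    // null_memory_resource() as upstream: if the arena is too small, the
    // allocation throws std::bad_alloc instead of silently using the heap.
    // (Throwing is not realtime-safe either; size the arena generously.)
    std::pmr::monotonic_buffer_resource pool(
        arena, arenaSize, std::pmr::null_memory_resource());

    std::pmr::vector<float> squares(&pool);
    squares.reserve(n);  // a pointer bump inside the arena
    for (std::size_t i = 0; i < n; ++i) squares.push_back(in[i] * in[i]);

    float sum = 0.0f;
    for (float v : squares) sum += v;
    return sum;
    // The pool is destroyed here; the arena can be reused on the next call.
}
```

The arena itself would be allocated once, outside the audio thread, and handed to the callback on every block.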

Slide 10

Slide 10 text

AI utilization: usage scenarios
● text description to song (suno, udio, musicfx)
● complementary trackmaking (various DAWs)
● voice/vocal synthesis (voicevox, NEUTRINO, Vocaloid AI, Synthesizer V)
● audio source separation (spleeter, demucs, many others, often vocal only; see also: mvsep.com)
● pitch detection, sequence extraction (basic-pitch)
● mastering (LANDR, iZotope)
● applying them in audio plugins
Not just to create something *instead* of you, but rather to *assist* your creative work.

Slide 11

Slide 11 text

AI model processing runtimes
Utilizing them in audio apps: we need the AI model runtime(s) embedded, in C++
● TensorFlow, TFLite (also Magenta, DDSP)
● ONNX Runtime
onnx: we often need to compile models to .ort and load them without compilation at runtime.
ort-builder is often used to ease this build process (if it works / on some platforms)

Slide 12

Slide 12 text

Realtime AI model processing
Do we need RT safety? ➡ It depends on the task
"Real-Time Inference of Neural Networks: A Guide for DSP Engineers"
RTNeural: useful when we need RT-safe AI inference
"Building Neural Audio Plugins with RTNeural"
plugin use cases: BYOD, GuitarML, NeuralNote, gRainbow

Slide 13

Slide 13 text

WebView as (plugin) UI
Can you name any C++ GUI framework that...
● works everywhere
● supports hot reloading (like Flutter, React Native, or even Titanium Mobile from 10 years ago...)
● supports CJK input and rendering
  ○ including Linux
  ○ including emoji and IVSes
● supports accessibility
● supports basic widgets, or even basic layout engines
Web UI is a (super) popular option for us.
(note: not about running plugins ON the Web)

Slide 14

Slide 14 text

WebView as (plugin) UI: Challenges
● data control flow between app/audio thread and WebView
  ○ in a performant manner - particularly for audio buffers
● Bundling the WebView2 DLL on Windows (especially in Evergreen mode)
● Build complexity: you have to incorporate JS build tasks into CMake etc.
● potentially controversial community reaction
[ADC23] Build a High Performance Audio App With a Web GUI & C++ Audio Engine (Output inc.)
Challengers: DPF-webui, Cmajor, JUCE8

Slide 15

Slide 15 text

WebView as plugin UI PoC demo (atsushieno/jeq8) based on teropa/weq8

Slide 16

Slide 16 text

Audio Plugin formats in 2024
What are the recent plugin formats?
● VST2 (Steinberg) was killed by Steinberg itself in 2018
● VST3 (Steinberg) is dominant on Windows
● AUv2 (Apple) is primary on macOS (VST3 needs to be wrapped anyway)
● AUv3 (Apple) rules on iOS (not as performant as v2 on macOS)
● LV2 has expanded beyond Linux (JUCE 7.0+, Reaper 6.24+)
● CLAP (Bitwig) emerged in 2022
  ○ ISC license, *proper* MIDI 1.0/2.0 support
  ○ the brightest hope so far

Slide 17

Slide 17 text

Releasing audio plugins in multiple formats
Releasing audio plugins in multiple formats will remain mandatory because
● Cubase (by VST3 developer) will never support CLAP
● Logic Pro (by AudioUnit developer) will never support VST3
● Bitwig Studio (by CLAP developer) will never support LV2
Supporting multiple formats in the traditional approach: use JUCE, DPF, iPlug2, ...

Slide 18

Slide 18 text

CLAP first?
WHAT IF we only build a CLAP plugin, and offer it in other formats via wrappers?
-> free-audio/clap-wrapper
e.g. SoundStacks/cmajor added direct CLAP support (without JUCE)
no JUCE = more flexibility, liberal license (until you use vst3sdk...)
There are still useful libraries like Tracktion/choc, so it's getting realistic
Problem: CLAP does not work on mobile platforms (JUCE is better), no real-world adoption yet

Slide 19

Slide 19 text

MIDI 2.0 adoption: platform API
MIDI 2.0 had a big revamp in 2023, and many platforms support *that* version.
2023 version: MIDI-CI protocol negotiation is dead
ALSA 1.2.10 with Linux Kernel 6.5, Android 13 (USB) / 15 (all)
Everyone is waiting for microsoft/MIDI (Windows MIDI Services), to be released "in 2024"
celtera/libremidi doesn't wait (already supports it)
cf. Web MIDI 2.0?
Apple, Google and Microsoft Implementations of MIDI 2.0

Slide 20

Slide 20 text

MIDI 2.0 UMPs in Audio Plugin API
We should be able to consume UMPs (MIDI2 messages) in the `process` function
🙂 AudioUnit V2 and V3 support the MIDI 2.0 protocol.
🙂 CLAP has dedicated MIDI 2.0 events in its event channel.
😑 The VST3 team claims it supports MIDI 2.0 by mapping UMPs to VST3 events, but their MIDI support is not well received (even for MIDI 1.0).
😑 LV2 can carry whatever we want in lv2:atom under our own URI, but we'd need a "standard"
🤔 JUCE started considering it for v9.
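For a sense of what "consuming UMPs" involves once a plugin API hands over raw 32-bit words, here is a small decoding sketch. The names `umpWordCount`, `NoteOn` and `decodeNoteOn` are hypothetical; how each plugin format actually delivers UMPs to `process` differs per API and is not shown. The bit layout follows the UMP format: the message type sits in the top 4 bits of the first word and determines the packet size.

```cpp
#include <cassert>
#include <cstdint>

// Number of 32-bit words in a UMP, derived from its message type
// (top 4 bits of the first word).
inline int umpWordCount(std::uint32_t word0) noexcept {
    switch (word0 >> 28) {
        case 0x0: case 0x1: case 0x2: case 0x6: case 0x7:
            return 1;  // utility, system, MIDI 1.0 channel voice, reserved
        case 0x3: case 0x4: case 0x8: case 0x9: case 0xA:
            return 2;  // 64-bit data, MIDI 2.0 channel voice, reserved
        case 0xB: case 0xC:
            return 3;  // reserved
        default:
            return 4;  // 0x5, 0xD, 0xE, 0xF: 128-bit messages
    }
}

// Hypothetical plain struct for a decoded MIDI 2.0 Note On.
struct NoteOn {
    std::uint8_t group;
    std::uint8_t channel;
    std::uint8_t note;
    std::uint16_t velocity;  // 16-bit velocity in MIDI 2.0
};

// Decodes a MIDI 2.0 channel voice Note On (message type 0x4, status 0x9).
// Returns false for any other message. No allocation, so it is usable
// from an audio thread.
inline bool decodeNoteOn(const std::uint32_t* ump, NoteOn& out) noexcept {
    if ((ump[0] >> 28) != 0x4) return false;
    if (((ump[0] >> 20) & 0xF) != 0x9) return false;
    out.group    = static_cast<std::uint8_t>((ump[0] >> 24) & 0xF);
    out.channel  = static_cast<std::uint8_t>((ump[0] >> 16) & 0xF);
    out.note     = static_cast<std::uint8_t>((ump[0] >> 8) & 0x7F);
    out.velocity = static_cast<std::uint16_t>(ump[1] >> 16);
    return true;
}
```

A host or plugin would walk its event buffer by calling `umpWordCount` on each packet's first word to find where the next packet starts.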

Slide 21

Slide 21 text

MIDI 2.0 adoption: MIDI-CI
MIDI-CI: not commonly used yet
Some concrete profile and property specifications appeared in 2023 and 2024, and they are of some interest.
They could replace plugin API functionality, for example:
● Property Exchange: Get and Set Device State (state API)
● Property Exchange: Controller resource (parameters API)
● Property Exchange: ProgramList resource (presets API / MIDNAM)
Is adopting MIDI 2.0 good here? ➡ Maybe? Just as YAMAHA DX7 Carts (presets) are still used by various DX7 emulators

Slide 22

Slide 22 text

That's all so far! There are many technical fields - are you interested in any of them? Let's discuss :-)