[COSCUP2024] Catching up Trends in Audio App Development

Atsushi Eno

August 02, 2024

Transcript

  1. Catching up Trends in Audio App Development @[email protected] The slides

    for this session are also available at https://speakerdeck.com/atsushieno
  2. What's this? Audio development is fun and hard...! We rarely

    see this kind of technical session, so I'm giving it a try! The slides for this session are also available at https://speakerdeck.com/atsushieno
  3. What can be "audio" topics? Not limited to these, but

    I don't observe everything...! • Music creative tools (DAWs, synths, effectors...) • Technology (MIDI, 3D audio, musicology and analysis, voice synthesis...) • DSP, algorithms, mathematics and tools (FFT, FIR/IIR, MATLAB...) • Programming tips for audio (C++ standards, realtime safety...) • Audio plugin development (VST / AudioUnit / LV2 / CLAP, ARA, JUCE, ...) with "on the Web", "using AI", "in Rust" ...
  4. What are the hot topics in recent years? Not

    limited to these, but I cannot cover everything, especially in a 30-minute talk...! • realtime processing in C++ (we've been discussing it forever!) • AI utilization • WebView in audio apps (including plugins) • Changes in plugin development trends: JUCE, CLAP, choc, ... • (moderate) MIDI2 adoption
  5. Audio and C++: RT safety C++ is still dominant and

    audio is driving part of the innovation in C++. Why C++? • It is one of the very few realtime-safe languages • There are plenty of existing quality libraries (JUCE, choc, etc.)
  6. Audio and C++: RT safety RT safety is hard to

    achieve, because you have to follow these principles (see the sketch below): • no allocations (so no garbage collection) • no locks (even std::try_lock is a no-go) • no time-unbounded operations (such as system calls, JIT compilation, ...)
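
A minimal sketch of these principles, assuming a trivial gain processor: all allocation happens in a non-realtime prepare step, the UI thread only touches an atomic parameter, and the audio callback reads it without locking.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

class GainProcessor {
public:
    // Non-realtime thread, before audio starts: do all allocation here.
    void prepare(size_t maxBlockSize) { scratch.resize(maxBlockSize); }

    // UI / message thread: lock-free parameter update.
    void setGain(float g) { gain.store(g, std::memory_order_relaxed); }

    // Audio thread: no new/delete, no locks, no system calls.
    void process(float* buffer, size_t numFrames) {
        const float g = gain.load(std::memory_order_relaxed);
        for (size_t i = 0; i < numFrames; ++i)
            buffer[i] *= g;
    }

private:
    std::atomic<float> gain { 1.0f };
    std::vector<float> scratch; // preallocated work buffer; never resized on the audio thread
};
```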
  7. Audio and C++: audio threads We dedicate an audio thread to

    perform DSP and I/O work without the operations above, and then we need concurrency between the audio thread and the other threads. [Diagram: across DAW code and plugin code, the audio thread (audio device I/O, plugin DSP, plugin processing) runs alongside a file I/O worker, a network worker, and the UI thread (DAW UI, Plugin UI)]
  8. Realtime safety: following the rules • Real-time audio programming 101:

    time waits for nothing • [ADC19] Real-time 101 - part I: Investigating the real-time problem space (and part II) • Atomics and lock-free FIFOs (see the sketch below)
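
A sketch of the classic lock-free building block: a single-producer/single-consumer FIFO with a fixed capacity, so one writer thread (e.g. the UI) and one reader thread (the audio thread) can exchange items without locks or allocation after construction.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

template <typename T>
class SpscFifo {
public:
    explicit SpscFifo(size_t capacity) : buffer(capacity + 1) {}

    bool push(const T& item) {            // producer thread only
        const size_t w = writePos.load(std::memory_order_relaxed);
        const size_t next = (w + 1) % buffer.size();
        if (next == readPos.load(std::memory_order_acquire))
            return false;                 // full: drop or retry later, never block
        buffer[w] = item;
        writePos.store(next, std::memory_order_release);
        return true;
    }

    bool pop(T& item) {                   // consumer (audio) thread only
        const size_t r = readPos.load(std::memory_order_relaxed);
        if (r == writePos.load(std::memory_order_acquire))
            return false;                 // empty
        item = buffer[r];
        readPos.store((r + 1) % buffer.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<T> buffer;
    std::atomic<size_t> writePos { 0 };
    std::atomic<size_t> readPos { 0 };
};
```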
  9. Realtime safety: challenging the rules • What if we can

    allocate memory in RT threads? ➡ std::pmr efforts [ADC22] • Concurrency needs atomics, but atomics are not available for complex objects. What if we could use std::shared_ptr in atomics? ➡ atomic shared pointers [CPPCON23] (see the sketch below) Related: deferred reclamation, hazard pointers (C++26?)
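
A sketch of the atomic shared pointer pattern using C++20 `std::atomic<std::shared_ptr<T>>`: the UI thread publishes a new immutable state object, the audio thread takes a snapshot. Note that standard-library implementations are not guaranteed to be lock-free, and the last reference may be released on the audio thread; that is exactly the reclamation problem the talks above and deferred reclamation / hazard pointers address.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>

struct SynthState {
    float cutoff;
    float resonance;
};

std::atomic<std::shared_ptr<const SynthState>> currentState {
    std::make_shared<SynthState>(SynthState { 1000.0f, 0.5f })
};

// UI / message thread: build a new state object, then publish it atomically.
void updateCutoff(float cutoff) {
    auto prev = currentState.load();
    auto next = std::make_shared<SynthState>(SynthState { cutoff, prev->resonance });
    currentState.store(next);
}

// Audio thread: take a snapshot for this block; the old state is freed by
// whichever thread happens to drop the last reference.
void processBlock(float* /*buffer*/, size_t /*numFrames*/) {
    std::shared_ptr<const SynthState> state = currentState.load();
    // ... use state->cutoff and state->resonance ...
}
```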
  10. AI utilization: usage scenarios • text description to song (suno,

    udio, musicfx) • complementary trackmaking (various DAWs) • voice/vocal synthesis (voicevox, NEUTRINO, Vocaloid AI, Synthesizer V) • audio source separation (spleeter, demucs, many others, often vocal only; see also: mvsep.com) • pitch detection, sequence extraction (basic-pitch) • mastering (LANDR, iZotope) • applying them in audio plugins: not to create anything *instead of* you, but rather to *assist* your creative work.
  11. AI model processing runtimes Utilizing them in audio apps: we

    need the AI model runtime(s) embedded, in C++ • TensorFlow, TFLite (plus Magenta, DDSP) • ONNX Runtime For ONNX we often need to convert models to the .ort format ahead of time and load them without compilation at runtime; ort-builder is often used to ease this build process (if it works / on some platforms). See the sketch below.
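
A rough sketch of embedding ONNX Runtime via its C++ API; the model path, tensor shape, and input/output names are placeholders for whatever your exported .ort model actually uses.

```cpp
#include <onnxruntime_cxx_api.h>
#include <array>
#include <cstdint>
#include <vector>

int main() {
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "audio-app");
    Ort::SessionOptions options;
    options.SetIntraOpNumThreads(1); // keep threading under our own control

    // Load a model converted to the .ort format (note: the path is wide-char on Windows).
    Ort::Session session(env, "model.ort", options);

    // Placeholder input: a 1 x 1024 mono audio frame.
    std::vector<float> input(1024, 0.0f);
    std::array<std::int64_t, 2> shape { 1, 1024 };
    auto memoryInfo = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value inputTensor = Ort::Value::CreateTensor<float>(
        memoryInfo, input.data(), input.size(), shape.data(), shape.size());

    // Placeholder tensor names; use the names your model exports.
    const char* inputNames[] = { "input" };
    const char* outputNames[] = { "output" };
    auto outputs = session.Run(Ort::RunOptions{ nullptr }, inputNames,
                               &inputTensor, 1, outputNames, 1);
    // outputs[0] now holds the inference result as an Ort::Value.
    return 0;
}
```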
  12. Realtime AI model processing Do we need RT safety? ➡

    It depends on the task. "Real-Time Inference of Neural Networks: A Guide for DSP Engineers" RTNeural: useful when we need RT-safe AI inference. "Building Neural Audio Plugins with RTNeural" Plugin use cases: BYOD, GuitarML, NeuralNote, gRainbow
  13. WebView as (plugin) UI Can you name any C++ GUI

    framework that... • works everywhere • supports hot reloading (like Flutter, React Native, or even Titanium Mobile from 10 years ago...) • supports CJK input and rendering ◦ including on Linux ◦ including Emoji and IVSes (Ideographic Variation Sequences) • supports accessibility • supports basic widgets, or even a basic layout engine Web UI is a (super) popular option for us. (note: this is not about running plugins ON the Web)
  14. WebView as (plugin) UI: Challenges • data control flow between

    app/audio thread and the WebView ◦ in a performant manner - particularly for audio buffers (see the sketch below) • Bundling the WebView2 DLL on Windows (especially in Evergreen mode) • Build complication: you have to incorporate JS build tasks into CMake etc. • potentially controversial community reaction [ADC23] Build a High Performance Audio App With a Web GUI & C++ Audio Engine (Output inc.) Challengers: DPF-webui, Cmajor, JUCE8
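
A library-agnostic sketch of that data-flow concern: the audio thread never calls into the WebView; it only writes atomics, and a UI-thread timer forwards the values to the page as JavaScript. `WebViewHandle::evaluateJavascript` and `window.onMeter` are hypothetical stand-ins for whatever your wrapper (WebView2, WKWebView, choc::ui::WebView, ...) and your JS side provide.

```cpp
#include <atomic>
#include <cstdio>
#include <string>

// Hypothetical wrapper around the embedded WebView; only JS evaluation is assumed.
struct WebViewHandle {
    void evaluateJavascript(const std::string& js) { (void) js; /* forward to the real WebView */ }
};

// Latest meter values: written by the audio thread, read by the UI thread.
std::atomic<float> peakLeft { 0.0f };
std::atomic<float> peakRight { 0.0f };

// Audio thread: just store numbers; never touch the WebView here.
void onAudioBlockProcessed(float l, float r) {
    peakLeft.store(l, std::memory_order_relaxed);
    peakRight.store(r, std::memory_order_relaxed);
}

// UI thread, e.g. on a 30 Hz timer: read, format, and push to the page,
// where the JS side defines window.onMeter(left, right).
void onUiTimer(WebViewHandle& webView) {
    char js[128];
    std::snprintf(js, sizeof(js), "window.onMeter && window.onMeter(%f, %f);",
                  peakLeft.load(std::memory_order_relaxed),
                  peakRight.load(std::memory_order_relaxed));
    webView.evaluateJavascript(js);
}
```

Bulk data such as whole audio buffers needs more care than this (shared memory, or a FIFO plus batching), rather than per-value JavaScript calls.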
  15. Audio Plugin formats in 2024 What are the recent plugin

    formats? • VST2 (Steinberg) was killed by Steinberg itself in 2018 • VST3 (Steinberg) is dominant on Windows • AUv2 (Apple) is primary on macOS (VST3 needs to be wrapped anyway) • AUv3 (Apple) rules on iOS (not as performant as v2 on macOS) • LV2 has expanded beyond Linux (JUCE 7.0+, Reaper 6.24+) • CLAP (Bitwig) emerged in 2022 ◦ ISC license, *proper* MIDI 1.0/2.0 support ◦ the brightest hope so far
  16. Releasing audio plugins in multiple formats Releasing audio plugins in

    multiple formats will remain mandatory because • Cubase (by the VST3 developer) will never support CLAP • Logic Pro (by the AudioUnit developer) will never support VST3 • Bitwig Studio (by the CLAP developer) will never support LV2 The traditional approach to supporting multiple formats: use JUCE, DPF, iPlug2, ...
  17. CLAP first? WHAT IF we only build a CLAP plugin, and

    offer it in other formats via wrappers? ➡ free-audio/clap-wrapper e.g. SoundStacks/cmajor added direct CLAP support (without JUCE). No JUCE = more flexibility and a liberal license (until you use vst3sdk...). There are still useful libraries like Tracktion/choc, so this is getting realistic (see the skeleton below). Problem: CLAP does not work on mobile (JUCE is better there), and there is no real-world adoption yet.
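
For reference, a bare-bones sketch of the CLAP entry point: the C ABI surface is tiny, and this exported symbol is what a host (or free-audio/clap-wrapper) looks up. A real plugin would of course return a descriptor and a clap_plugin_t from the factory instead of the empty stubs here.

```cpp
#include <clap/clap.h>
#include <cstring>

// Factory stubs: a real plugin returns its descriptor(s) and instances here.
static uint32_t get_plugin_count(const clap_plugin_factory_t*) { return 0; }
static const clap_plugin_descriptor_t* get_plugin_descriptor(const clap_plugin_factory_t*, uint32_t) { return nullptr; }
static const clap_plugin_t* create_plugin(const clap_plugin_factory_t*, const clap_host_t*, const char*) { return nullptr; }

static const clap_plugin_factory_t plugin_factory = {
    get_plugin_count, get_plugin_descriptor, create_plugin,
};

static bool entry_init(const char* /*plugin_path*/) { return true; }
static void entry_deinit() {}
static const void* entry_get_factory(const char* factory_id) {
    return std::strcmp(factory_id, CLAP_PLUGIN_FACTORY_ID) == 0 ? &plugin_factory : nullptr;
}

// The single exported symbol a CLAP host (or clap-wrapper) resolves.
extern "C" CLAP_EXPORT const clap_plugin_entry_t clap_entry = {
    CLAP_VERSION_INIT, entry_init, entry_deinit, entry_get_factory,
};
```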
  18. MIDI 2.0 adoption: platform API MIDI 2.0 had a big revamp

    in 2023, and many platforms support *that* version. In the 2023 version, MIDI-CI protocol negotiation is dead. ALSA 1.2.10 with Linux Kernel 6.5; Android 13 (USB) / 15 (all). Everyone is waiting for microsoft/MIDI (Windows MIDI Services), to be released "in 2024". celtera/libremidi doesn't wait (it already supports it). cf. Web MIDI 2.0? Apple, Google and Microsoft Implementations of MIDI 2.0
  19. MIDI 2.0 UMPs in Audio Plugin API We should be

    able to consume UMPs (MIDI 2.0 messages) in the `process` function (see the sketch below). 🙂 AudioUnit V2 and V3 have support for the MIDI 2.0 protocol. 🙂 CLAP has dedicated MIDI 2.0 events in its event channel. 😑 The VST3 team claims it supports MIDI 2.0 by mapping UMPs to VST3 events, but their MIDI support is not very well received (even for MIDI 1.0). 😑 LV2 can carry whatever we want in an lv2:atom with our own URI, but we'd need a "standard". 🤔 JUCE started considering it for v9.
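
To make that concrete, a sketch (independent of any particular plugin API) of what a MIDI 2.0 UMP looks like in a process callback: a channel-voice note-on is a single 64-bit packet (message type 4) carrying 16-bit velocity instead of MIDI 1.0's 7 bits.

```cpp
#include <cstdint>

// A MIDI 2.0 channel-voice UMP is two 32-bit words.
struct Ump64 { uint32_t word0, word1; };

Ump64 makeNoteOn(uint8_t group, uint8_t channel, uint8_t note,
                 uint16_t velocity16, uint8_t attributeType = 0,
                 uint16_t attributeData = 0) {
    Ump64 ump;
    ump.word0 = (0x4u << 28)                       // message type 4: MIDI 2.0 channel voice
              | (uint32_t)(group & 0x0F) << 24
              | (0x9u << 20)                       // status: note-on
              | (uint32_t)(channel & 0x0F) << 16
              | (uint32_t)note << 8
              | attributeType;
    ump.word1 = (uint32_t)velocity16 << 16 | attributeData;
    return ump;
}

// Inside process(), a UMP-aware event queue would hand us packets like this
// (the event-queue API itself differs per plugin format).
void handleUmp(const Ump64& ump) {
    const uint8_t type = ump.word0 >> 28;
    if (type == 0x4 && ((ump.word0 >> 20) & 0xF) == 0x9) {
        const uint8_t note = (ump.word0 >> 8) & 0x7F;
        const uint16_t velocity = ump.word1 >> 16;
        // ... start a voice with 16-bit velocity resolution ...
        (void) note; (void) velocity;
    }
}
```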
  20. MIDI 2.0 adoption: MIDI-CI MIDI-CI: not commonly used yet. Some

    concrete profile and property specifications appeared in 2023 and 2024, and they are of certain interest. They could replace plugin API functionality, for example: • Property Exchange: Get and Set Device State (state API) • Property Exchange: Controller resource (parameters API) • Property Exchange: ProgramList resource (presets API / MIDNAM) Is adopting MIDI 2.0 good here? ➡ Maybe? After all, YAMAHA DX7 carts (presets) are still used by various DX7 emulators.
  21. That's all so far! There are many technical fields -

    did you get interested in any of them? Let's discuss together :-)