Slide 1

Slide 1 text

AI in the Browser Smarter Angular apps with WebGPU and WebNN Christian Liebel @christianliebel Consultant

Slide 2

Slide 2 text

Hello, it’s me.
Christian Liebel
W3C WebML WG & CG TAG Associate
X: @christianliebel
Bluesky: @christianliebel.com
Angular, PWA & Generative AI
Microsoft MVP & Google GDE (Angular, Web)

Slide 3

Slide 3 text

Why should you care?
Rule-based algorithms are limited in their capabilities.

Slide 4

Slide 4 text

Why should you care?
DEMO

Slide 5

Slide 5 text

Why should you care?
Use AI to implement use cases that are difficult or impossible to implement using rule-based algorithms.

Slide 6

Slide 6 text

Schema
Data, Training, Trained Model, Inference/Prediction, Output

Slide 7

Slide 7 text

Generative AI everywhere
Source: https://www.apple.com/chde/apple-intelligence/

Slide 8

Slide 8 text

Generative AI: Overview
– Text: OpenAI GPT, Mistral, …
– Audio/Music: Musico, Soundraw, …
– Images: DALL·E, Firefly, …
– Video: Sora, Runway, …
– Speech: Whisper, tortoise-tts, …

Slide 9

Slide 9 text

Generative AI: Overview
– Text: OpenAI GPT, Mistral, …
– Audio/Music: Musico, Soundraw, …
– Images: DALL·E, Firefly, …
– Video: Sora, Runway, …
– Speech: Whisper, tortoise-tts, …

Slide 10

Slide 10 text

Media
DEMO

Slide 11

Slide 11 text

Generative AI: Overview
– Text: OpenAI GPT, Mistral, …
– Audio/Music: Musico, Soundraw, …
– Images: DALL·E, Firefly, …
– Video: Sora, Runway, …
– Speech: Whisper, tortoise-tts, …

Slide 12

Slide 12 text

ChatGPT: Cloud-backed app

Slide 13

Slide 13 text

Generative AI Cloud Providers: Examples

Slide 14

Slide 14 text

Generative AI Cloud Providers: Drawbacks
– Require a (stable) internet connection
– Subject to network latency and server availability
– Data is transferred to the cloud service
– Require a subscription

Slide 15

Slide 15 text

Can we run GenAI models locally?

Slide 16

Slide 16 text

WebLLM
https://webllm.mlc.ai/
DEMO

Slide 17

Slide 17 text

WebLLM: On NPM
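A minimal sketch of how the npm package might be called from an Angular service. The package name `@mlc-ai/web-llm`, the `CreateMLCEngine` factory and the OpenAI-style `chat.completions.create` call come from the WebLLM project. The `ChatEngine` interface, the `ask` helper and the model ID in the comment are illustrative assumptions, not part of the slides. The engine is passed in as a parameter so the helper can also be tried with a stub.

```typescript
// Minimal shape of the OpenAI-style chat interface exposed by a WebLLM engine.
// Declared locally (assumption for this sketch) so the helper has no package dependency.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatEngine {
  chat: {
    completions: {
      create(request: { messages: ChatMessage[] }): Promise<{
        choices: { message: { content: string | null } }[];
      }>;
    };
  };
}

// Hypothetical helper: sends one system prompt and one user prompt,
// and returns the text of the first choice (or an empty string).
async function ask(
  engine: ChatEngine,
  systemPrompt: string,
  userPrompt: string,
): Promise<string> {
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: userPrompt },
    ],
  });
  return reply.choices[0]?.message.content ?? "";
}

// In the browser, a real engine could be created roughly like this
// (the model ID is an example and may need adjusting):
//   import { CreateMLCEngine } from "@mlc-ai/web-llm";
//   const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC", {
//     initProgressCallback: (report) => console.log(report.text),
//   });
//   const answer = await ask(engine, "You are a helpful assistant.", "Hello!");
```

Creating the engine downloads the model files on first use, so in an app it would typically happen once and be shared rather than repeated per request.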

Slide 18

Slide 18 text

WebAssembly (Wasm)
– Bytecode for the web
– Compile target for arbitrary languages
– Can be faster than JavaScript
– WebLLM uses a model-specific Wasm library to accelerate model computations
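To make "bytecode for the web" concrete, here is a small sketch that loads a hand-assembled Wasm module exporting a single `add` function and calls it from TypeScript. It illustrates Wasm in general and is not WebLLM's model library; the module bytes and the name `add` are chosen only for this example.

```typescript
// Hand-assembled Wasm module exporting add(a: i32, b: i32): i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm" + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body: local.get 0, local.get 1, i32.add
]);

// Synchronous compilation is fine for a module this small;
// larger modules would normally use WebAssembly.instantiateStreaming().
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule);

// Exports are untyped, so the function signature is asserted here.
const add = wasmInstance.exports["add"] as unknown as (a: number, b: number) => number;

console.log(add(2, 3)); // 5
```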

Slide 19

Slide 19 text

WebGPU
– Grants low-level access to the Graphics Processing Unit (GPU)
– Near native performance for machine learning applications
– Supported by Chromium-based browsers on Windows and macOS from version 113
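Because WebGPU is not available everywhere, an app would usually check for it before loading a model. The sketch below relies on `navigator.gpu` and `requestAdapter()` from the WebGPU API; the interfaces and the helper name `supportsWebGpu` are assumptions made so the example compiles without WebGPU typings.

```typescript
// Minimal local typings (assumption): only what this check needs.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

interface NavigatorWithGpu {
  gpu?: GpuLike;
}

// Hypothetical helper: true only if the API is exposed and an adapter is available.
async function supportsWebGpu(nav: NavigatorWithGpu): Promise<boolean> {
  if (!nav.gpu) {
    return false; // API not exposed by this browser
  }
  // requestAdapter() resolves to null when no suitable adapter is found.
  const adapter = await nav.gpu.requestAdapter();
  return adapter !== null;
}

// In a browser, pass the global `navigator`
// (cast as needed if WebGPU typings are not installed).
```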

Slide 20

Slide 20 text

WebNN
– Grants web apps access to the device’s CPU, GPU and Neural Processing Unit (NPU)
– In specification by the WebML Working Group at W3C
– Implementation in progress in Chromium (behind a flag)
– Better performance for specific workloads
Source: https://webmachinelearning.github.io/webnn-intro/
DEMO
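Since WebNN is still behind a flag, a cautious app would probe for it and fall back otherwise. This sketch assumes only the `navigator.ml.createContext()` entry point from the WebNN specification; the local interfaces and the helper name `tryCreateWebNnContext` are hypothetical, and context options are left out because they are still changing in the specification.

```typescript
// Minimal local typings (assumption): only the entry point used below.
interface MlLike {
  createContext(): Promise<object>;
}

interface NavigatorWithMl {
  ml?: MlLike;
}

// Hypothetical helper: returns a WebNN context, or null if WebNN is
// not exposed or context creation is rejected.
async function tryCreateWebNnContext(nav: NavigatorWithMl): Promise<object | null> {
  if (!nav.ml) {
    return null; // WebNN not exposed (for example, flag disabled)
  }
  try {
    return await nav.ml.createContext();
  } catch {
    return null; // creation failed, fall back to another backend
  }
}
```

A caller could try WebNN first and fall back to WebGPU or Wasm when the result is `null`.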

Slide 21

Slide 21 text

WebNN
Source: https://github.com/webmachinelearning/webnn/issues/375#issuecomment-2720701672

Slide 22

Slide 22 text

WebLLM: Storing model files locally
Diagram labels: Internet, Website HTML/JS, Cache with model files, Hugging Face
Note: Due to the Same-Origin Policy, models cannot be shared across origins.
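The caching idea can be sketched as a cache-first download. WebLLM's actual caching logic is not shown in the slides, so this only illustrates the general pattern; the interfaces mirror the browser's Cache API (`match`, `put`) and `fetch`, while the helper name `fetchModelFile` and the cache name in the usage comment are assumptions.

```typescript
// Minimal local typings (assumption), structurally compatible with
// the browser's Response and Cache objects.
interface ResponseLike<R> {
  ok: boolean;
  clone(): R;
}

interface CacheLike<R> {
  match(url: string): Promise<R | undefined>;
  put(url: string, response: R): Promise<void>;
}

// Hypothetical helper: serve a model file from the cache if present,
// otherwise download it once and store a copy for later visits.
async function fetchModelFile<R extends ResponseLike<R>>(
  url: string,
  cache: CacheLike<R>,
  fetchFn: (url: string) => Promise<R>,
): Promise<R> {
  const cached = await cache.match(url);
  if (cached) {
    return cached; // no network access needed
  }
  const response = await fetchFn(url);
  if (response.ok) {
    // A response body can only be read once, so a clone goes into the cache.
    await cache.put(url, response.clone());
  }
  return response;
}

// Browser usage (sketch):
//   const cache = await caches.open("model-files");
//   const response = await fetchModelFile(modelUrl, cache, fetch);
```

The cache belongs to the origin that created it, which is why a second website has to download the same model again.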

Slide 23

Slide 23 text

WebLLM: Model Size Comparison

Model:Parameters    Size
phi3:3b             2.2 GB
mistral:7b          4.1 GB
llama3:8b           4.7 GB
gemma2:9b           5.4 GB
gemma2:27b          16 GB
llama3:70b          40 GB
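As a rough rule of thumb, the weight data of a model takes about parameters × bits per weight ÷ 8 bytes. The slide does not state which quantization the listed sizes use; the 4-bit figure below is an assumption, and real files are somewhat larger because of metadata and layers kept at higher precision. The helper name `estimateModelSizeGB` is made up for this sketch.

```typescript
// Hypothetical helper: approximate weight size in GB (1 GB = 1e9 bytes).
function estimateModelSizeGB(parameters: number, bitsPerWeight: number): number {
  const bytes = (parameters * bitsPerWeight) / 8;
  return bytes / 1e9;
}

// A 7-billion-parameter model at an assumed 4 bits per weight:
console.log(estimateModelSizeGB(7e9, 4)); // 3.5

// The same model at 16 bits per weight:
console.log(estimateModelSizeGB(7e9, 16)); // 14
```

Under that assumption the estimate lands in the same range as the 4.1 GB listed for mistral:7b, and it shows why unquantized models quickly become impractical to download into a browser.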

Slide 24

Slide 24 text

WebLLM: Drawbacks
– Models can’t be shared across origins
– Inference is fast, but doesn’t reach full native speed

Slide 25

Slide 25 text

Can we improve on-device inference?

Slide 26

Slide 26 text

Built-in AI
https://developer.chrome.com/docs/ai/built-in
– Initiative by Google Chrome
– Exploratory APIs for local experiments and use case determination
– Downloads AI models into Google Chrome
– Models are shared across origins
– Uses native APIs directly (full performance)

Slide 27

Slide 27 text

Built-in AI APIs
Incubated by the WebML CG
https://webmachinelearning.github.io/incubations/

Slide 28

Slide 28 text

Built-in AI APIs
Diagram labels: Operating System, Website HTML/JS, Browser, Internet, Apple Intelligence, Gemini Nano

Slide 29

Slide 29 text

Built-in AI APIs
about://on-device-internals
https://www.google.com/chrome/canary/
about://flags
– Enables optimization guide on device → EnabledBypassPerfRequirement
– (API) for Gemini Nano → Enabled

Slide 30

Slide 30 text

Built-in AI APIs: TypeScript Definitions

Slide 31

Slide 31 text

Prompt API
Adjust the implementations of runPrompt()/fillForm():

    const session = await window.ai.languageModel.create({ systemPrompt });
    const reply = await session.prompt(value);
    // runPrompt():
    this.reply.set(reply);
    // fillForm():
    this.formGroup.setValue(JSON.parse(reply));
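Calling `JSON.parse(reply)` directly throws whenever the model wraps the JSON in extra text or returns something malformed. A more defensive variant could look like the sketch below; the helper name `parseFormReply` and the field names in the usage note are hypothetical and not part of the talk's code.

```typescript
// Hypothetical helper: extracts the outermost {...} span from a model reply,
// parses it, and keeps only the expected keys that hold string values.
// Returns null if no parseable JSON object is found.
function parseFormReply(
  reply: string,
  keys: readonly string[],
): Record<string, string> | null {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) {
    return null; // no JSON object in the reply
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(reply.slice(start, end + 1));
  } catch {
    return null; // malformed JSON
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return null;
  }

  const source = parsed as Record<string, unknown>;
  const result: Record<string, string> = {};
  for (const key of keys) {
    const value = source[key];
    if (typeof value === "string") {
      result[key] = value; // ignore unexpected keys and non-string values
    }
  }
  return result;
}
```

In `fillForm()`, the result could then be applied with `this.formGroup.patchValue(values)` when it is not `null`. Unlike `setValue()`, `patchValue()` accepts a partial object, so fields the model left out simply stay unchanged.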

Slide 32

Slide 32 text

Chatbots
DEMO

Slide 33

Slide 33 text

Categorization
DEMO

Slide 34

Slide 34 text

RAG
DEMO

Slide 35

Slide 35 text

On-device AI Models: Pros & Cons
+ Data does not leave the browser (privacy)
+ High availability (offline support)
+ Low latency
+ Stability (no external API changes)
+ Low cost
– Lower response quality
– Less capable
– High system (RAM, GPU) and bandwidth requirements
– Large model size, models cannot always be shared
– Model initialization and inference are relatively slow
– APIs are experimental

Slide 36

Slide 36 text

Multimodal Realtime Models: Cloud-based
DEMO

Slide 37

Slide 37 text

Thank you for your kind attention!
Christian Liebel
@christianliebel