Slide 1

Slide 1 text

AI in the Browser: Smarter Angular Apps with WebGPU and WebNN
Christian Liebel (@christianliebel), Consultant

Slide 2

Slide 2 text

Generative AI everywhere
Source: https://www.apple.com/chde/apple-intelligence/

Slide 3

Slide 3 text

Generative AI: Overview
– Text: OpenAI GPT, Mistral, …
– Speech: OpenAI Whisper, tortoise-tts, …
– Images: DALL·E, Stable Diffusion, …
– Audio/Music: Musico, Soundraw, …

Slide 5

Slide 5 text

Generative AI Cloud Providers: Examples

Slide 6

Slide 6 text

DEMO

Slide 7

Slide 7 text

Generative AI Cloud Providers: Drawbacks
– Require a (stable) internet connection
– Subject to network latency and server availability
– Data is transferred to the cloud service
– Require a subscription

Slide 8

Slide 8 text

Can we run GenAI models locally?

Slide 9

Slide 9 text

Large Language Models
– Large: Trained on lots of data
– Language: Process and generate text
– Models: Programs/neural networks
– Examples:
  – GPT (ChatGPT, Bing Chat, …)
  – Gemini, Gemma (Google)
  – LLaMa (Meta AI)

Slide 10

Slide 10 text

Large Language Models: Size Comparison
Model:Parameters   Size
phi3:3b            2.2 GB
mistral:7b         4.1 GB
llama3:8b          4.7 GB
gemma2:9b          5.4 GB
gemma2:27b         16 GB
llama3:70b         40 GB
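To get a feeling for what these sizes mean for a first visit, the sketch below converts them into rough download times. The 100 Mbit/s connection speed is a made-up example value, and the conversion assumes 1 GB = 8000 Mbit and ignores protocol overhead.

```typescript
// Sizes in GB, taken from the size comparison above.
const modelSizesGb: Record<string, number> = {
  "phi3:3b": 2.2,
  "mistral:7b": 4.1,
  "llama3:8b": 4.7,
  "gemma2:9b": 5.4,
  "gemma2:27b": 16,
  "llama3:70b": 40,
};

// Estimated download time in seconds for a file of `sizeGb` gigabytes
// over a link of `mbitPerSecond` (assumption: 1 GB = 8000 Mbit, no overhead).
function downloadSeconds(sizeGb: number, mbitPerSecond: number): number {
  return (sizeGb * 8000) / mbitPerSecond;
}

// Hypothetical 100 Mbit/s connection.
for (const [model, sizeGb] of Object.entries(modelSizesGb)) {
  const minutes = downloadSeconds(sizeGb, 100) / 60;
  console.log(`${model}: about ${minutes.toFixed(1)} min`);
}
```

Real download times also depend on the server and on caching, so this is only a lower bound for the assumed bandwidth.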

Slide 11

Slide 11 text

WebLLM
https://webllm.mlc.ai/
DEMO

Slide 12

Slide 12 text

WebLLM: On NPM
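A minimal sketch of how a chat request against WebLLM's OpenAI-style interface could look. The `ChatEngineLike` interface and the `ask` helper are hypothetical names introduced for illustration; they describe only the small part of the engine used here, so the function can also be exercised with a stub. Passing a real engine may require a cast, depending on the package version.

```typescript
// Minimal subset of an OpenAI-style chat interface (hypothetical type for this sketch).
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatEngineLike {
  chat: {
    completions: {
      create(request: { messages: ChatMessage[] }): Promise<{
        choices: { message: { content: string | null } }[];
      }>;
    };
  };
}

// Sends one user prompt and returns the text of the first choice ("" if none).
async function ask(
  engine: ChatEngineLike,
  prompt: string,
  systemPrompt = "You are a helpful assistant.",
): Promise<string> {
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: prompt },
    ],
  });
  return reply.choices[0]?.message.content ?? "";
}

// In a browser app, the engine would come from the npm package, roughly:
//   import { CreateMLCEngine } from "@mlc-ai/web-llm";
//   const engine = await CreateMLCEngine(modelId, {
//     initProgressCallback: (report) => console.log(report.text),
//   });
//   console.log(await ask(engine, "Hello!"));
// `modelId` is a placeholder for one of the model IDs the package lists.
```

Keeping the engine behind a small interface also makes it easier to swap the local model for a cloud provider later.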

Slide 13

Slide 13 text

Large Language Models: Impact on Software Architecture
– Prompts serve as the universal interface for users and developers
– Paradigm shift: Natural language becomes a first-class citizen
– Caveats: Non-determinism, hallucinations, prompt injection

Slide 14

Slide 14 text

WebLLM: Demo

Slide 15

Slide 15 text

Cache API: Storing model files locally
[Diagram: Internet, Website (HTML/JS), Cache with model files, Hugging Face]
Note: Due to the Same-Origin Policy, models cannot be shared across origins.
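The caching idea can be sketched as a cache-first lookup: serve the model file from the cache if it is there, otherwise download it once and store it. This is not taken from WebLLM's source; `loadModelFile`, `CacheLike` and `ResponseLike` are hypothetical names, and the two interfaces only mirror the parts of the browser's `Cache` and `Response` that the sketch needs.

```typescript
// Structural subsets of the browser's Response and Cache types, so the
// sketch also type-checks outside a browser.
interface ResponseLike {
  ok: boolean;
  clone(): ResponseLike;
}

interface CacheLike {
  match(url: string): Promise<ResponseLike | undefined>;
  put(url: string, response: ResponseLike): Promise<void>;
}

// Returns the model file from the cache, downloading and storing it on a miss.
async function loadModelFile(
  url: string,
  cache: CacheLike,
  download: (url: string) => Promise<ResponseLike>,
): Promise<ResponseLike> {
  const cached = await cache.match(url);
  if (cached) {
    return cached; // cache hit: no network traffic
  }
  const response = await download(url);
  if (!response.ok) {
    throw new Error(`Download failed for ${url}`);
  }
  // A response body can only be read once, so a clone goes into the cache.
  await cache.put(url, response.clone());
  return response;
}

// Browser usage (the cache name "model-files" is an assumption):
//   const cache = await caches.open("model-files");
//   const file = await loadModelFile(modelUrl, cache, (u) => fetch(u));
```

Because the cache belongs to the origin that opened it, a second website using the same model would run through the download branch again.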

Slide 16

Slide 16 text

WebAssembly (Wasm)
– Bytecode for the web
– Compile target for arbitrary languages
– Can be faster than JavaScript
– WebLLM uses a model-specific Wasm library to accelerate model computations

Slide 17

Slide 17 text

WebGPU
– Grants low-level access to the Graphics Processing Unit (GPU)
– Near-native performance for machine learning applications
– Supported by Chromium-based browsers on Windows and macOS from version 113

Slide 18

Slide 18 text

WebNN
– Grants web apps access to the device’s CPU, GPU and Neural Processing Unit (NPU)
– Being specified by the WebML Working Group at W3C
– Implementation in progress in Chromium (behind a flag)
– Even better performance compared to WebGPU
Source: https://webmachinelearning.github.io/webnn-intro/
DEMO
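Since WebNN and WebGPU are not available everywhere, an app would typically feature-detect before choosing a backend. The sketch below checks for the `navigator.ml` (WebNN) and `navigator.gpu` (WebGPU) entry points and falls back to Wasm. `pickBackend` and `NavigatorLike` are hypothetical names, and the preference order is an assumption based on the performance claims above.

```typescript
type Backend = "webnn" | "webgpu" | "wasm";

// Only the properties checked here; both are absent in browsers without support.
interface NavigatorLike {
  ml?: unknown; // WebNN entry point (navigator.ml)
  gpu?: unknown; // WebGPU entry point (navigator.gpu)
}

// Picks the most capable backend that appears to be exposed.
// Presence of the property does not guarantee a usable device.
function pickBackend(nav: NavigatorLike): Backend {
  if (nav.ml !== undefined) return "webnn";
  if (nav.gpu !== undefined) return "webgpu";
  return "wasm";
}

// Browser usage: pickBackend(navigator as NavigatorLike)
console.log(pickBackend({})); // "wasm"
console.log(pickBackend({ gpu: {} })); // "webgpu"
console.log(pickBackend({ gpu: {}, ml: {} })); // "webnn"
```

A production check would go one step further and actually request an adapter or context, because an exposed API can still fail on unsupported hardware.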

Slide 19

Slide 19 text

WebNN: Near-native inference performance
Source: Intel. Browser: Chrome Canary 118.0.5943.0, DUT: Dell/Linux/i7-1260P, single p-core, Workloads: MediaPipe solution models (FP32, batch=1)

Slide 20

Slide 20 text

Prompt API
[Diagram: Operating System, Website (HTML/JS), Browser, Internet, Apple Intelligence, Gemini Nano]

Slide 21

Slide 21 text

Prompt API
Part of Chrome’s Built-In AI initiative
– Exploratory API for local experiments and use case determination
– Downloads Gemini Nano into Google Chrome
– Model is shared across origins
– Uses native APIs directly
– Related APIs: Translation API, Writing Assistance APIs
https://developer.chrome.com/docs/ai/built-in

Slide 22

Slide 22 text

Prompt API
Demo: Smart Form Filler
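The demo code is not part of this transcript, so the following is only a sketch of how a form filler on top of a prompt-style session might work: build a prompt that asks for JSON, then parse the answer defensively. `PromptSession`, `buildPrompt`, `parseAnswer` and `fillForm` are hypothetical names; the session is assumed to offer a `prompt(text)` method that resolves to a string. How the session is created is left out because the entry point of this exploratory API has changed between Chrome versions.

```typescript
// Hypothetical minimal session shape: one method that takes a prompt
// and resolves to the model's text answer.
interface PromptSession {
  prompt(input: string): Promise<string>;
}

// Builds a prompt asking the model to extract the given fields as JSON.
function buildPrompt(fields: string[], userText: string): string {
  return [
    "Extract the following fields from the text and answer with a JSON object only.",
    `Fields: ${fields.join(", ")}`,
    `Text: ${userText}`,
  ].join("\n");
}

// Parses the answer defensively: model output is non-deterministic, so
// invalid JSON, unknown keys and non-string values are dropped.
function parseAnswer(fields: string[], answer: string): Record<string, string> {
  const result: Record<string, string> = {};
  const start = answer.indexOf("{");
  const end = answer.lastIndexOf("}");
  if (start === -1 || end <= start) return result;
  let parsed: unknown;
  try {
    parsed = JSON.parse(answer.slice(start, end + 1));
  } catch {
    return result;
  }
  if (typeof parsed !== "object" || parsed === null) return result;
  for (const field of fields) {
    const value = (parsed as Record<string, unknown>)[field];
    if (typeof value === "string") result[field] = value;
  }
  return result;
}

// Asks the model and returns only the requested, well-formed fields.
async function fillForm(
  session: PromptSession,
  fields: string[],
  userText: string,
): Promise<Record<string, string>> {
  const answer = await session.prompt(buildPrompt(fields, userText));
  return parseAnswer(fields, answer);
}
```

In an Angular app, the returned record could be passed to `patchValue` on a reactive form; restricting the result to known field names limits what a manipulated answer can write into the form.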

Slide 23

Slide 23 text

Performance: Comparison (tokens/sec)
WebLLM (Mistral-7b, M1)     22.98
WebLLM (Mistral-7b, M3)     33.96
OpenAI (GPT-4)              19.08
Azure OpenAI (GPT-4)        38.75
Groq (Mixtral-8x7b)        564.63
Sources: WebLLM/Groq: Own tests (23.03.2024); OpenAI/Azure OpenAI: https://mcplusa.com/comparing-performance-of-openai-gpt-4-and-microsoft-azure-gpt-4/ (31.08.2023)
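To make the throughput figures more tangible, the sketch below converts them into the time needed for one answer. The answer length of 500 tokens is a made-up example value, and the calculation ignores model loading and time to first token.

```typescript
// Tokens per second, taken from the comparison above.
const tokensPerSecond: Record<string, number> = {
  "WebLLM (Mistral-7b, M1)": 22.98,
  "WebLLM (Mistral-7b, M3)": 33.96,
  "OpenAI (GPT-4)": 19.08,
  "Azure OpenAI (GPT-4)": 38.75,
  "Groq (Mixtral-8x7b)": 564.63,
};

// Seconds needed to generate `tokens` tokens at a steady `rate` (tokens/sec).
function secondsFor(tokens: number, rate: number): number {
  return tokens / rate;
}

// Hypothetical answer length of 500 tokens.
for (const [name, rate] of Object.entries(tokensPerSecond)) {
  console.log(`${name}: ${secondsFor(500, rate).toFixed(1)} s`);
}
```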

Slide 24

Slide 24 text

Stable Diffusion
– Open-source text-to-image model
– Generates 512x512px images from a prompt
– WebSD: special version of Stable Diffusion for the web (2 GB in size)
– No npm package this time
Prompt: A guinea pig eating a watermelon

Slide 25

Slide 25 text

Web Stable Diffusion
https://websd.mlc.ai/
DEMO

Slide 26

Slide 26 text

Local AI Models: Pros & Cons
+ Data does not leave the browser (privacy)
+ High availability (offline support)
+ Low latency
+ Stability (no external API changes)
+ Low cost
– Lower quality
– High system (RAM, GPU) and bandwidth requirements
– Large model size, models cannot always be shared
– Model initialization and inference are relatively slow
– APIs are experimental

Slide 27

Slide 27 text

Alternatives: Transformers.js
– Pre-trained, specialized, significantly smaller models beyond GenAI
– JavaScript library to run Hugging Face transformers in the browser
– Supports most of the models
https://xenova.github.io/transformers.js/
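As an illustration of such a small, specialized model, the sketch below wraps a text classifier and only accepts confident results. `Classifier`, `Classification` and `topLabel` are hypothetical names, and the threshold of 0.8 is an arbitrary example value. The classifier is passed in as a function so the logic can be tried with a stub; a real pipeline may need a cast to fit this type.

```typescript
// Shape of a text-classification result: a label plus a confidence score.
interface Classification {
  label: string;
  score: number;
}

type Classifier = (text: string) => Promise<Classification[]>;

// Returns the label of the highest-scoring result, or null if that score
// is below `minScore` or the classifier returned nothing.
async function topLabel(
  classify: Classifier,
  text: string,
  minScore = 0.8,
): Promise<string | null> {
  const results = await classify(text);
  const best = results.reduce<Classification | null>(
    (a, b) => (a === null || b.score > a.score ? b : a),
    null,
  );
  return best !== null && best.score >= minScore ? best.label : null;
}

// With the library, the classifier would come from a pipeline, roughly:
//   import { pipeline } from "@xenova/transformers";
//   const classify = await pipeline("sentiment-analysis");
//   console.log(await topLabel(classify, "I love Angular!"));
```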

Slide 28

Slide 28 text

Summary
– Cloud-based models remain the most powerful models
– Due to their size and high system requirements, local generative AI models are currently interesting mainly for very specific scenarios (e.g., high privacy demands, offline availability)
– Small, specialized models are an interesting alternative (if available)
– Large language models are becoming more compact and efficient
– Vendors are starting to ship AI models with their devices
– Devices are becoming more powerful for running AI tasks
– Experiment with the AI APIs and make your Angular app smarter!

Slide 30

Slide 30 text

Thank you for your kind attention!
Christian Liebel
@christianliebel