Slide 1

Slide 1 text

The AI Revolution in the Browser? Making Single-Page Apps Smarter Christian Liebel @christianliebel Consultant

Slide 2

Slide 2 text

Hello, it’s me.
Christian Liebel
X: @christianliebel
Email: christian.liebel@thinktecture.com
Angular & PWA
Slides: thinktecture.com/christian-liebel

Slide 3

Slide 3 text

Generative AI everywhere

Slide 4

Slide 4 text

DEMO

Slide 5

Slide 5 text

Generative AI
Overview
– Speech: OpenAI Whisper, tortoise-tts, …
– Images: Midjourney, DALL·E, Stable Diffusion, …
– Audio/Music: Musico, Soundraw, …
– Text: OpenAI GPT, LLaMa, Vicuna, …

Slide 7

Slide 7 text

Generative AI Cloud Providers
Examples

Slide 8

Slide 8 text

Generative AI Cloud Providers
Drawbacks
– Require an active internet connection
– Affected by network latency and server availability
– Data is transferred to the cloud service
– Require a subscription
→ Can we run models locally?

Slide 9

Slide 9 text

Large Language Models
Size Comparison
Model:Parameters   Size
phi3:3b            2.2 GB
mistral:7b         4.1 GB
llama3:8b          4.7 GB
gemma2:9b          5.4 GB
gemma2:27b         16 GB
llama3:70b         40 GB
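The sizes above are consistent with roughly 4 to 5 bits stored per parameter. A back-of-the-envelope sketch, assuming size ≈ parameters × bits per weight / 8 and ignoring metadata and layers kept at higher precision (the function name and the bit widths are illustrative assumptions, not from the slides):

```typescript
// Rough estimate only: bytes = parameters * bitsPerWeight / 8.
// Real model files also contain metadata and may mix precisions.
function estimateModelSizeGB(parameters: number, bitsPerWeight: number): number {
  const bytes = (parameters * bitsPerWeight) / 8;
  return bytes / 1e9; // decimal gigabytes
}

// A 7-billion-parameter model at 16-bit vs. 4-bit weights (assumed bit widths).
const fp16SizeGB = estimateModelSizeGB(7e9, 16);
const q4SizeGB = estimateModelSizeGB(7e9, 4);
console.log(`7B model: ${fp16SizeGB} GB at 16 bit, ${q4SizeGB} GB at 4 bit`);
```

The estimate shows why quantization matters for in-browser use: the same model shrinks by a factor of four when going from 16-bit to 4-bit weights.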

Slide 10

Slide 10 text

WebLLM
DEMO
https://webllm.mlc.ai/

Slide 11

Slide 11 text

WebLLM
On NPM
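A minimal sketch of what calling such an engine from application code might look like. It assumes the engine exposes an OpenAI-style `chat.completions.create` method, as the `@mlc-ai/web-llm` package does; the `ChatEngine` interface and the `ask` helper are hypothetical names introduced here so the example runs without the package or a GPU:

```typescript
// Structural subset of an OpenAI-style chat engine (hypothetical interface).
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatEngine {
  chat: {
    completions: {
      create(request: { messages: ChatMessage[] }): Promise<{
        choices: { message: { content: string | null } }[];
      }>;
    };
  };
}

// Send a single question and return the text of the first choice.
async function ask(engine: ChatEngine, question: string): Promise<string> {
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: question },
    ],
  });
  return reply.choices[0]?.message.content ?? "";
}

// In a browser app, the engine would come from the npm package, e.g.:
//   import { CreateMLCEngine } from "@mlc-ai/web-llm";
//   const engine = await CreateMLCEngine(modelId); // modelId: one of the package's prebuilt model IDs
```

Keeping the application code behind a small interface like this also makes it possible to swap the local engine for a cloud endpoint later.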

Slide 12

Slide 12 text

Cache API
Storing model files locally
Diagram labels: Website (HTML/JS), Cache with model files, Internet, Hugging Face

Slide 13

Slide 13 text

Cache API
Parameter cache
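The cache-first pattern behind storing model files locally can be sketched as follows. This is an illustration under assumptions, not WebLLM's actual implementation: `ModelCache`, `loadModelFile`, and the cache name in the usage comment are hypothetical, and the cache and download function are passed in so the logic can run outside a browser:

```typescript
// Structural subset of the browser's Cache interface used by this sketch.
interface ModelCache<R> {
  match(url: string): Promise<R | undefined>;
  put(url: string, response: R): Promise<void>;
}

// Return the cached response for `url`; on a miss, download it and store a copy.
async function loadModelFile<R extends { clone(): R }>(
  url: string,
  cache: ModelCache<R>,
  download: (url: string) => Promise<R>,
): Promise<R> {
  const hit = await cache.match(url);
  if (hit !== undefined) {
    return hit; // served locally, no network request
  }
  const response = await download(url);
  // A response body can be read only once, so store a clone
  // and hand the original to the caller.
  await cache.put(url, response.clone());
  return response;
}

// Browser usage (not executed here):
//   const cache = await caches.open("model-files"); // hypothetical cache name
//   const response = await loadModelFile(url, cache, (u) => fetch(u));
```

After the first visit, every model file is answered from the cache, which is what makes offline use possible.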

Slide 14

Slide 14 text

WebAssembly (Wasm)
Bytecode for the web
Compile target for arbitrary languages
Can be faster than JavaScript
WebLLM needs the model and a Wasm library to accelerate model computations

Slide 15

Slide 15 text

WebGPU
Grants low-level access to the Graphics Processing Unit (GPU)
Near-native performance for machine learning applications
Supported by Chromium-based browsers on Windows and macOS from version 113
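Because WebGPU is not available in every browser, an app would typically check for it before loading a model. A minimal sketch, assuming the standard `navigator.gpu.requestAdapter()` entry point; the `GpuNavigator` interface and `supportsWebGPU` helper are hypothetical names, and the navigator object is passed in so the check can be exercised outside a browser:

```typescript
// Structural subset of `navigator` needed for the check (hypothetical interface).
interface GpuNavigator {
  gpu?: { requestAdapter(): Promise<unknown> };
}

// True only if the WebGPU entry point exists and an adapter can be obtained.
async function supportsWebGPU(nav: GpuNavigator): Promise<boolean> {
  if (!nav.gpu) {
    return false; // browser does not expose WebGPU at all
  }
  // requestAdapter() resolves to null when no suitable GPU is available.
  const adapter = await nav.gpu.requestAdapter();
  return adapter != null;
}

// Browser usage (not executed here):
//   const ok = await supportsWebGPU(navigator as GpuNavigator);
```

An app could use the result to fall back to a cloud model or to hide the local AI feature entirely.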

Slide 16

Slide 16 text

Outlook: WebNN
Grants web applications access to the Neural Processing Unit (NPU) of the system via platform-specific machine learning services (e.g., ML Compute on macOS/iOS, DirectML on Windows, …)
Even better performance compared to WebGPU
Currently being specified by the WebML Working Group at W3C
Implementation in progress for Chromium-based browsers
https://webmachinelearning.github.io/webnn-intro/

Slide 17

Slide 17 text

WebNN: near-native inference performance
Source: Intel. Browser: Chrome Canary 118.0.5943.0, DUT: Dell/Linux/i7-1260P, single p-core, Workloads: MediaPipe solution models (FP32, batch=1)

Slide 18

Slide 18 text

WebLLM
Caveats
– Due to the Same-Origin Policy, downloaded models can’t be shared across origins (e.g., https://example.org cannot access models cached by https://test.example.org).
– Each origin has to download its own copy of an LLM, which leads to very high storage consumption.

Slide 19

Slide 19 text

Prompt API
Diagram labels: Website (HTML/JS), Browser, Operating System, Internet, Apple Intelligence, Gemini Nano

Slide 20

Slide 20 text

Prompt API
Part of Chrome’s Built-In AI initiative
– Exploratory API for local experiments and use case determination
– Downloads Gemini Nano into Google Chrome
– Model can be shared across origins
– Uses native APIs directly
– Fine-tuning API might follow in the future
https://developer.chrome.com/docs/ai/built-in

Slide 21

Slide 21 text

Prompt API
First Glance

Slide 22

Slide 22 text

Prompt API
Demo: Smart Form Filler
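One way a smart form filler could work is to ask the model for a JSON object and copy matching values into the form. The sketch below is not the demo's code. Since the Prompt API is exploratory and its entry point has been renamed across Chrome versions, it only assumes a session object with a `prompt(text)` method that resolves to a string; `PromptSession` and `extractFormValues` are hypothetical names:

```typescript
// Only assumption about the Prompt API: a session with an async prompt() method.
interface PromptSession {
  prompt(input: string): Promise<string>;
}

// Ask the model to extract the given fields from free text.
// Returns only string values for fields that were requested.
async function extractFormValues(
  session: PromptSession,
  fields: string[],
  userText: string,
): Promise<Record<string, string>> {
  const instruction =
    "Extract the following fields from the text and answer with a JSON object only. " +
    `Fields: ${fields.join(", ")}\n\nText: ${userText}`;
  const raw = await session.prompt(instruction);

  // Models often wrap JSON in prose or code fences; take the outermost {...}.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    return {};
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return {}; // model output was not valid JSON
  }

  const result: Record<string, string> = {};
  if (typeof parsed === "object" && parsed !== null) {
    for (const field of fields) {
      const value = (parsed as Record<string, unknown>)[field];
      if (typeof value === "string") {
        result[field] = value;
      }
    }
  }
  return result;
}
```

Filtering the parsed object against the requested field names keeps unexpected keys in the model output from reaching the form.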

Slide 23

Slide 23 text

Built-in AI
Additional APIs
– Prompt API
  – Assistant
– Translator API
  – Translator
  – Language Detector
– Writing Assistance APIs
  – Summarizer
  – Writer
  – Rewriter

Slide 24

Slide 24 text

Performance
Comparison (tokens/sec)
WebLLM (Mistral-7b, M1)   22.98
WebLLM (Mistral-7b, M3)   33.96
OpenAI (GPT-4)            19.08
Azure OpenAI (GPT-4)      38.75
Groq (Mixtral-8x7b)       564.63
Sources: WebLLM/Groq: own tests (23.03.2024); OpenAI/Azure OpenAI: https://mcplusa.com/comparing-performance-of-openai-gpt-4-and-microsoft-azure-gpt-4/ (31.08.2023)
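To make the throughput figures more tangible, they can be converted into waiting time for an answer. A small sketch, assuming the values map to the chart labels in the order listed and using a hypothetical answer length of 500 tokens (model loading and prompt processing time are not included):

```typescript
// Time in seconds to generate `tokens` output tokens at a given throughput.
function secondsToGenerate(tokens: number, tokensPerSecond: number): number {
  if (tokensPerSecond <= 0) {
    throw new RangeError("tokensPerSecond must be positive");
  }
  return tokens / tokensPerSecond;
}

// Throughput figures from the comparison (tokens/sec).
const throughput: [string, number][] = [
  ["WebLLM (Mistral-7b, M1)", 22.98],
  ["WebLLM (Mistral-7b, M3)", 33.96],
  ["OpenAI (GPT-4)", 19.08],
  ["Azure OpenAI (GPT-4)", 38.75],
  ["Groq (Mixtral-8x7b)", 564.63],
];

const answerLength = 500; // hypothetical answer length in tokens
for (const [name, tokensPerSecond] of throughput) {
  const seconds = secondsToGenerate(answerLength, tokensPerSecond);
  console.log(`${name}: ${seconds.toFixed(1)} s`);
}
```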

Slide 25

Slide 25 text

Stable Diffusion
Text-to-image model
Generates 512x512px images from a prompt
Runs on “commodity” hardware (with 8 GB VRAM)
Open-source
Prompt: A guinea pig eating a watermelon

Slide 26

Slide 26 text

Web Stable Diffusion
Specialized version of the Stable Diffusion model for the web
2 GB in size
Subject to usage conditions: https://huggingface.co/runwayml/stable-diffusion-v1-5#uses
No npm package this time
Currently incompatible with Angular & esbuild due to Wasm imports

Slide 27

Slide 27 text

Web Stable Diffusion
DEMO
https://websd.mlc.ai/

Slide 28

Slide 28 text

Local AI Models
Advantages
– Data does not leave the browser
– High availability (offline support)
– Low latency
– Stability (unaffected by external API changes)
– Low cost

Slide 29

Slide 29 text

Local AI Models
Disadvantages
– Lower quality than closed-source models
– High system requirements (RAM, GPU)
– Large model size, high initial bandwidth requirements, models cannot be shared across origins
– Model initialization and inference are relatively slow
– WebGPU and WebNN are currently only supported by Chromium-based browsers on macOS and Windows (WebNN only behind a flag)
– Prompt API is only an exploratory API

Slide 30

Slide 30 text

Summary
– Cloud-based models (especially OpenAI/GPT) remain the most potent models and are easier to integrate (for now)
– Due to their size and high system requirements, local generative AI models are currently of interest mainly for very specific scenarios (e.g., high privacy demands, offline availability)
– Small, specialized models are an interesting alternative (if available)
– Open-source GenAI models are becoming more compact and efficient
– Vendors are beginning to ship AI models with their devices
– Devices are becoming more powerful for AI tasks

Slide 31

Slide 31 text

The AI Revolution in the Browser?

Slide 32

Slide 32 text

Thank you for your kind attention!
Christian Liebel
@christianliebel
christian.liebel@thinktecture.com