The AI Revolution in the Browser? Making Single-Page Apps Smarter

Christian Liebel @christianliebel Consultant

Hello, it’s me.
Christian Liebel
X: @christianliebel
Email: christian.liebel@thinktecture.com
Angular & PWA
Slides: thinktecture.com/christian-liebel

Generative AI everywhere

DEMO

Generative AI: Overview
– Text: OpenAI GPT, LLaMa, Vicuna, …
– Images: Midjourney, DALL·E, Stable Diffusion, …
– Audio/Music: Musico, Soundraw, …
– Speech: OpenAI Whisper, tortoise-tts, …

Generative AI Cloud Providers: Examples

Generative AI Cloud Providers: Drawbacks
– Require an active internet connection
– Affected by network latency and server availability
– Data is transferred to the cloud service
– Require a subscription
→ Can we run models locally?

Large Language Models: Size Comparison
Model:Parameters    Size
phi3:3b             2.2 GB
mistral:7b          4.1 GB
llama3:8b           4.7 GB
gemma2:9b           5.4 GB
gemma2:27b          16 GB
llama3:70b          40 GB
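The sizes above can be roughly related to the parameter counts with a back-of-the-envelope calculation. The sketch below assumes that the weights dominate the file size and that each parameter is stored with a fixed number of bits; the slide does not state the quantization used, so the 4-bit value is an assumption, and `estimateModelSizeGB` is a hypothetical helper.

```typescript
// Rough estimate of a model's download size from its parameter count.
// Assumption: the weights dominate the file size and every parameter is
// stored with the same number of bits (e.g. 4-bit quantization).
function estimateModelSizeGB(parameters: number, bitsPerParameter: number): number {
  const bytes = (parameters * bitsPerParameter) / 8;
  return bytes / 1e9; // decimal gigabytes
}

// 7 billion parameters at 4 bits each: 3.5 GB
const sevenBillion = estimateModelSizeGB(7e9, 4);
// 70 billion parameters at 4 bits each: 35 GB
const seventyBillion = estimateModelSizeGB(70e9, 4);

console.log(sevenBillion, seventyBillion);
```

The estimates land in the same range as the listed 4.1 GB and 40 GB. Real model files tend to be somewhat larger because not every tensor is quantized equally and the files carry additional data.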

WebLLM: DEMO
https://webllm.mlc.ai/

WebLLM: On NPM
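The code shown on this slide is not part of the transcript. As a minimal sketch of what using the npm package can look like: WebLLM exposes an OpenAI-style chat completion interface, so the function below is written against a small structural type (`ChatEngine`, a name chosen here for illustration) instead of importing the package. How the engine is created is only indicated in a comment, and the model ID there is left as a placeholder.

```typescript
// Minimal structural view of an OpenAI-style chat engine. This interface
// is an illustration, not WebLLM's own type definition.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatEngine {
  chat: {
    completions: {
      create(request: { messages: ChatMessage[] }): Promise<{
        choices: { message: { content: string | null } }[];
      }>;
    };
  };
}

// In the browser (not executed here), an engine would typically come from
// the npm package, for example:
//   import { CreateMLCEngine } from "@mlc-ai/web-llm";
//   const engine = await CreateMLCEngine(modelId, { initProgressCallback });
// where modelId is one of the model IDs the package ships a config for.

// Sends a single user question and returns the model's text answer.
async function askLocalModel(engine: ChatEngine, question: string): Promise<string> {
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: question },
    ],
  });
  return reply.choices[0]?.message.content ?? "";
}
```

Writing against a narrow interface keeps the application code testable without WebGPU and makes it easier to swap the local engine for a cloud client later.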

Cache API: Storing model files locally
[Diagram: Website (HTML/JS), Cache with model files, Internet, Hugging Face]

Cache API: Parameter cache
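The caching idea can be sketched as a cache-first download: look up the model file in the cache, and only fetch it from the network when it is missing. This is not WebLLM's actual implementation. `CacheLike`, `Fetcher`, and `loadModelFile` are hypothetical names; the two methods on `CacheLike` mirror `match` and `put` of the browser's Cache API.

```typescript
// The two Cache API methods used below (same names as on the browser's Cache).
interface CacheLike {
  match(request: string): Promise<Response | undefined>;
  put(request: string, response: Response): Promise<void>;
}

type Fetcher = (url: string) => Promise<Response>;

// Returns the file's bytes, downloading it only if it is not cached yet.
async function loadModelFile(
  url: string,
  cache: CacheLike,
  fetcher: Fetcher,
): Promise<ArrayBuffer> {
  const cached = await cache.match(url);
  if (cached) {
    return cached.arrayBuffer(); // served locally, no network access
  }
  const response = await fetcher(url);
  if (!response.ok) {
    throw new Error(`Download failed with status ${response.status}`);
  }
  // A response body can be read only once, so store a clone in the cache.
  await cache.put(url, response.clone());
  return response.arrayBuffer();
}
```

In a browser, the cache argument could be the result of `await caches.open("model-files")` (the cache name is an arbitrary example) and the fetcher could be `(url) => fetch(url)`.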

WebAssembly (Wasm)
– Bytecode for the web
– Compile target for arbitrary languages
– Can be faster than JavaScript
– WebLLM needs the model and a Wasm library to accelerate model computations
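To make "bytecode for the web" concrete, the following sketch instantiates a tiny hand-written Wasm module that exports a single `add` function. It is unrelated to the Wasm library WebLLM uses and only illustrates how JavaScript or TypeScript loads and calls compiled code. Real modules are produced by a compiler and are usually loaded asynchronously.

```typescript
// A complete Wasm module in binary form: one exported function
// add(a: i32, b: i32): i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // local.get 0, local.get 1, i32.add, end
]);

// Synchronous compilation is acceptable here because the module is tiny.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule);

// Exports are untyped at runtime, so the signature is asserted manually.
const add = wasmInstance.exports.add as unknown as (a: number, b: number) => number;

console.log(add(2, 3)); // prints 5
```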

WebGPU
– Grants low-level access to the Graphics Processing Unit (GPU)
– Near native performance for machine learning applications
– Supported by Chromium-based browsers on Windows and macOS from version 113
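Because WebGPU is not available everywhere, an app would usually check for it before downloading a multi-gigabyte model. A minimal detection sketch follows. `NavigatorLike` and `hasWebGpu` are hypothetical names; the `gpu` property and `requestAdapter()` are part of the WebGPU API, and they are described structurally here so the snippet does not depend on extra type packages.

```typescript
// Structural view of the parts of navigator that WebGPU detection needs.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

// Resolves to true only if the API exists and an adapter is available.
async function hasWebGpu(nav: NavigatorLike): Promise<boolean> {
  if (!nav.gpu) {
    return false; // browser does not expose WebGPU at all
  }
  // requestAdapter() resolves to null when no suitable GPU is found.
  const adapter = await nav.gpu.requestAdapter();
  return adapter != null;
}
```

In a browser, the global `navigator` object would be passed in. An app could fall back to a cloud model or a hint for the user when the check resolves to false.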

Outlook: WebNN
– Grants web applications access to the Neural Processing Unit (NPU) of the system via platform-specific machine learning services (e.g., ML Compute on macOS/iOS, DirectML on Windows, …)
– Even better performance compared to WebGPU
– Currently in specification by the WebML Working Group at W3C
– Implementation in progress for Chromium-based browsers
https://webmachinelearning.github.io/webnn-intro/

WebNN: near-native inference performance
Source: Intel. Browser: Chrome Canary 118.0.5943.0, DUT: Dell/Linux/i7-1260P, single p-core, Workloads: MediaPipe solution models (FP32, batch=1)

WebLLM: Caveats
– Due to the Same-Origin Policy, models can’t be shared across origins (e.g., https://example.org cannot access model files cached by https://test.example.org).
– As a result, every origin downloads its own copy of an LLM, which leads to very high storage consumption.
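Whether two pages can share cached model files comes down to comparing their origins (scheme, host, and port). A small sketch using the standard `URL` class shows this; `sameOrigin` is a hypothetical helper name.

```typescript
// Two URLs share storage (including the Cache API) only if their
// origins, i.e. scheme + host + port, are identical.
function sameOrigin(a: string, b: string): boolean {
  return new URL(a).origin === new URL(b).origin;
}

// A subdomain is a different origin, so the model is downloaded again.
console.log(sameOrigin("https://example.org/app", "https://test.example.org/app")); // false

// Different paths on the same origin share one cache.
console.log(sameOrigin("https://example.org/chat", "https://example.org/editor")); // true
```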

Prompt API
[Diagram: Website (HTML/JS), Browser, Operating System, Internet, Gemini Nano, Apple Intelligence]

Prompt API
– Part of Chrome’s Built-In AI initiative
– Exploratory API for local experiments and use case determination
– Downloads Gemini Nano into Google Chrome
– Model can be shared across origins
– Uses native APIs directly
– Fine-tuning API might follow in the future
https://developer.chrome.com/docs/ai/built-in

Prompt API: First Glance

Prompt API DEMO: Smart Form Filler
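The demo's source code is not part of the transcript. One way such a form filler could work is sketched below: ask the model to extract values for the form's fields as JSON, then parse the answer defensively. All names here (`PromptSession`, `buildFormPrompt`, `parseFormValues`, `fillForm`) are hypothetical. The only assumption about the Prompt API is that a session offers a `prompt(text)` method resolving to a string. Since the API is exploratory and its entry point has changed between Chrome versions, creating the session is deliberately left out.

```typescript
// Assumed minimal shape of a Prompt API session (exploratory API, may change).
interface PromptSession {
  prompt(input: string): Promise<string>;
}

// Builds an instruction that asks for a JSON object with the given keys.
function buildFormPrompt(fieldNames: string[], text: string): string {
  return [
    "Extract values for the following form fields from the text.",
    `Fields: ${fieldNames.join(", ")}`,
    "Answer with a single JSON object that uses the field names as keys.",
    "Omit fields that are not mentioned.",
    `Text: ${text}`,
  ].join("\n");
}

// Pulls the first {...} block out of the model output and keeps only
// known fields with non-empty string values.
function parseFormValues(fieldNames: string[], modelOutput: string): Record<string, string> {
  const start = modelOutput.indexOf("{");
  const end = modelOutput.lastIndexOf("}");
  if (start === -1 || end <= start) return {};
  let parsed: unknown;
  try {
    parsed = JSON.parse(modelOutput.slice(start, end + 1));
  } catch {
    return {}; // the model did not produce valid JSON
  }
  if (typeof parsed !== "object" || parsed === null) return {};
  const result: Record<string, string> = {};
  for (const name of fieldNames) {
    const value = (parsed as Record<string, unknown>)[name];
    if (typeof value === "string" && value !== "") result[name] = value;
  }
  return result;
}

async function fillForm(
  session: PromptSession,
  fieldNames: string[],
  text: string,
): Promise<Record<string, string>> {
  const output = await session.prompt(buildFormPrompt(fieldNames, text));
  return parseFormValues(fieldNames, output);
}
```

Filtering the parsed object against the known field names keeps unexpected keys from the model out of the form.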

Built-in AI: Additional APIs
– Prompt API
  – Assistant
– Translator API
  – Translator
  – Language Detector
– Writing Assistance APIs
  – Summarizer
  – Writer
  – Rewriter

Performance: Comparison
System                     Tokens/sec
WebLLM (Mistral-7b, M1)    22.98
WebLLM (Mistral-7b, M3)    33.96
OpenAI (GPT-4)             19.08
Azure OpenAI (GPT-4)       38.75
Groq (Mixtral-8x7b)        564.63
WebLLM/Groq: Own tests (23.03.2024), OpenAI/Azure OpenAI: https://mcplusa.com/comparing-performance-of-openai-gpt-4-and-microsoft-azure-gpt-4/ (31.08.2023)
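To get a feel for what these throughput numbers mean for a user, they can be converted into waiting time for a response of a given length. The 500-token response length below is an arbitrary example value, and `secondsForTokens` is a hypothetical helper.

```typescript
// Time needed to generate a response of the given length at a
// constant throughput (ignores model loading and time to first token).
function secondsForTokens(tokens: number, tokensPerSecond: number): number {
  return tokens / tokensPerSecond;
}

const responseLength = 500; // example value, not from the slides

// Throughput values taken from the comparison above.
const webLlmM1 = secondsForTokens(responseLength, 22.98); // roughly 22 s
const groq = secondsForTokens(responseLength, 564.63); // under 1 s

console.log(webLlmM1.toFixed(1), groq.toFixed(1));
```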

Stable Diffusion
– Text-to-image model
– Generates 512x512px images from a prompt
– Runs on “commodity” hardware (with 8 GB VRAM)
– Open-source
Prompt: A guinea pig eating a watermelon

Web Stable Diffusion
– Specialized version of the Stable Diffusion model for the web
– 2 GB in size
– Subject to usage conditions: https://huggingface.co/runwayml/stable-diffusion-v1-5#uses
– No npm package this time
– Currently incompatible with Angular & esbuild due to Wasm imports

Web Stable Diffusion: DEMO
https://websd.mlc.ai/

Local AI Models: Advantages
– Data does not leave the browser
– High availability (offline support)
– Low latency
– Stability (external API changes)
– Low cost

Local AI Models: Disadvantages
– Lower quality than closed-source models
– High system requirements (RAM, GPU)
– Large model size, high initial bandwidth requirements, models cannot be shared across origins
– Model initialization and inference are relatively slow
– WebGPU and WebNN are currently only supported by Chromium-based browsers on macOS and Windows (WebNN only behind a flag)
– Prompt API is only an exploratory API

Summary
– Cloud-based models (especially OpenAI/GPT) remain the most potent models and are easier to integrate (for now)
– Due to their size and high system requirements, local generative AI models are currently interesting mainly for very specific scenarios (e.g., high privacy demands, offline availability)
– Small, specialized models are an interesting alternative (if available)
– Open-source GenAI models are becoming more compact and efficient
– Vendors are beginning to ship AI models with their devices
– Devices are becoming more powerful for AI tasks

The AI Revolution in the Browser?

Thank you for your kind attention! Christian Liebel @christianliebel [email protected]
