Slide 1

Slide 1 text

Built-in AI: The AI revolution right in the browser
Christian Liebel, @christianliebel, Consultant

Slide 2

Slide 2 text

Generative AI everywhere
Source: https://www.apple.com/chde/apple-intelligence/

Slide 3

Slide 3 text

Generative AI
Overview
– Text: OpenAI GPT, Mistral, …
– Audio/Music: Musico, Soundraw, …
– Images: DALL·E, Firefly, …
– Video: Sora, Runway, …
– Speech: Whisper, tortoise-tts, …

Slide 4

Slide 4 text

ChatGPT
Cloud-backed app

Slide 5

Slide 5 text

Generative AI Cloud Providers
Examples

Slide 6

Slide 6 text

Generative AI Cloud Providers
Drawbacks
– Require a (stable) internet connection
– Subject to network latency and server availability
– Data is transferred to the cloud service
– Require a subscription

Slide 7

Slide 7 text

Can we run GenAI models locally?

Slide 8

Slide 8 text

WebLLM
DEMO
https://webllm.mlc.ai/

Slide 9

Slide 9 text

WebLLM
On NPM
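
A minimal sketch of how a page might talk to WebLLM once it is installed from npm. It assumes the package name `@mlc-ai/web-llm`, its `CreateMLCEngine` factory and an OpenAI-style `chat.completions.create` call; the model ID in the comment is only an example. The `ChatEngine` interface and the `ask` helper are hypothetical names introduced here so the logic can be tried without a GPU.

```typescript
// Structural subset of an OpenAI-style chat engine (hypothetical interface,
// modelled on what a WebLLM engine is assumed to expose).
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatEngine {
  chat: {
    completions: {
      create(request: { messages: ChatMessage[] }): Promise<{
        choices: { message: { content: string | null } }[];
      }>;
    };
  };
}

// Sends one user prompt and returns the reply text ("" if none came back).
async function ask(engine: ChatEngine, prompt: string): Promise<string> {
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: prompt }],
  });
  return reply.choices[0]?.message.content ?? "";
}

// In a browser with WebGPU, the engine would be created roughly like this
// (assumed usage, not executed here):
//   import { CreateMLCEngine } from "@mlc-ai/web-llm";
//   const engine = await CreateMLCEngine("Llama-3-8B-Instruct-q4f32_1-MLC");
//   console.log(await ask(engine, "Hello!"));
```

Keeping the helper independent of the concrete engine also makes it easy to swap in another backend that offers the same call shape.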

Slide 10

Slide 10 text

WebAssembly (Wasm)
– Bytecode for the web
– Compile target for arbitrary languages
– Can be faster than JavaScript
– WebLLM uses a model-specific Wasm library to accelerate model computations
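
To make "bytecode for the web" concrete, here is a small sketch that loads a hand-assembled Wasm module and calls it from TypeScript. The module only exports an `add` function for two 32-bit integers; it is an illustration and has nothing to do with the model-specific library WebLLM ships.

```typescript
// A hand-assembled Wasm module exporting add(a: i32, b: i32): i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // one function, using type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // local.get 0, local.get 1, i32.add
]);

// Synchronous compilation is acceptable for a module this small; larger
// modules are usually loaded with WebAssembly.instantiateStreaming().
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule);
const add = wasmInstance.exports.add as (a: number, b: number) => number;

console.log(add(2, 3)); // 5
```

In practice the bytes come from a compiler toolchain (for example for C++ or Rust) rather than being written by hand.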

Slide 11

Slide 11 text

WebGPU
– Grants low-level access to the Graphics Processing Unit (GPU)
– Near-native performance for machine learning applications
– Supported by Chromium-based browsers on Windows and macOS from version 113
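
Because WebGPU is not available everywhere, a page would typically check for it before downloading a multi-gigabyte model. The sketch below relies on `navigator.gpu.requestAdapter()`; `NavigatorLike`, `GpuLike` and `canUseWebGPU` are hypothetical names used so the check does not depend on WebGPU typings.

```typescript
// Minimal structural types so the sketch does not depend on WebGPU typings.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

// Resolves to true only if the API is exposed and an adapter is available.
async function canUseWebGPU(nav: NavigatorLike): Promise<boolean> {
  if (!nav.gpu) return false; // API not exposed by this browser
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null; // null means no suitable GPU was found
  } catch {
    return false; // treat any failure as "not usable"
  }
}

// In a browser (assumed usage):
//   canUseWebGPU(navigator).then((ok) => console.log("WebGPU usable:", ok));
```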

Slide 12

Slide 12 text

WebNN
DEMO
– Grants web apps access to the device’s CPU, GPU and Neural Processing Unit (NPU)
– Being specified by the WebML Working Group at W3C
– Implementation in progress in Chromium (behind a flag)
– Better performance for specific workloads
Source: https://webmachinelearning.github.io/webnn-intro/

Slide 13

Slide 13 text

WebLLM
Storing model files locally
[Diagram: Internet, Hugging Face, Website (HTML/JS), Cache with model files]
Note: Due to the Same-Origin Policy, models cannot be shared across origins.
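
The caching step can be sketched as a "look in the cache first, download on a miss" helper. This is not WebLLM's actual code; `CacheLike` and `getOrDownload` are hypothetical names, and the interface only mirrors the `match` and `put` methods of the browser's Cache API.

```typescript
// Subset of the browser Cache interface used here (hypothetical, generic so
// the helper can be tried outside a browser).
interface CacheLike<T> {
  match(url: string): Promise<T | undefined>;
  put(url: string, value: T): Promise<void>;
}

// Returns the cached entry for `url`, downloading and storing it on a miss.
async function getOrDownload<T>(
  cache: CacheLike<T>,
  url: string,
  download: (url: string) => Promise<T>,
): Promise<T> {
  const hit = await cache.match(url);
  if (hit !== undefined) return hit; // served locally, no network needed
  const fresh = await download(url);
  await cache.put(url, fresh);
  return fresh;
}

// In a browser, `cache` could be the object returned by caches.open(...) and
// `download` a wrapper around fetch(). A Response body can only be read once,
// so a real implementation would store response.clone().
```

Since the cache belongs to the page's origin, a second website using the same model would run through the download branch again.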

Slide 14

Slide 14 text

WebLLM
Model Size Comparison

Model:Parameters   Size
phi3:3b            2.2 GB
mistral:7b         4.1 GB
llama3:8b          4.7 GB
gemma2:9b          5.4 GB
gemma2:27b         16 GB
llama3:70b         40 GB
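
To get a feeling for what these sizes mean for a first visit, the following sketch estimates download times. The sizes are the ones listed above; the 50 Mbit/s connection speed is purely an assumption, and protocol overhead is ignored.

```typescript
// Model sizes in GB, copied from the comparison above.
const modelSizesGB: Record<string, number> = {
  "phi3:3b": 2.2,
  "mistral:7b": 4.1,
  "llama3:8b": 4.7,
  "gemma2:9b": 5.4,
  "gemma2:27b": 16,
  "llama3:70b": 40,
};

// Seconds needed to download `sizeGB` at `mbitPerSecond`
// (decimal units: 1 GB = 8000 Mbit).
function downloadSeconds(sizeGB: number, mbitPerSecond: number): number {
  return (sizeGB * 8000) / mbitPerSecond;
}

for (const [model, sizeGB] of Object.entries(modelSizesGB)) {
  const minutes = downloadSeconds(sizeGB, 50) / 60; // 50 Mbit/s is an assumption
  console.log(`${model}: about ${minutes.toFixed(1)} min`);
}
```

At the assumed speed, the 4.1 GB model takes roughly 11 minutes and the 40 GB model well over an hour and a half.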

Slide 15

Slide 15 text

WebLLM
Drawbacks
– Models can’t be shared across origins
– Inference is fast, but doesn’t reach full native speed

Slide 16

Slide 16 text

Can we improve on-device inference?

Slide 17

Slide 17 text

Built-in AI
– Initiative by Google Chrome
– Exploratory APIs for local experiments and use case determination
– Downloads AI models into Google Chrome
– Models are shared across origins
– Uses native APIs directly (full performance)
https://developer.chrome.com/docs/ai/built-in

Slide 18

Slide 18 text

Built-in AI APIs
Incubated by the WebML CG
https://webmachinelearning.github.io/incubations/

Slide 19

Slide 19 text

Built-in AI APIs
[Diagram: Operating System, Website (HTML/JS), Browser, Internet, Apple Intelligence, Gemini Nano]

Slide 20

Slide 20 text

Built-in AI APIs
Prompt API
– General-purpose interface for running LLM conversations
– Uses Gemini Nano 2 (3.25 B parameters)
– In Origin Trial for extensions (Chrome 131–136)
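
A hedged sketch of how a page might use the Prompt API with a fallback. The API is experimental and its entry point has been renamed between Chrome versions, so the global name `LanguageModel` and the `availability()`, `create()`, `prompt()` and `destroy()` members are assumptions based on more recent Chrome documentation. `LanguageModelLike` and `promptOnDevice` are hypothetical names.

```typescript
// Structural view of the experimental Prompt API (assumed shape).
type Availability = "unavailable" | "downloadable" | "downloading" | "available";

interface PromptSession {
  prompt(input: string): Promise<string>;
  destroy(): void;
}

interface LanguageModelLike {
  availability(): Promise<Availability>;
  create(): Promise<PromptSession>;
}

// Runs one prompt if a model can be used; otherwise returns null so the
// caller can fall back, for example to a cloud service.
async function promptOnDevice(
  lm: LanguageModelLike | undefined,
  input: string,
): Promise<string | null> {
  if (!lm || (await lm.availability()) === "unavailable") return null;
  const session = await lm.create(); // may trigger a model download
  try {
    return await session.prompt(input);
  } finally {
    session.destroy(); // release the session's resources
  }
}

// Browser usage (assumed global name):
//   promptOnDevice((globalThis as any).LanguageModel, "Write a haiku.")
//     .then(console.log);
```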

Slide 21

Slide 21 text

Built-in AI APIs
Writing Assistance APIs
– Summarizer API: Summarizes text.
– Writer API: Writes text based on a certain prompt.
– Rewriter API: Rewrites a text based on certain criteria.
– Uses Gemini Nano 2 (3.25 B parameters)
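
As an illustration of the task-specific style of these APIs, here is a sketch for the Summarizer API. The `create()` and `summarize()` members as well as the option values `"key-points"` and `"short"` are assumptions taken from Chrome's documentation of the experimental API; `SummarizerFactory` and `summarizeShort` are hypothetical names.

```typescript
// Structural view of the experimental Summarizer API (assumed shape).
interface SummarizerLike {
  summarize(text: string): Promise<string>;
}

interface SummarizerFactory {
  create(options: { type: string; length: string }): Promise<SummarizerLike>;
}

// Creates a summarizer for short key-point summaries and applies it once.
async function summarizeShort(factory: SummarizerFactory, text: string): Promise<string> {
  const summarizer = await factory.create({ type: "key-points", length: "short" });
  return summarizer.summarize(text);
}

// Browser usage (assumed global name):
//   summarizeShort((globalThis as any).Summarizer, articleText).then(console.log);
```

Unlike the Prompt API, no prompt has to be written here; the task is fixed and only its options vary.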

Slide 22

Slide 22 text

Built-in AI APIs
Translator and Language Detector APIs
– Translator API: Translates text from one language into another.
– Language Detector API: Recognizes the language(s) in which a text is written.
– Uses other AI models
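
The two APIs combine naturally: detect the language first, then translate. The sketch below assumes that detection returns a list of `{ detectedLanguage, confidence }` entries and that a translator is created for a fixed language pair, as described in Chrome's documentation of these experimental APIs. All interface and function names are hypothetical.

```typescript
// Structural views of the experimental APIs (assumed shapes).
interface DetectionResult {
  detectedLanguage: string;
  confidence: number;
}

interface DetectorLike {
  detect(text: string): Promise<DetectionResult[]>;
}

interface TranslatorLike {
  translate(text: string): Promise<string>;
}

type TranslatorFactory = (source: string, target: string) => Promise<TranslatorLike>;

// Translates `text` to English; returns null if the language is unclear.
async function translateToEnglish(
  detector: DetectorLike,
  makeTranslator: TranslatorFactory,
  text: string,
  minConfidence = 0.5, // threshold chosen for illustration only
): Promise<string | null> {
  const results = await detector.detect(text);
  let best: DetectionResult | undefined;
  for (const r of results) {
    if (!best || r.confidence > best.confidence) best = r;
  }
  if (!best || best.confidence < minConfidence) return null;
  if (best.detectedLanguage === "en") return text; // nothing to translate
  const translator = await makeTranslator(best.detectedLanguage, "en");
  return translator.translate(text);
}

// Browser usage (assumed global names):
//   const detector = await LanguageDetector.create();
//   const make = (s, t) => Translator.create({ sourceLanguage: s, targetLanguage: t });
```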

Slide 23

Slide 23 text

Built-in AI APIs
TypeScript Definitions

Slide 24

Slide 24 text

Built-in AI Preview Program

Slide 25

Slide 25 text

Built-in AI APIs
https://www.google.com/chrome/canary/
about://flags
– Enables optimization guide on device → EnabledBypassPerfRequirement
– (API) for Gemini Nano → Enabled

Slide 26

Slide 26 text

Use Cases
– Chatbots
– Sentiment analysis
– Data extraction
– Summarization
– Translation
– Characterization
– Proofreading
– Rephrasing

Slide 27

Slide 27 text

Chatbots
DEMO

Slide 28

Slide 28 text

Categorization
DEMO

Slide 29

Slide 29 text

RAG
DEMO
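
The demo itself is not reproduced here. As a rough idea of what retrieval-augmented generation (RAG) involves, the sketch below ranks text chunks by cosine similarity to a query embedding and builds a prompt from the best matches. It assumes the embeddings already exist (how they are computed is left open), and all names and the prompt wording are hypothetical.

```typescript
interface Chunk {
  text: string;
  embedding: number[];
}

// Cosine similarity of two vectors (assumed to have the same length).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Returns the k chunks most similar to the query embedding, best first.
function retrieve(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k);
}

// Puts the retrieved chunks in front of the question.
function buildPrompt(question: string, context: Chunk[]): string {
  const sources = context.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
  return `Answer using only the context below.\n\nContext:\n${sources}\n\nQuestion: ${question}`;
}
```

The resulting prompt would then be passed to a local model, so neither the documents nor the question have to leave the device.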

Slide 30

Slide 30 text

Local AI Models
Alternatives: Ollama
– Local runner for AI models
– Offers a local server a website can connect to → allows sharing models across origins
– Supported on macOS and Linux (Windows in Preview)
https://ollama.ai/
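
A sketch of how a website could call a locally running Ollama server. It assumes Ollama's default address `http://localhost:11434` and its `/api/generate` endpoint with a non-streaming request; the model name in the comment is only an example, and the server has to be configured to accept requests from the website's origin. `FetchLike` and `generateWithOllama` are hypothetical names.

```typescript
// Minimal fetch-compatible signature so the helper can be tried with a stub.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Sends one prompt to a local Ollama server and returns the generated text.
async function generateWithOllama(
  fetchFn: FetchLike,
  model: string,
  prompt: string,
  baseUrl = "http://localhost:11434", // assumed default address
): Promise<string> {
  const res = await fetchFn(`${baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  if (!res.ok) throw new Error(`Ollama request failed: ${res.status}`);
  const data = (await res.json()) as { response?: string };
  return data.response ?? "";
}

// Browser usage (assumed):
//   generateWithOllama(fetch, "llama3", "Why is the sky blue?").then(console.log);
```

Because the model lives in the local server rather than in an origin-bound browser cache, several websites can use the same download.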

Slide 31

Slide 31 text

Local AI Models
Alternatives: Hugging Face Transformers
Pre-trained, specialized, significantly smaller models beyond GenAI
Examples:
– Text generation
– Image classification
– Translation
– Speech recognition
– Image-to-text

Slide 32

Slide 32 text

Local AI Models
Alternatives: Transformers.js
– Pre-trained, specialized, significantly smaller models beyond GenAI
– JavaScript library to run Hugging Face transformers in the browser
– Supports most of the models
https://huggingface.co/collections/Xenova/transformersjs-demos-64f9c4f49c099d93dbc611df
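
A small sketch of how a specialized model could cover the sentiment analysis use case. It assumes the library's `pipeline("sentiment-analysis")` factory, whose result is taken to be a function that returns `{ label, score }` entries such as `POSITIVE`; the package name in the comment, the threshold and the names `TextClassifier` and `isPositive` are assumptions made for illustration.

```typescript
// Shape of a text-classification result (assumed from Transformers.js usage).
interface Classification {
  label: string;
  score: number;
}

type TextClassifier = (text: string) => Promise<Classification[]>;

// True if the classifier reports POSITIVE with at least `threshold` confidence.
async function isPositive(
  classify: TextClassifier,
  text: string,
  threshold = 0.8, // chosen for illustration only
): Promise<boolean> {
  const results = await classify(text);
  return results.some((r) => r.label === "POSITIVE" && r.score >= threshold);
}

// Assumed usage in a browser (may need a type cast):
//   import { pipeline } from "@huggingface/transformers";
//   const classify = await pipeline("sentiment-analysis");
//   console.log(await isPositive(classify, "I love this talk!"));
```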

Slide 33

Slide 33 text

On-device AI Models
Pros & Cons
+ Data does not leave the browser (privacy)
+ High availability (offline support)
+ Low latency
+ Stability (no external API changes)
+ Low cost
– Lower quality
– High system (RAM, GPU) and bandwidth requirements
– Large model size, models cannot always be shared
– Model initialization and inference are relatively slow
– APIs are experimental

Slide 34

Slide 34 text

Summary
– Cloud-based models remain the most powerful models
– Due to their size and high system requirements, local generative AI models are currently of interest mainly for very specific scenarios (e.g., high privacy demands, offline availability)
– Small, specialized models are an interesting alternative (if available)
– Large language models are becoming more compact and efficient
– Vendors are starting to ship AI models with their devices
– Devices are becoming more powerful for running AI tasks
– Experiment with the AI APIs and make your Angular app smarter!

Slide 35

Slide 35 text

Thank you for your kind attention! Christian Liebel @christianliebel [email protected]