AI in the browser: Smarter Angular apps with WebGPU and WebNN

In this session, we will explore the integration of Generative AI functions into Angular applications using WebGPU API and Web Neural Network (WebNN) API. These APIs enable the execution of Large Language Models (LLM) and Stable Diffusion models on the user’s device. The primary benefits of local execution include offline availability and data security, provided that the user’s device has sufficient power to run the AI models. During the presentation, we will discuss different use cases and compare the advantages and disadvantages of each solution. Join us to learn how to make your Angular app smarter.

Christian Liebel

October 10, 2024

Transcript

  1. AI in the Browser: Smarter Angular apps with WebGPU and WebNN. Christian Liebel (@christianliebel), Consultant
  2. Generative AI everywhere. Source: https://www.apple.com/chde/apple-intelligence/
  3. Generative AI: Overview. Text: OpenAI GPT, Mistral, … Speech: OpenAI Whisper, tortoise-tts, … Images: DALL·E, Stable Diffusion, … Audio/Music: Musico, Soundraw, …
  5. Generative AI Cloud Providers: Examples
  6. Generative AI Cloud Providers: Drawbacks. Require a (stable) internet connection; subject to network latency and server availability; data is transferred to the cloud service; require a subscription.
  7. Can we run GenAI models locally?
  8. Large Language Models. Large: trained on lots of data. Language: process and generate text. Models: programs/neural networks. Examples: GPT (ChatGPT, Bing Chat, …); Gemini, Gemma (Google); LLaMa (Meta AI).
  9. Large Language Models: Size Comparison (model:parameters, size). phi3:3b, 2.2 GB; mistral:7b, 4.1 GB; llama3:8b, 4.7 GB; gemma2:9b, 5.4 GB; gemma2:27b, 16 GB; llama3:70b, 40 GB.
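The size comparison can be turned into a quick back-of-the-envelope calculation: dividing the download size by the parameter count gives the average number of bits stored per parameter. The sketch below is only illustrative. It assumes the parameter counts can be read off the model tags (for example, 7 billion for `mistral:7b`), which is a nominal figure and may differ from the real count; the `ModelSize` type and both function names are made up for this example.

```typescript
// Nominal parameter counts (in billions) are read off the model tags on the
// slide (e.g. "mistral:7b"); real counts can differ, so treat them as assumptions.
interface ModelSize {
  name: string;
  paramsBillions: number; // nominal, taken from the tag
  sizeGB: number; // download size as listed on the slide
}

const models: ModelSize[] = [
  { name: "phi3:3b", paramsBillions: 3, sizeGB: 2.2 },
  { name: "mistral:7b", paramsBillions: 7, sizeGB: 4.1 },
  { name: "llama3:8b", paramsBillions: 8, sizeGB: 4.7 },
  { name: "gemma2:9b", paramsBillions: 9, sizeGB: 5.4 },
  { name: "gemma2:27b", paramsBillions: 27, sizeGB: 16 },
  { name: "llama3:70b", paramsBillions: 70, sizeGB: 40 },
];

// Average storage per parameter in bits: (size in bytes * 8) / parameter count.
// The factors of 10^9 (GB and billions of parameters) cancel out.
function bitsPerParameter(m: ModelSize): number {
  return (m.sizeGB * 8) / m.paramsBillions;
}

// Rough file size in GB if every parameter were stored with `bits` bits.
function estimateSizeGB(paramsBillions: number, bits: number): number {
  return (paramsBillions * bits) / 8;
}

for (const m of models) {
  console.log(`${m.name}: ~${bitsPerParameter(m).toFixed(1)} bits/parameter`);
}
```

With these nominal counts, every entry lands between roughly 4.5 and 6 bits per parameter, which is consistent with quantized weights rather than 16-bit ones: `estimateSizeGB(7, 16)` gives 14 GB for a 7-billion-parameter model stored at 16 bits.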
  10. Large Language Models: Impact on Software Architecture. Prompts serve as the universal interface for users and developers. Paradigm shift: natural language becomes a first-class citizen. Caveats: non-determinism, hallucinations, prompt injection.
  11. Cache API: Storing model files locally. Diagram labels: Internet, Website (HTML/JS), Cache with model files, Hugging Face. Note: Due to the Same-Origin Policy, models cannot be shared across origins.
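A cache-first download of a model file could look roughly like the sketch below. It is not taken from the talk: the function name `loadModelFile`, the cache name `model-files`, and the small `...Like` interfaces are assumptions made so the example compiles without DOM typings. In a browser, the global `caches` object and `fetch` fit these shapes.

```typescript
// Minimal structural types so the sketch compiles without DOM typings;
// in a browser, `caches` and `fetch` satisfy them.
interface ResponseLike {
  ok: boolean;
  clone(): ResponseLike;
}
interface CacheLike {
  match(url: string): Promise<ResponseLike | undefined>;
  put(url: string, response: ResponseLike): Promise<void>;
}
interface CacheStorageLike {
  open(name: string): Promise<CacheLike>;
}
type FetchLike = (url: string) => Promise<ResponseLike>;

// Returns the cached copy of a model file if present; otherwise downloads it,
// stores a clone in the cache, and returns the fresh response.
async function loadModelFile(
  url: string,
  storage: CacheStorageLike,
  fetchFn: FetchLike,
  cacheName = "model-files", // hypothetical cache name
): Promise<ResponseLike> {
  const cache = await storage.open(cacheName);
  const cached = await cache.match(url);
  if (cached) return cached;

  const response = await fetchFn(url);
  if (!response.ok) throw new Error(`Download failed: ${url}`);
  // A response body can only be read once, so a clone goes into the cache.
  await cache.put(url, response.clone());
  return response;
}
```

In a page this might be called as `loadModelFile(url, caches, (u) => fetch(u))`. Because the cache belongs to the page's origin, a second site loading the same model would download and store its own copy.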
  12. WebAssembly (Wasm): bytecode for the web; compile target for arbitrary languages; can be faster than JavaScript; WebLLM uses a model-specific Wasm library to accelerate model computations.
  13. WebGPU: grants low-level access to the Graphics Processing Unit (GPU); near-native performance for machine learning applications; supported by Chromium-based browsers on Windows and macOS from version 113.
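Since WebGPU is not available everywhere, an app would typically check for it before loading a multi-gigabyte model. A minimal sketch, assuming the standard `navigator.gpu.requestAdapter()` entry point; the `GpuLike` and `NavigatorLike` interfaces and the function name `hasWebGpu` are hypothetical and only exist so the example compiles without WebGPU typings.

```typescript
// Minimal shape of the parts of `navigator` used here.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Resolves to true only if the API is exposed AND an adapter can be obtained.
async function hasWebGpu(nav: NavigatorLike): Promise<boolean> {
  if (!nav.gpu) return false; // API not exposed by this browser
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null; // null means no suitable GPU is available
  } catch {
    return false; // treat unexpected failures as "not available"
  }
}
```

In an Angular app, a check like `await hasWebGpu(navigator as NavigatorLike)` could run once at startup, for example inside a service, to decide whether the local-model feature is offered at all.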
  14. WebNN: grants web apps access to the device’s CPU, GPU and Neural Processing Unit (NPU); in specification by the WebML Working Group at W3C; implementation in progress in Chromium (behind a flag); even better performance compared to WebGPU. Source: https://webmachinelearning.github.io/webnn-intro/ (Demo)
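Because WebNN is still behind a flag, an app that wants to benefit from it needs a fallback path. The sketch below shows one possible selection order: WebNN first, then WebGPU, then plain WebAssembly. It assumes the entry points `navigator.ml` (WebNN) and `navigator.gpu` (WebGPU); the `Backend` type, the function names and the preference order itself are illustrative assumptions based on the performance claims in this deck, not something the talk prescribes.

```typescript
type Backend = "webnn" | "webgpu" | "wasm";

interface Capabilities {
  webnn: boolean;
  webgpu: boolean;
}

// Pure feature detection: only checks whether the entry points exist,
// not whether a context or adapter can actually be created.
function detectCapabilities(nav: { ml?: unknown; gpu?: unknown }): Capabilities {
  return {
    webnn: nav.ml !== undefined,
    webgpu: nav.gpu !== undefined,
  };
}

// Assumed preference order: WebNN, then WebGPU, then WebAssembly on the CPU.
function chooseBackend(caps: Capabilities): Backend {
  if (caps.webnn) return "webnn";
  if (caps.webgpu) return "webgpu";
  return "wasm";
}
```

Keeping detection and selection separate makes the policy easy to test and to change, for instance if measurements on target devices suggest a different order.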
  15. WebNN: Near-native inference performance. Source: Intel. Browser: Chrome Canary 118.0.5943.0, DUT: Dell/Linux/i7-1260P, single p-core, Workloads: MediaPipe solution models (FP32, batch=1)
  16. Prompt API. Diagram labels: Operating System, Website (HTML/JS), Browser, Internet, Apple Intelligence, Gemini Nano.
  17. Prompt API: part of Chrome’s Built-In AI initiative; exploratory API for local experiments and use case determination; downloads Gemini Nano into Google Chrome; model is shared across origins; uses native APIs directly; related APIs: Translation API, Writing Assistance APIs. https://developer.chrome.com/docs/ai/built-in
  18. Prompt API: Demo: Smart Form Filler
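The transcript does not show how the form filler demo is implemented. One plausible approach is to ask the model for a JSON object with one key per form field and parse the reply defensively. The sketch below illustrates that idea only; every name in it (`PromptFn`, `buildPrompt`, `parseFields`, `fillForm`) is hypothetical, and the model call is abstracted as a plain function because the Prompt API surface is experimental and has changed between Chrome versions.

```typescript
// The model call is abstracted so the sketch does not depend on any
// specific (and still experimental) browser API.
type PromptFn = (prompt: string) => Promise<string>;

function buildPrompt(fieldNames: string[], userText: string): string {
  return [
    "Extract the following fields from the text and answer with a JSON object only.",
    `Fields: ${fieldNames.join(", ")}`,
    "Use null for fields that are not mentioned.",
    `Text: ${userText}`,
  ].join("\n");
}

// Models are non-deterministic, so the reply is parsed defensively: take the
// span from the first "{" to the last "}", keep only known string fields.
function parseFields(reply: string, fieldNames: string[]): Record<string, string | null> {
  const result: Record<string, string | null> = {};
  for (const name of fieldNames) result[name] = null;

  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) return result;

  try {
    const parsed = JSON.parse(reply.slice(start, end + 1)) as Record<string, unknown>;
    for (const name of fieldNames) {
      const value = parsed[name];
      if (typeof value === "string") result[name] = value;
    }
  } catch {
    // Invalid JSON: keep all fields empty rather than throwing.
  }
  return result;
}

async function fillForm(
  fieldNames: string[],
  userText: string,
  promptFn: PromptFn,
): Promise<Record<string, string | null>> {
  const reply = await promptFn(buildPrompt(fieldNames, userText));
  return parseFields(reply, fieldNames);
}
```

In an Angular reactive form, the returned object could then be handed to `FormGroup.patchValue`. Restricting the result to known field names also limits what a prompt injection in the pasted text can write into the form.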
  19. Performance: Comparison (tokens/sec). WebLLM (Mistral-7b, M1): 22.98; WebLLM (Mistral-7b, M3): 33.96; OpenAI (GPT-4): 19.08; Azure OpenAI (GPT-4): 38.75; Groq (Mixtral-8x7b): 564.63. Sources: WebLLM/Groq: own tests (23.03.2024); OpenAI/Azure OpenAI: https://mcplusa.com/comparing-performance-of-openai-gpt-4-and-microsoft-azure-gpt-4/ (31.08.2023)
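A tokens-per-second figure like the ones in this comparison can be obtained by timing a streamed response. The sketch below is a generic helper, not the measurement code behind the slide: the function name `measureTokensPerSecond` is hypothetical, and it assumes the model output arrives as an async iterable with one item per token. Streaming APIs often deliver chunks that do not map one-to-one to tokens, so the result is only an approximation in that case.

```typescript
// Counts the items of a streamed response and divides by the elapsed time.
// `now` is injectable so the helper can be tested with a fake clock.
async function measureTokensPerSecond(
  tokens: AsyncIterable<string>,
  now: () => number = () => Date.now(),
): Promise<{ count: number; tokensPerSecond: number }> {
  const start = now();
  let count = 0;
  for await (const _token of tokens) {
    count++;
  }
  const seconds = (now() - start) / 1000;
  // Guard against division by zero for streams that finish instantly.
  return { count, tokensPerSecond: seconds > 0 ? count / seconds : 0 };
}
```

The timer starts before the first token arrives, so the result includes the time to first token. For a fair comparison between local and cloud models, the same prompt and output length should be used for every run.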
  20. Stable Diffusion: open-source text-to-image model; generates 512x512px images from a prompt; WebSD: special version of Stable Diffusion for the web (2 GB in size); no npm package this time. Prompt: A guinea pig eating a watermelon
  21. Local AI Models: Pros & Cons. Pros: data does not leave the browser (privacy); high availability (offline support); low latency; stability (no external API changes); low cost. Cons: lower quality; high system (RAM, GPU) and bandwidth requirements; large model size, models cannot always be shared; model initialization and inference are relatively slow; APIs are experimental.
  22. Alternatives: Transformers.js. Pre-trained, specialized, significantly smaller models beyond GenAI; JavaScript library to run Hugging Face transformers in the browser; supports most of the models. https://xenova.github.io/transformers.js/
  23. Summary: Cloud-based models remain the most powerful models; due to their size and high system requirements, local generative AI models are currently rather interesting for very special scenarios (e.g., high privacy demands, offline availability); small, specialized models are an interesting alternative (if available); large language models are becoming more compact and efficient; vendors start shipping AI models with their devices; devices are becoming more powerful for running AI tasks. Experiment with the AI APIs and make your Angular App smarter!