Prompt API & WebNN: The AI Revolution Right in Your Browser

More and more developers want to integrate generative AI features into their applications. Until now, this path has almost always led to the cloud, but it doesn't have to be that way. Several promising approaches now exist for running AI models directly on the user's computer. With WebLLM and Chromium's new Prompt API, we can bring Large Language Models to your Angular app: locally and offline-capable. The W3C's Web Neural Network API (WebNN) will grant AI models access to the device's Neural Processing Unit (NPU). The advantages of these approaches are clear: locally executed AI models are available offline, user data does not leave the device, and thanks to open-source models, all of this is free of charge. In this talk, Christian Liebel, Thinktecture's representative at W3C, presents these approaches to making your single-page app smarter. We will discuss use cases and show the advantages and disadvantages of each solution.

Christian Liebel

July 11, 2025

Transcript

  1. Prompt API & WebNN: The AI Revolution Right in Your Browser

    Christian Liebel, @christianliebel, Consultant
  2. Generative AI Cloud Providers: Drawbacks

    – Require a (stable) internet connection
    – Subject to network latency and server availability
    – Data is transferred to the cloud service
    – Require a subscription
  3. Can we run GenAI models locally?
  4. Local AI Inference

    Bring Your Own AI (BYOAI)
    – Libraries
      – WebLLM
      – Transformers.js
    – Frameworks
      – ONNX Runtime
      – TensorFlow.js
    – APIs
      – WebNN
      – Cross-Origin Storage

    Built-in AI (BIAI)
    – Writing Assistance APIs
      – Summarizer API
      – Writer API
      – Rewriter API
    – Translator & Language Detector APIs
    – Prompt API
  5. WebLLM: Storing model files locally

    [Diagram: Internet, Website (HTML/JS), Cache with model files, Hugging Face]

    Note: Due to the Same-Origin Policy, models cannot be shared across origins.
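The caching idea on slide 5 can be sketched as a cache-first loader. This is a minimal sketch under stated assumptions, not WebLLM's actual implementation: `loadModelFile`, `ResponseLike`, `CacheLike` and `FetchLike` are hypothetical names, and the cache and fetch function are passed in as parameters so the logic can also run outside a browser.

```typescript
// Minimal shapes of the browser objects used below, declared locally
// so the sketch type-checks without DOM typings (hypothetical names).
interface ResponseLike {
  ok: boolean;
  clone(): ResponseLike;
}
interface CacheLike {
  match(url: string): Promise<ResponseLike | undefined>;
  put(url: string, response: ResponseLike): Promise<void>;
}
type FetchLike = (url: string) => Promise<ResponseLike>;

// Hypothetical helper: return the cached model file if present,
// otherwise download it once and store a copy for later visits.
async function loadModelFile(
  url: string,
  cache: CacheLike,
  fetchFn: FetchLike,
): Promise<{ response: ResponseLike; fromCache: boolean }> {
  const cached = await cache.match(url);
  if (cached) {
    return { response: cached, fromCache: true };
  }
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Download failed for ${url}`);
  }
  // A response body can only be read once, so a clone goes into the cache.
  await cache.put(url, response.clone());
  return { response, fromCache: false };
}
```

In a browser, the second and third arguments would typically be the object returned by `caches.open(...)` (with a cache name of your choice) and the global `fetch`. Cache storage is scoped to the page's origin, which is the reason given in the note on the slide: a second website cannot reuse files that the first one downloaded.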
  6. WebLLM: Model Size Comparison

    Model:Parameters    Size
    phi3:3b             2.2 GB
    mistral:7b          4.1 GB
    llama3:8b           4.7 GB
    gemma2:9b           5.4 GB
    gemma2:27b          16 GB
    llama3:70b          40 GB
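To get a feeling for what these sizes mean for a first visit, the following sketch estimates download times from the table. The sizes are copied from the slide; the link speed of 100 Mbit/s, the decimal conversion (1 GB = 8000 Mbit) and the helper name `downloadMinutes` are assumptions for illustration, and protocol overhead is ignored.

```typescript
// Download sizes in gigabytes, copied from the table on the slide.
const modelSizesGB: Record<string, number> = {
  "phi3:3b": 2.2,
  "mistral:7b": 4.1,
  "llama3:8b": 4.7,
  "gemma2:9b": 5.4,
  "gemma2:27b": 16,
  "llama3:70b": 40,
};

// Estimated download time in minutes for a given link speed in Mbit/s.
// Assumes 1 GB = 8000 Mbit (decimal units) and no protocol overhead.
function downloadMinutes(sizeGB: number, linkMbitPerSecond: number): number {
  const megabits = sizeGB * 8000;
  return megabits / linkMbitPerSecond / 60;
}

// Print an estimate for every model at an assumed 100 Mbit/s.
for (const model of Object.keys(modelSizesGB)) {
  const minutes = downloadMinutes(modelSizesGB[model], 100);
  console.log(`${model}: ~${minutes.toFixed(1)} min at 100 Mbit/s`);
}
```

Under these assumptions, the 40 GB model takes a little under an hour to download, while the 2.2 GB model takes about three minutes.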
  7. WebNN

    – Grants web apps access to the device’s CPU, GPU and Neural Processing Unit (NPU)
    – In specification by the WebML Working Group at W3C
    – Implementation in progress in Chromium (behind a flag)
    – Better performance for specific workloads

    Source: https://webmachinelearning.github.io/webnn-intro/

    DEMO
  8. WebNN: about://flags

    – Enables WebNN API → Enabled
    – Enables experimental WebNN API features → Enabled
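Because WebNN is still behind a flag, a page should check for the API before using it. The sketch below tests for the `navigator.ml` entry point defined by the WebNN specification; the helper name `hasWebNN` is hypothetical, and the navigator object is passed in so the check can also run outside a browser.

```typescript
// Hypothetical helper: reports whether a navigator-like object exposes
// the WebNN entry point (`navigator.ml`).
function hasWebNN(nav: unknown): boolean {
  return typeof nav === "object" && nav !== null && "ml" in nav;
}

// In a browser this inspects the real navigator; in environments
// without WebNN support it logs `false`.
const currentNavigator: unknown = (globalThis as any).navigator;
console.log("WebNN available:", hasWebNN(currentNavigator));
```

A `true` result only says that the API is exposed. Whether an NPU or GPU is actually used depends on the device and the browser's implementation.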
  9. WebNN: Drawbacks

    – Models can’t be shared across origins
    – Inference is fast, but doesn’t reach full native speed
  10. Built-in AI

    – Initiative by Google Chrome
    – Exploratory APIs for local experiments and use case determination
    – Downloads AI models into Google Chrome
    – Models are shared across origins
    – Uses native APIs directly (full performance)

    https://developer.chrome.com/docs/ai/built-in
  11. Built-in AI APIs

    Incubated by the WebML CG

    https://webmachinelearning.github.io/incubations/

    DEMO
  12. Built-in AI APIs

    [Diagram: Operating System, Website (HTML/JS), Browser, Internet, Apple Intelligence, Gemini Nano]
  13. Built-in AI APIs

    – about://on-device-internals
    – https://www.google.com/chrome/canary/
    – about://flags
      – Enables optimization guide on device → EnabledBypassPerfRequirement
      – (API) for Gemini Nano → Enabled
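With the flags set, a page can try the Prompt API and fall back gracefully where it is missing. The sketch below is not taken from the slides: the global `LanguageModel` object and its `availability()`, `create()`, `prompt()` and `destroy()` members are assumptions based on Chrome's built-in AI documentation, and the API is experimental, so its shape may differ between browser versions. The names `promptLocally`, `PromptSession` and `PromptApi` are hypothetical.

```typescript
// Local description of the parts of the experimental Prompt API used here.
// This shape is an assumption based on Chrome's documentation and may change.
interface PromptSession {
  prompt(input: string): Promise<string>;
  destroy(): void;
}
interface PromptApi {
  availability(): Promise<string>;
  create(): Promise<PromptSession>;
}

// Hypothetical helper: answer a prompt on-device, or return null when no
// local model can be used so the caller can fall back to something else.
async function promptLocally(
  text: string,
  api: PromptApi | undefined = (globalThis as any).LanguageModel,
): Promise<string | null> {
  if (!api) {
    return null; // API not exposed (other browser, flag disabled)
  }
  const status = await api.availability();
  if (status === "unavailable") {
    return null; // the device cannot run the built-in model
  }
  // If the model still has to be downloaded, create() is expected to
  // trigger that download; Chrome may require a user gesture for this.
  const session = await api.create();
  try {
    return await session.prompt(text);
  } finally {
    session.destroy(); // release the resources held by the session
  }
}
```

A caller could write `const answer = await promptLocally("Summarize: ...")` and use a cloud service or a rule-based fallback whenever the result is `null`. Passing the API object as a parameter keeps the helper testable without a browser.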
  14. Why should you care?

    Rule-based algorithms are limited in their capabilities.
  15. Why should you care?

    Use AI to implement use cases that are difficult or impossible to implement using rule-based algorithms.
  16. On-device AI Models: Pros & Cons

    + Data does not leave the browser (privacy)
    + High availability (offline support)
    + Low latency
    + Stability (no external API changes)
    + Low cost
    – Lower response quality
    – Less capable
    – High system (RAM, GPU) and bandwidth requirements
    – Large model size, models cannot always be shared
    – Model initialization and inference are relatively slow
    – APIs are experimental