Built-in AI: The AI Revolution Right in the Browser

More and more developers want to integrate generative AI features into their applications. Until now, this path has almost always led to the cloud, but it doesn't have to! Platform and browser vendors have started shipping AI models directly with their operating systems: Apple Intelligence runs on the user's own device, and Google ships its Gemini Nano model on high-end Android smartphones. Microsoft and Chromium are currently implementing the Built-in AI interfaces in Chrome and Edge, which provide access to a locally installed large language model (LLM). The advantages are obvious: user data never leaves the device, everything works with a weak internet connection or none at all, and no extra model has to be downloaded because the one already present locally is used. In this webinar, Christian Liebel, Google GDE and Microsoft MVP, shows which use cases the Built-in AI APIs cover and how you can make your web application smarter with Built-in AI.

Christian Liebel

December 19, 2024

Transcript

  1. Generative AI everywhere

    Built-in AI: The AI Revolution Right in the Browser
    Source: https://www.apple.com/chde/apple-intelligence/
  2. Overview (Generative AI)

    – Text: OpenAI GPT, Mistral, …
    – Audio/Music: Musico, Soundraw, …
    – Images: DALL·E, Firefly, …
    – Video: Sora, Runway, …
    – Speech: Whisper, tortoise-tts, …
  3. Drawbacks (Generative AI)

    Cloud providers:
    – Require a (stable) internet connection
    – Subject to network latency and server availability
    – Data is transferred to the cloud service
    – Require a subscription
  4. WebAssembly (Wasm)

    – Bytecode for the web
    – Compile target for arbitrary languages
    – Can be faster than JavaScript
    – WebLLM uses a model-specific Wasm library to accelerate model computations
  5. WebGPU

    – Grants low-level access to the Graphics Processing Unit (GPU)
    – Near-native performance for machine learning applications
    – Supported by Chromium-based browsers on Windows and macOS from version 113
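Because WebGPU is only available in some browsers, a page would typically check for it before loading a large model. The following is a minimal sketch of such a check, not code from the talk. It relies on `navigator.gpu` and `requestAdapter()` from the WebGPU API; the helper name `hasWebGpu` and the small structural types are assumptions made so the snippet compiles without WebGPU typings.

```typescript
// Minimal structural types so the sketch compiles without WebGPU typings.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Resolves to true only if WebGPU is exposed *and* an adapter is handed out.
// `nav` is passed in (hypothetical design choice) so the check can also be
// exercised outside a browser.
async function hasWebGpu(nav: NavigatorLike | undefined): Promise<boolean> {
  if (!nav || !nav.gpu) {
    return false; // API not exposed at all
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null; // null means no suitable GPU was found
  } catch {
    return false; // treat unexpected failures as "not available"
  }
}
```

In a page, the function would be called with the global `navigator` object. A `false` result is the signal to fall back to a slower path or to a cloud model.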
  6. WebNN

    – Grants web apps access to the device’s CPU, GPU and Neural Processing Unit (NPU)
    – In specification by the WebML Working Group at W3C
    – Implementation in progress in Chromium (behind a flag)
    – Better performance for specific workloads
    Source: https://webmachinelearning.github.io/webnn-intro/
    DEMO
  7. Storing model files locally (WebLLM)

    [Diagram labels: Website (HTML/JS), Cache with model files, Internet, Hugging Face]
    Note: Due to the Same-Origin Policy, models cannot be shared across origins.
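The caching idea on this slide can be illustrated with a cache-first lookup: use the stored model file if it exists, otherwise download it once and keep a copy. This is a minimal sketch under assumptions, not WebLLM's actual implementation. The interfaces below are hypothetical stand-ins shaped like the browser's `Cache`, `Response`, and `fetch`, and `loadModelFile` is an illustrative name.

```typescript
// Minimal structural stand-ins for the browser's Response, Cache, and fetch,
// so the sketch type-checks and runs outside a browser as well.
interface ResponseLike {
  ok: boolean;
  clone(): ResponseLike;
}
interface CacheLike {
  match(url: string): Promise<ResponseLike | undefined>;
  put(url: string, response: ResponseLike): Promise<void>;
}
type FetchLike = (url: string) => Promise<ResponseLike>;

// Cache-first lookup: serve the model file from this origin's cache if
// present, otherwise download it once and store a copy for later visits.
async function loadModelFile(
  url: string,
  cache: CacheLike,
  fetchFn: FetchLike,
): Promise<ResponseLike> {
  const cached = await cache.match(url);
  if (cached) {
    return cached; // no network traffic needed
  }
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Download failed for ${url}`);
  }
  // A response body can be read only once, so a clone goes into the cache.
  await cache.put(url, response.clone());
  return response;
}
```

In a browser, `cache` could come from `caches.open(...)` and `fetchFn` could be the global `fetch`. Since cache storage is scoped per origin, every site that uses the same model ends up with its own copy, which is the limitation the note on the slide refers to.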
  8. Model Size Comparison (WebLLM)

    Model:Parameters    Size
    phi3:3b             2.2 GB
    mistral:7b          4.1 GB
    llama3:8b           4.7 GB
    gemma2:9b           5.4 GB
    gemma2:27b          16 GB
    llama3:70b          40 GB
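The sizes in this comparison can be related to the parameter counts with simple arithmetic: file size in bits divided by the number of parameters. The sketch below assumes that 1 GB means 10^9 bytes and that the number in each tag (for example `8b`) is the parameter count in billions. Both are assumptions, and the nominal counts are rounded.

```typescript
// Rough bits stored per parameter. Assumes 1 GB = 10^9 bytes and that the
// tag's number is the parameter count in billions (both are assumptions).
function bitsPerParameter(sizeGB: number, paramsBillions: number): number {
  const bytes = sizeGB * 1e9;
  const params = paramsBillions * 1e9;
  return (bytes * 8) / params;
}

interface ModelEntry {
  name: string;
  sizeGB: number;
  paramsBillions: number;
}

// Values copied from the slide's comparison.
const models: ModelEntry[] = [
  { name: "phi3:3b", sizeGB: 2.2, paramsBillions: 3 },
  { name: "mistral:7b", sizeGB: 4.1, paramsBillions: 7 },
  { name: "llama3:8b", sizeGB: 4.7, paramsBillions: 8 },
  { name: "gemma2:9b", sizeGB: 5.4, paramsBillions: 9 },
  { name: "gemma2:27b", sizeGB: 16, paramsBillions: 27 },
  { name: "llama3:70b", sizeGB: 40, paramsBillions: 70 },
];

for (const m of models) {
  const bits = bitsPerParameter(m.sizeGB, m.paramsBillions);
  console.log(`${m.name}: ${bits.toFixed(1)} bits per parameter`);
}
```

Under these assumptions every entry lands at roughly 4.6 to 5.9 bits per parameter. That is far below the 16 bits of a half-precision float and suggests the listed files hold quantized weights. This is an inference from the arithmetic, not a statement from the slides.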
  9. Drawbacks (WebLLM)

    – Models can’t be shared across origins
    – Inference is fast, but doesn’t reach full native speed
  10. Built-in AI

    – Initiative by Google Chrome
    – Exploratory APIs for local experiments and use case determination
    – Downloads AI models into Google Chrome
    – Models are shared across origins
    – Uses native APIs directly (full performance)
    https://developer.chrome.com/docs/ai/built-in
  11. Built-in AI APIs

    Incubated by the WebML CG
    https://webmachinelearning.github.io/incubations/
  12. Built-in AI APIs

    [Diagram labels: Operating System, Browser, Website (HTML/JS), Internet, Apple Intelligence, Gemini Nano]
  13. Prompt API (Built-in AI APIs)

    – General-purpose interface for running LLM conversations
    – Uses Gemini Nano 2 (3.25 B parameters)
    – In Origin Trial for extensions (Chrome 131–136)
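To give an impression of what a call to the Prompt API might look like, here is a hedged sketch. The API is experimental and its entry point has changed between Chrome versions. The shape assumed here (an object with `availability()` and `create()`, and a session with `prompt()` and `destroy()`) follows more recent Chrome documentation and may not match the origin-trial builds named on the slide. The interfaces and the helper `askLocalModel` are illustrative, not official typings.

```typescript
// Structural types for the *assumed* API shape; these are not official typings.
interface PromptSession {
  prompt(input: string): Promise<string>;
  destroy?(): void;
}
interface LanguageModelLike {
  availability(): Promise<string>;
  create(): Promise<PromptSession>;
}

// Sends one prompt to the on-device model.
// Returns null when no local model can be used, so callers can fall back.
async function askLocalModel(
  model: LanguageModelLike | undefined,
  question: string,
): Promise<string | null> {
  if (!model) {
    return null; // API not exposed in this browser
  }
  const status = await model.availability();
  if (status === "unavailable") {
    return null; // this device or browser cannot run the model
  }
  const session = await model.create(); // may wait for a model download
  try {
    return await session.prompt(question);
  } finally {
    session.destroy?.(); // release the session if the method exists
  }
}
```

Passing the model object in as a parameter keeps the sketch independent of where a given Chrome version exposes it. The `null` result lets the calling code decide whether to hide the feature or fall back to a cloud model.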
  14. Writing Assistance APIs (Built-in AI APIs)

    – Summarizer API: Summarizes text.
    – Writer API: Writes text based on a certain prompt.
    – Rewriter API: Rewrites a text based on certain criteria.
    – Uses Gemini Nano 2 (3.25 B parameters)
  15. Translator and Language Detector APIs (Built-in AI APIs)

    – Translator API: Translates text from one language into another.
    – Language Detector API: Recognizes the language(s) in which a text is written.
    – Uses other AI models
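The two APIs on this slide combine naturally: detect the language first, then translate only if needed. The sketch below shows that pipeline with hypothetical interfaces. It assumes that detection yields a list of `{ detectedLanguage, confidence }` candidates and that a translator is created for a fixed source and target language, as in recent Chrome documentation. The real API surface is experimental and may differ.

```typescript
// Assumed result and object shapes; these are illustrative, not official typings.
interface DetectionResult {
  detectedLanguage: string;
  confidence: number;
}
interface DetectorLike {
  detect(text: string): Promise<DetectionResult[]>;
}
interface TranslatorLike {
  translate(text: string): Promise<string>;
}
type TranslatorFactory = (
  sourceLanguage: string,
  targetLanguage: string,
) => Promise<TranslatorLike>;

// Detects the most likely source language and translates into `target`.
// Text that is already in the target language is returned unchanged.
async function translateTo(
  text: string,
  target: string,
  detector: DetectorLike,
  createTranslator: TranslatorFactory,
): Promise<string> {
  const candidates = await detector.detect(text);
  let best: DetectionResult | undefined;
  for (const c of candidates) {
    if (!best || c.confidence > best.confidence) {
      best = c; // keep the candidate with the highest confidence
    }
  }
  if (!best) {
    throw new Error("Language could not be detected");
  }
  if (best.detectedLanguage === target) {
    return text;
  }
  const translator = await createTranslator(best.detectedLanguage, target);
  return translator.translate(text);
}
```

Skipping the translator for text that is already in the target language avoids creating a translator, and possibly loading a language pair, that is never needed.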
  16. Built-in AI APIs

    – https://www.google.com/chrome/canary/
    – about://flags
    – Enables optimization guide on device → EnabledBypassPerfRequirement
    – (API) for Gemini Nano → Enabled
  17. Use Cases

    – Chatbots
    – Sentiment analysis
    – Data extraction
    – Summarization
    – Translation
    – Characterization
    – Proofreading
    – Rephrasing
  18. Alternatives: Ollama (Local AI Models)

    – Local runner for AI models
    – Offers a local server a website can connect to → allows sharing models across origins
    – Supported on macOS and Linux (Windows in Preview)
    https://ollama.ai/
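Connecting to Ollama's local server from a website comes down to an HTTP request. The sketch below builds a request for Ollama's `/api/generate` endpoint on its default port 11434 and reads the `response` field of the JSON reply. The helper names, the injected `fetchFn` parameter, and its structural type are assumptions made for illustration.

```typescript
interface GenerateRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Structural stand-in for fetch, so the sketch also runs outside a browser.
type FetchJson = (
  url: string,
  init: GenerateRequest["init"],
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Builds a non-streaming request for Ollama's generate endpoint.
function buildGenerateRequest(
  model: string,
  prompt: string,
  baseUrl = "http://localhost:11434", // Ollama's default address
): GenerateRequest {
  return {
    url: `${baseUrl}/api/generate`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // stream: false asks for one JSON object instead of a stream of chunks
      body: JSON.stringify({ model, prompt, stream: false }),
    },
  };
}

// Sends the prompt and returns the generated text.
async function generate(
  model: string,
  prompt: string,
  fetchFn: FetchJson,
): Promise<string> {
  const { url, init } = buildGenerateRequest(model, prompt);
  const res = await fetchFn(url, init);
  if (!res.ok) {
    throw new Error(`Ollama returned HTTP ${res.status}`);
  }
  const data = (await res.json()) as { response: string };
  return data.response;
}
```

In a page, `fetchFn` could be the global `fetch`, and `model` a tag the user has already pulled, such as `llama3`. Depending on the Ollama configuration, the page's origin may first have to be allowed through the `OLLAMA_ORIGINS` setting before the browser permits the request.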
  19. Alternatives: Hugging Face Transformers (Local AI Models)

    Pre-trained, specialized, significantly smaller models beyond GenAI
    Examples:
    – Text generation
    – Image classification
    – Translation
    – Speech recognition
    – Image-to-text
  20. Alternatives: Transformers.js (Local AI Models)

    – Pre-trained, specialized, significantly smaller models beyond GenAI
    – JavaScript library to run Hugging Face transformers in the browser
    – Supports most of the models
    https://huggingface.co/collections/Xenova/transformersjs-demos-64f9c4f49c099d93dbc611df
  21. Pros & Cons (On-device AI Models)

    + Data does not leave the browser (privacy)
    + High availability (offline support)
    + Low latency
    + Stability (no external API changes)
    + Low cost
    – Lower quality
    – High system (RAM, GPU) and bandwidth requirements
    – Large model size, models cannot always be shared
    – Model initialization and inference are relatively slow
    – APIs are experimental
  22. Summary

    – Cloud-based models remain the most powerful models
    – Due to their size and high system requirements, local generative AI models are currently of interest mainly for very specific scenarios (e.g., high privacy demands, offline availability)
    – Small, specialized models are an interesting alternative (if available)
    – Large language models are becoming more compact and efficient
    – Vendors are starting to ship AI models with their devices
    – Devices are becoming more powerful for running AI tasks
    – Experiment with the AI APIs and make your Angular app smarter!