Generative-AI-Power im Web: Progressive Web Apps smarter machen

Christian Liebel
@christianliebel
Consultant

Hello, it’s me.
Christian Liebel
X: @christianliebel
Bluesky: @christianliebel.com
Email: christian.liebel@thinktecture.com
Angular, PWA & Generative AI
Slides: thinktecture.com/christian-liebel

DEMO

Generative AI everywhere
Source: https://www.apple.com/chde/apple-intelligence/

Single-Page Applications
Run locally on the user’s system
[Diagram: web server (HTML, JS, CSS, assets); web browser running the SPA (client logic, HTML/CSS views); server logic (web API, push service, databases); connections via HTTPS and WebSockets]

Progressive Web Apps
Make SPAs offline-capable
[Diagram: Website (HTML/JS), Service Worker, Cache, Internet; the service worker handles fetch]

Generative AI
Overview
– Text: OpenAI GPT, Mistral, …
– Speech: OpenAI Whisper, tortoise-tts, …
– Images: DALL·E, Stable Diffusion, …
– Audio/Music: Musico, Soundraw, …

Generative AI Cloud Providers
Examples

Generative AI Cloud Providers
Drawbacks
– Require a (stable) internet connection
– Subject to network latency and server availability
– Data is transferred to the cloud service
– Require a subscription

Can we run GenAI models locally?

Large Language Models
– Large: Trained on lots of data
– Language: Process and generate text
– Models: Programs/neural networks
Examples:
– GPT (ChatGPT, Bing Chat, …)
– Gemini, Gemma (Google)
– LLaMa (Meta AI)

Large Language Models
– Token: A meaningful unit of text (e.g., a word, a part of a word, a character).
– Context window: The maximum number of tokens the model can process.
– Parameters/weights: Internal variables learned during training, used to make predictions.

Large Language Models
Prompts serve as the universal interface
– Unstructured text conveying specific semantics
Paradigm shift in software architecture
– Natural language becomes a first-class citizen
Caveats
– Non-determinism and hallucination, prompt injections

Large Language Models
Size Comparison
Model:Parameters    Size
phi3:3b             2.2 GB
mistral:7b          4.1 GB
llama3:8b           4.7 GB
gemma2:9b           5.4 GB
gemma2:27b          16 GB
llama3:70b          40 GB
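
The sizes listed above are roughly what one would expect if each weight were stored at about 4 to 5 bits. The slide does not state the quantization, so the 4.5 bits per weight used below is an assumption, as are the helper name `estimateModelSizeGB` and the nominal parameter counts. A minimal sketch of the arithmetic:

```typescript
// Rough download-size estimate: parameters × bits per weight / 8 = bytes.
// Assumption: all weights share one bit width; real model files add metadata
// and may keep some tensors at higher precision.
function estimateModelSizeGB(parameterCount: number, bitsPerWeight: number): number {
  const bytes = (parameterCount * bitsPerWeight) / 8;
  return bytes / 1e9; // decimal gigabytes
}

// Hypothetical inputs: nominal parameter counts read off the model tags.
const estimates = [
  { model: "mistral:7b", gb: estimateModelSizeGB(7e9, 4.5) },
  { model: "llama3:70b", gb: estimateModelSizeGB(70e9, 4.5) },
];
for (const { model, gb } of estimates) {
  console.log(`${model}: ~${gb.toFixed(1)} GB`);
}
```

With these inputs the estimates come out at about 3.9 GB and 39.4 GB, close to the listed 4.1 GB and 40 GB. The same formula shows why unquantized 16-bit weights would be several times larger.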

WebLLM
https://webllm.mlc.ai/
DEMO

WebLLM
On NPM
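
A minimal sketch of how the npm package might be used from application code. The slide does not show any code, so everything here is an assumption: the package name `@mlc-ai/web-llm`, the `CreateMLCEngine` factory, and the OpenAI-style `chat.completions.create` call reflect the WebLLM project's published API as best understood, and the model ID is left as a placeholder. To keep the sketch self-contained, the engine is passed in behind a small hypothetical interface (`ChatEngine`), with the WebLLM wiring shown only as a comment.

```typescript
// Minimal shape of an OpenAI-style chat engine. WebLLM's engine object is
// assumed to be compatible with it; the interface itself is hypothetical.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}
interface ChatEngine {
  chat: {
    completions: {
      create(request: { messages: ChatMessage[] }): Promise<{
        choices: { message: { content: string | null } }[];
      }>;
    };
  };
}

// Builds the message list for a single-turn prompt.
function buildMessages(systemPrompt: string, userPrompt: string): ChatMessage[] {
  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: userPrompt },
  ];
}

// Sends one prompt to the engine and returns the generated text.
async function ask(engine: ChatEngine, userPrompt: string): Promise<string> {
  const reply = await engine.chat.completions.create({
    messages: buildMessages("You are a helpful assistant.", userPrompt),
  });
  return reply.choices[0]?.message.content ?? "";
}

// In the browser (assumed wiring, not executed here):
//   import { CreateMLCEngine } from "@mlc-ai/web-llm";
//   const engine = await CreateMLCEngine("<model-id>", {
//     initProgressCallback: (report) => console.log(report.text),
//   });
//   console.log(await ask(engine, "What is a Progressive Web App?"));
```

Hiding the engine behind an interface also makes it easier to swap the local model for a cloud endpoint later.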

Cache API
Storing model files locally
[Diagram: Website (HTML/JS), Cache with model files, Internet, Hugging Face]
Note: Due to the Same-Origin Policy, models cannot be shared across origins.
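
A cache-first lookup for model files could look roughly like the sketch below. `caches.open`, `cache.match`, and `cache.put` are standard Cache API calls; the function name `getModelFile`, the cache name, and the narrow `ResponseLike`/`CacheLike` interfaces are assumptions, introduced so the sketch does not depend on browser types. It is not necessarily how WebLLM itself implements caching.

```typescript
// Narrow, hypothetical views of the browser's Response and Cache types.
interface ResponseLike {
  ok: boolean;
  clone(): ResponseLike;
}
interface CacheLike {
  match(url: string): Promise<ResponseLike | undefined>;
  put(url: string, response: ResponseLike): Promise<void>;
}

// Returns the model file from the cache, downloading and storing it on a miss.
async function getModelFile(
  url: string,
  cache: CacheLike,
  download: (url: string) => Promise<ResponseLike>,
): Promise<ResponseLike> {
  const cached = await cache.match(url);
  if (cached) return cached; // cache hit: no network access needed

  const response = await download(url);
  if (response.ok) {
    // A response body can be read only once, so store a clone.
    await cache.put(url, response.clone());
  }
  return response;
}

// In the browser (assumed wiring; the cache name is a placeholder):
//   const cache = await caches.open("model-files");
//   const file = await getModelFile(modelUrl, cache, (u) => fetch(u));
```

Because caches are partitioned by origin, a second website using the same model would download and store its own copy.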

Cache API
Parameter cache

WebAssembly (Wasm)
– Bytecode for the web
– Compile target for arbitrary languages
– Can be faster than JavaScript
– WebLLM uses a model-specific Wasm library to accelerate model computations

WebGPU
– Grants low-level access to the Graphics Processing Unit (GPU)
– Near-native performance for machine learning applications
– Supported by Chromium-based browsers on Windows and macOS from version 113
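
Since WebGPU is not available everywhere, an app would typically check for it before offering local inference. A minimal sketch, assuming the standard `navigator.gpu.requestAdapter()` entry point; the `NavigatorLike` interface and the function name `supportsWebGPU` are hypothetical and only exist so the check can run outside a browser:

```typescript
// Narrow, hypothetical view of `navigator`.
interface NavigatorLike {
  gpu?: { requestAdapter(): Promise<object | null> };
}

// Resolves to true if WebGPU is exposed and a GPU adapter is available.
async function supportsWebGPU(nav: NavigatorLike): Promise<boolean> {
  if (!nav.gpu) return false; // API not exposed by this browser
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null; // null means no suitable adapter was found
  } catch {
    return false;
  }
}

// In the browser: const canRunLocally = await supportsWebGPU(navigator);
```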

WebNN
– Grants web apps access to the device’s CPU, GPU and Neural Processing Unit (NPU)
– In specification by the WebML Working Group at W3C
– Implementation in progress in Chromium (behind a flag)
– Even better performance compared to WebGPU
Source: https://webmachinelearning.github.io/webnn-intro/
DEMO

WebNN: near-native inference performance
Source: Intel. Browser: Chrome Canary 118.0.5943.0, DUT: Dell/Linux/i7-1260P, single p-core, Workloads: MediaPipe solution models (FP32, batch=1)

Performance
Comparison (tokens/sec)
WebLLM (Mistral-7b, M1)     22.98
WebLLM (Mistral-7b, M3)     33.96
OpenAI (GPT-4)              19.08
Azure OpenAI (GPT-4)        38.75
Groq (Mixtral-8x7b)        564.63
WebLLM/Groq: Own tests (23.03.2024), OpenAI/Azure OpenAI: https://mcplusa.com/comparing-performance-of-openai-gpt-4-and-microsoft-azure-gpt-4/ (31.08.2023)

Stable Diffusion
– Open-source text-to-image model
– Generates 512x512px images from a prompt
– WebSD: special version of Stable Diffusion for the web (2 GB in size)
– No npm package this time
Prompt: A guinea pig eating a watermelon

Web Stable Diffusion
https://websd.mlc.ai/
DEMO

Local AI Models
Pros & Cons
+ Data does not leave the browser (privacy)
+ High availability (offline support)
+ Low latency
+ Stability (no external API changes)
+ Low cost
– Lower quality
– High system (RAM, GPU) and bandwidth requirements
– Large model size, models cannot always be shared
– Model initialization and inference are relatively slow
– APIs are experimental

Local AI Models
Mitigations
Download the model in the background if the user is not on a metered connection
Helpful APIs:
– Network Information API to estimate the network quality/determine data saver mode (negative standards position by Apple and Mozilla)
– Storage Manager API to estimate the available free disk space
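
The two APIs above could feed a simple download decision. In this sketch, the field names mirror the Network Information API (`saveData`, `effectiveType`) and the result of `navigator.storage.estimate()` (`quota`, `usage`); the `DownloadContext` interface, the function name, and the policy itself (only download on "4g" with enough free space) are assumptions chosen for illustration. Data saver mode is used here as a stand-in for a metered connection.

```typescript
// Signals gathered from the browser; every field may be missing because the
// Network Information API is not available in all browsers.
interface DownloadContext {
  saveData?: boolean;     // user enabled data saver mode
  effectiveType?: string; // e.g. "slow-2g", "2g", "3g", "4g"
  quota?: number;         // bytes the origin may use
  usage?: number;         // bytes the origin already uses
}

// Decides whether a background model download looks reasonable.
function shouldDownloadModel(ctx: DownloadContext, modelSizeBytes: number): boolean {
  if (ctx.saveData) return false; // user asked to reduce data usage
  if (ctx.effectiveType !== undefined && ctx.effectiveType !== "4g") return false;
  if (ctx.quota !== undefined && ctx.usage !== undefined) {
    return ctx.quota - ctx.usage >= modelSizeBytes; // enough free space?
  }
  return true; // no signal available: assumed default, adjust as needed
}

// In the browser (assumed wiring):
//   const { quota, usage } = await navigator.storage.estimate();
//   const connection = (navigator as any).connection; // may be undefined
//   const ok = shouldDownloadModel(
//     { saveData: connection?.saveData, effectiveType: connection?.effectiveType, quota, usage },
//     4.1e9,
//   );
```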

Local AI Models
Mitigations
Hybrid modes:
– Allow the user to switch between cloud/local execution (availability, system requirements)
– Deploy OSS model on internal/enterprise infrastructure (privacy)

Local AI Models
Alternatives: Prompt API
[Diagram: Website (HTML/JS), Browser, Operating System, Internet; models: Gemini Nano, Apple Intelligence]

Local AI Models
Alternatives: Prompt API
– Exploratory API for local experiments and use case determination
– Downloads Gemini Nano into Google Chrome
– Model is shared across origins
– Uses native APIs directly
– Related APIs: Translation API, Writing Assistance APIs
https://developer.chrome.com/docs/ai/built-in
DEMO

Local AI Models
Alternatives: Ollama
– Local runner for AI models
– Offers a local server a website can connect to → allows sharing models across origins
– Supported on macOS and Linux (Windows in Preview)
https://webml-demo.vercel.app/
https://ollama.ai/
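
A website could talk to the local Ollama server with a plain HTTP request. The sketch below assumes Ollama's default address (`http://localhost:11434`) and its `/api/generate` endpoint with a `model`/`prompt`/`stream` JSON body; none of this appears on the slide, and the helper name `buildGenerateRequest` and the `OllamaRequest` type are hypothetical.

```typescript
// URL and fetch options for one non-streaming generate call.
interface OllamaRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildGenerateRequest(
  model: string,
  prompt: string,
  baseUrl = "http://localhost:11434", // assumed default address
): OllamaRequest {
  return {
    url: `${baseUrl}/api/generate`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // stream: false asks for one JSON object instead of a stream of chunks
      body: JSON.stringify({ model, prompt, stream: false }),
    },
  };
}

// In the browser (assumed wiring):
//   const { url, init } = buildGenerateRequest("mistral:7b", "Why is the sky blue?");
//   const data = await (await fetch(url, init)).json();
//   console.log(data.response);
```

The browser only permits this cross-origin request if the Ollama server is configured to accept the website's origin.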

Local AI Models
Alternatives: Hugging Face Transformers
Pre-trained, specialized, significantly smaller models beyond GenAI
Examples:
– Text generation
– Image classification
– Translation
– Speech recognition
– Image-to-text

Local AI Models
Alternatives: Transformers.js
– Pre-trained, specialized, significantly smaller models beyond GenAI
– JavaScript library to run Hugging Face transformers in the browser
– Supports most of the models
https://xenova.github.io/transformers.js/

Summary
– Cloud-based models remain the most powerful models
– Due to their size and high system requirements, local generative AI models are currently of interest mainly for special scenarios (e.g., high privacy demands, offline availability)
– Small language models are becoming more powerful
– Vendors are starting to ship AI models with their devices
– Devices are becoming more powerful for running AI tasks
– Experiment with the AI APIs and make your web apps smarter!

Thank you for your kind attention!
Christian Liebel
@christianliebel
christian.liebel@thinktecture.com
