

Angular-Apps smarter machen mit Generativer KI: Lokal und offlinefähig (Hands-on)

Generative AI is on everyone's lips: AI-powered tools have already become integral parts of everyday applications, Windows and Office included. With WebLLM and the Prompt API, we bring generative AI to your Angular application as well: locally and offline-capable. We add a chatbot to a todo application and let forms fill themselves in with ease. And you code along! If you like.

Christian Liebel

September 26, 2025


Transcript

  1. Hello, it’s me. Angular-Apps smarter machen mit Generativer KI: Lokal und offlinefähig (Hands-on)
     Christian Liebel (Angular, PWA & Generative AI)
     X: @christianliebel · Bluesky: @christianliebel.com · Email: christian.liebel@thinktecture.com
     Slides: thinktecture.com/christian-liebel
  2. Timetable (original)
     09:00–10:30 Block 1
     10:30–11:00 Coffee Break
     11:00–12:30 Block 2
     12:30–13:30 Lunch Break
     13:30–15:00 Block 3
     15:00–15:30 Coffee Break
     15:30–17:00 Block 4
  3. Timetable (proposal)
     09:00–10:30 Block 1
     10:30–10:50 Coffee Break
     10:50–12:30 Block 2
     12:30–13:20 Lunch Break
     13:20–15:00 Block 3
     15:00–15:20 Coffee Break
     15:30–16:30 Block 4
  4. Expectations
     What to expect: focus on web app development; focus on generative AI; up-to-date insights (the ML/AI field is evolving fast); live demos on real hardware; 17 hands-on labs.
     What not to expect: deep dives into AI specifics, RAG, model fine-tuning or training; stable libraries or specifications.
     Huge downloads! High requirements! Things may break!
  5. Setup complete? (Node.js, Google Chrome, editor, Git, macOS/Windows, 20 GB free disk space, 6 GB VRAM)
  6. Setup (LAB #0)
     git clone https://github.com/thinktecture/basta-2025-genai.git
     cd basta-2025-genai
     npm i
     npm start -- --open
  7. Generative AI everywhere. Source: https://www.apple.com/chde/apple-intelligence/
  8. Single-Page Applications run locally on the user’s system. [Diagram: a web server delivers HTML, JS, CSS, and assets to the web browser; inside the browser, the SPA’s client logic renders views (HTML/CSS) and talks to web APIs, push services, and databases on the server side via HTTPS and WebSockets.]
  9. Progressive Web Apps make SPAs offline-capable. [Diagram: a service worker sits between the website and the internet; fetch requests for HTML/JS are served from its cache when the network is unavailable.]
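     To make the caching idea concrete, here is a minimal cache-first fetch handler. This is a sketch only, assuming TypeScript with the "webworker" lib enabled; it is not the workshop's setup, and Angular apps would typically use @angular/pwa instead:

         // service-worker.ts: serve from the cache when possible, fall back to the network.
         self.addEventListener('fetch', (event) => {
           const fetchEvent = event as FetchEvent;
           fetchEvent.respondWith(
             caches.match(fetchEvent.request)
               .then((cached) => cached ?? fetch(fetchEvent.request))
           );
         });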
  10. Overview: Generative AI
     Text: OpenAI GPT, Mistral, …
     Audio/Music: Musico, Soundraw, …
     Images: DALL·E, Firefly, …
     Video: Sora, Runway, …
     Speech: Whisper, tortoise-tts, …
  11. Overview: Generative AI (same content as the previous slide)
  12. Drawbacks of Generative AI cloud providers: they require a (stable) internet connection, are subject to network latency and server availability, transfer data to the cloud service, and require a subscription.
  13. Can we run GenAI models locally?
  14. Large Language Models
     Large: trained on lots of data. Language: process and generate text. Models: programs/neural networks.
     Examples: GPT (ChatGPT, Microsoft Copilot, …), Gemini and Gemma (Google), Llama (Meta AI)
  15. Large Language Models
     Token: a meaningful unit of text (e.g., a word, a part of a word, a character).
     Context window: the maximum number of tokens the model can process.
     Parameters/weights: internal variables learned during training, used to make predictions.
  16. Large Language Models
     Prompts serve as the universal interface: unstructured text conveying specific semantics. This is a paradigm shift in software architecture, where natural language becomes a first-class citizen.
     Caveats: non-determinism and hallucination, prompt injections.
  17. Large Language Models: size comparison (model:parameters → download size)
     phi3:3.8b → 2.2 GB
     mistral:7b → 4.1 GB
     deepseek-r1:8b → 5.2 GB
     gemma3n:e4b → 7.5 GB
     gemma3:12b → 8.1 GB
     llama4:16x17b → 67 GB
  18. LAB #1
     npm i @mlc-ai/web-llm
     npm start -- --open
  19. Downloading a model (LAB #2, 1/4). In src/app/todo/todo.ts, add the following lines at the top of the class (with signal imported from @angular/core and MLCEngine from @mlc-ai/web-llm):
     protected readonly progress = signal(0);
     protected readonly ready = signal(false);
     protected engine?: MLCEngine;
  20. Downloading a model (LAB #2, 2/4). In todo.ts (ngOnInit()), add the following lines (CreateMLCEngine also comes from @mlc-ai/web-llm):
     const model = 'Llama-3.2-3B-Instruct-q4f32_1-MLC';
     this.engine = await CreateMLCEngine(model, {
       initProgressCallback: ({ progress }) => this.progress.set(progress)
     });
     this.ready.set(true);
  21. Downloading a model (LAB #2, 3/4). In todo.html, change the following lines:
     @if (!ready()) {
       <mat-progress-bar mode="determinate" [value]="progress() * 100"></mat-progress-bar>
     }
     <button mat-raised-button (click)="runPrompt(prompt.value, langModel.value)" [disabled]="!ready()">
  22. Downloading a model (LAB #2, 4/4). Launch the app via npm start. The progress bar should begin to move.
  23. Storing model files locally: the Cache API. [Diagram: the website (HTML/JS) downloads model files from Hugging Face over the internet and stores them in a cache for subsequent use.] Note: due to the Same-Origin Policy, models cannot be shared across origins.
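     The underlying pattern looks roughly like this. A sketch with a hypothetical shard URL; WebLLM handles this for you:

         // Inside an async function: fetch a model file once, then serve it from the cache.
         const cache = await caches.open('model-files');
         const url = 'https://example.com/models/shard-0.bin'; // hypothetical model shard
         let response = await cache.match(url);
         if (!response) {
           response = await fetch(url);
           await cache.put(url, response.clone()); // clone: a response body can only be read once
         }
         const modelData = await response.arrayBuffer();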
  24. WebAssembly (Wasm)
     – Bytecode for the web
     – Compile target for arbitrary languages
     – Can be faster than JavaScript
     – WebLLM uses a model-specific Wasm library to accelerate model computations
  25. WebGPU
     – Grants low-level access to the Graphics Processing Unit (GPU)
     – Near-native performance for machine learning applications
     – Supported by Chromium-based browsers on Windows and macOS from version 113, Safari 26, and Firefox 141 on Windows
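     Feature detection is short. A sketch (in TypeScript, navigator.gpu additionally needs the @webgpu/types typings):

         // Inside an async function:
         if ('gpu' in navigator) {
           const adapter = await navigator.gpu.requestAdapter();
           console.log(adapter ? 'WebGPU is available.' : 'No suitable GPU adapter found.');
         } else {
           console.log('WebGPU is not supported by this browser.');
         }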
  26. WebNN
     – Grants web apps access to the device’s CPU, GPU, and Neural Processing Unit (NPU)
     – In specification by the WebML Working Group at the W3C
     – Implementation in progress in Chromium (behind a flag)
     – Even better performance compared to WebGPU
     Source: https://webmachinelearning.github.io/webnn-intro/ (DEMO)
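     The drafted API shape looks roughly like this. A sketch only: the specification is still in flux, so names and options (such as deviceType) may change:

         // Inside an async function, in a Chromium build with the WebNN flag enabled:
         if ('ml' in navigator) {
           const context = await (navigator as any).ml.createContext({ deviceType: 'gpu' });
           console.log('WebNN context created:', context);
         }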
  27. WebNN: near-native inference performance. [Chart] Source: Intel. Browser: Chrome Canary 118.0.5943.0, DUT: Dell/Linux/i7-1260P, single P-core, workloads: MediaPipe solution models (FP32, batch=1)
  28. Model inference (LAB #3, 1/4). In todo.ts, add the following line at the top of the class:
     protected readonly reply = signal('');
  29. Model inference (LAB #3, 2/4). In the runPrompt() method, add the following code:
     this.reply.set('…');
     const chunks = languageModel === 'webllm'
       ? this.inferWebLLM(userPrompt)
       : this.inferPromptApi(userPrompt);
     for await (const chunk of chunks) {
       this.reply.set(chunk);
     }
  30. Model inference (LAB #3, 3/4). In the inferWebLLM() method, add the following code:
     await this.engine!.resetChat();
     const messages: ChatCompletionMessageParam[] = [{ role: "user", content: userPrompt }];
     const chunks = await this.engine!.chat.completions.create({ messages, stream: true });
     let reply = '';
     for await (const chunk of chunks) {
       reply += chunk.choices[0]?.delta.content ?? '';
       yield reply;
     }
  31. Model inference (LAB #3, 4/4). In todo.html, change the following line:
     <pre>{{ reply() }}</pre>
     You should now be able to send prompts to the model and see the responses in the template.
  32. LAB #4. Stop the development server (Ctrl+C) and run npm run build.
  33. Build issues (LAB #4)
     1. In angular.json, increase the bundle size budget for the Angular project (property architect.build.configurations.production.budgets[0].maximumError) to 10MB.
     2. Then, run npm run build again. This time, the build should succeed.
     3. If you stopped the development server, don’t forget to bring it back up again (npm start).
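     The relevant entry in angular.json then looks roughly like this (a sketch; the warning threshold shown is Angular's default and may differ in your project):

         "budgets": [
           {
             "type": "initial",
             "maximumWarning": "500kB",
             "maximumError": "10MB"
           }
         ]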
  34. Todo management (LAB #5, 1/2). In todo.ts, add the following signal at the top:
     protected readonly todos = signal<TodoDto[]>([]);
     Add the following lines to the addTodo() method:
     const text = prompt() ?? '';
     this.todos.update(todos => [...todos, { done: false, text }]);
  35. Todo management (LAB #5, 2/2). In todo.html, add the following lines to show the todos in the UI:
     @for (todo of todos(); track $index) {
       <mat-list-option>{{ todo.text }}</mat-list-option>
     }
  36. Todo management, extended (LAB #6)
     @for (todo of todos(); track $index) {
       <mat-list-option [(selected)]="todo.done">{{ todo.text }}</mat-list-option>
     }
     ⚠ Boo! This pattern is not recommended. Instead, you should set the changed values on the signal. But that doesn't play well with Angular Material…
  37. Chat with data: concept and limitations. The todo data has to be converted into natural language. For the sake of simplicity, we will add all todos to the prompt. Remember: LLMs have a context window (Mistral-7B: 8K). If you need to chat with larger sets of text, refer to Retrieval-Augmented Generation (RAG). Example:
     These are the todos:
     * Wash clothes
     * Pet the dog
     * Take out the trash
  38. Chat with data: system prompt. A metaprompt that defines character, capabilities/limitations, output format, behavior, and grounding data. Hallucinations and prompt injections cannot be eliminated. Example:
     You are a helpful assistant. Answer user questions on todos. Generate a valid JSON object. Avoid negative content. These are the user’s todos: …
  39. Chat with data: flow
     System message: The user has these todos: 1. … 2. … 3. …
     User message: How many todos do I have?
     Assistant message: You have three todos.
  40. Chat with data (LAB #7). Using a system & user prompt: adjust the code in inferWebLLM() to include the system prompt:
     const systemPrompt = `Here's the user's todo list:
     ${this.todos().map(todo => `* ${todo.text} (${todo.done ? 'done' : 'not done'})`).join('\n')}`;
     const messages: ChatCompletionMessageParam[] = [
       { role: "system", content: systemPrompt },
       { role: "user", content: userPrompt }
     ];
  41. Prompt Engineering: techniques
     – Providing examples (single-shot, few-shot, …)
     – Priming outputs
     – Specifying output structure
     – Repeating instructions
     – Chain of thought
     – …
     Success also depends on the model. https://learn.microsoft.com/en-us/azure/ai-foundry/openai/concepts/prompt-engineering
  42. Prompt Engineering (LAB #8)
     const systemPrompt = `You are a helpful assistant. The user will ask questions about their todo list. Briefly answer the questions. Don't try to make up an answer if you don't know it. Here's the user's todo list:
     ${this.todos().map(todo => `* ${todo.text} (this todo is ${todo.done ? 'done' : 'not done'})`).join('\n')}
     ${this.todos().length === 0 ? 'The list is empty, there are no todos.' : ''}`;
  43. Prompt Engineering: alternatives (ordered by increasing effort): Prompt Engineering → Retrieval-Augmented Generation → Fine-tuning → Custom model
  44. Performance (LAB #9). Adjust todo.ts as follows:
     const chunks = await this.engine!.chat.completions.create({
       messages,
       stream: true,
       stream_options: { include_usage: true }
     });
     let reply = '';
     for await (const chunk of chunks) {
       reply += chunk.choices[0]?.delta.content ?? '';
       console.log(chunk.usage);
       yield reply;
     }
     Ask a new question and check your console for performance statistics.
  45. Performance: workshop participants
     Device → Tokens/s (decode)
     MacBook Pro M4 Max (2024) → 33.27
     DELL Precision 3581 (2023) → 15
     ThinkPad (Linux) → 4.53
     MacBook Pro M3 (2023) → 20
     ThinkPad (Linux, with software GPU) → 0.076
  46. Performance: comparison (tokens/sec). WebLLM (Llama3-8b, M4): 45; Azure OpenAI (gpt-4o-mini): 33; Groq (Llama3-8b): 1,200. WebLLM/Groq: own tests (14.11.2024); OpenAI/Azure OpenAI: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/provisioned-throughput (18.07.2024)
  47. Prompt API (LAB #10)
     https://www.google.com/chrome/canary/
     about://flags
     – Enables optimization guide on device → EnabledBypassPerfRequirement
     – Prompt API for Gemini Nano → Enabled
     await LanguageModel.create();
     about://components
     about://on-device-internals
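     Before creating a session, you can check whether the model is actually usable. A sketch (availability() is part of the current Prompt API draft and may still change):

         // Inside an async function, in Chrome with the flags above enabled:
         const availability = await LanguageModel.availability();
         if (availability !== 'unavailable') {
           const session = await LanguageModel.create();
           console.log(await session.prompt('Say hello!'));
         }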
  48. Prompt API. [Diagram: the website (HTML/JS) calls the Prompt API in the browser; the browser delegates to a model provided by the operating system or downloaded from the internet, such as Apple Intelligence or Gemini Nano.]
  49. Prompt API: part of Chrome’s Built-in AI initiative
     – Exploratory API for local experiments and use-case determination
     – Downloads Gemini Nano into Google Chrome
     – Model can be shared across origins
     – Uses native APIs directly
     – Fine-tuning API might follow in the future
     https://developer.chrome.com/docs/ai/built-in
  50. Prompt API (LAB #11)
     npm i -D @types/dom-chromium-ai
     Add "dom-chromium-ai" to the types array in tsconfig.app.json.
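     The result looks roughly like this (a sketch; your tsconfig.app.json will contain further options):

         {
           "compilerOptions": {
             "types": ["dom-chromium-ai"]
           }
         }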
  51. Local AI models (LAB #12). Add the following lines to inferPromptApi():
     const systemPrompt = `The user will ask questions about their todo list. Here's the user's todo list:
     ${this.todos().map(todo => `* ${todo.text} (${todo.done ? 'done' : 'not done'})`).join('\n')}`;
     const languageModel = await LanguageModel.create({
       initialPrompts: [{ role: "system", content: systemPrompt }]
     });
     const chunks = languageModel.promptStreaming(userPrompt);
     let reply = '';
     for await (const chunk of chunks) {
       reply += chunk;
       yield reply;
     }
  52. Local AI models. Alternatives: Ollama
     – Local runner for AI models
     – Offers a local server a website can connect to → allows sharing models across origins
     – Supported on macOS and Linux (Windows in preview)
     https://webml-demo.vercel.app/ https://ollama.ai/
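     Talking to Ollama from a web app is plain HTTP. A sketch, assuming a local Ollama server on its default port 11434 with the mistral:7b model pulled (the server's allowed origins may need to be configured via OLLAMA_ORIGINS):

         // Inside an async function:
         const response = await fetch('http://localhost:11434/api/chat', {
           method: 'POST',
           headers: { 'Content-Type': 'application/json' },
           body: JSON.stringify({
             model: 'mistral:7b',
             messages: [{ role: 'user', content: 'How many todos do I have?' }],
             stream: false, // request a single JSON response instead of a stream
           }),
         });
         const { message } = await response.json();
         console.log(message.content);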
  53. Local AI models. Alternatives: Hugging Face Transformers. Pre-trained, specialized, significantly smaller models beyond GenAI. Examples: text generation, image classification, translation, speech recognition, image-to-text.
  54. Local AI models. Alternatives: Transformers.js
     – Pre-trained, specialized, significantly smaller models beyond GenAI
     – JavaScript library to run Hugging Face transformers in the browser
     – Supports most of the models
     https://huggingface.co/docs/transformers.js
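     Usage is compact. A sketch, assuming the @huggingface/transformers package; the model is downloaded on first use and then runs entirely in the browser:

         import { pipeline } from '@huggingface/transformers';

         // Create a sentiment-analysis pipeline and run it (inside an async function).
         const classifier = await pipeline('sentiment-analysis');
         const result = await classifier('Local AI in the browser is pretty neat!');
         console.log(result); // e.g. [{ label: 'POSITIVE', score: 0.99… }]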
  55. Use case: data extraction
     “Just transfer the 17.34 euros to me, my IBAN is DE02200505501015871393. I am with Hamburger Sparkasse (HASPDEHH).”
     “Nice, here is my address: Peter Müller, Rheinstr. 7, 04435 Schkeuditz”
  56. Use case: data extraction (same content as the previous slide)
  57. Idea: a Smart Form Filler (LLM) maps free text like “Nice, here is my address: Peter Müller, Rheinstr. 7, 04435 Schkeuditz” onto a form model:
     protected readonly formGroup = this.fb.group({
       firstName: [''],
       lastName: [''],
       addressLine1: [''],
       addressLine2: [''],
       city: [''],
       state: [''],
       zip: [''],
       country: [''],
     });
  58. Form Field: fields can carry hints for the model, e.g., “Insurance numbers always start with INS.” or “Try to determine the country based on the input.”
  59. Form Field (LAB #13, 1/2). Add the following code to form.ts (inject comes from @angular/core, NonNullableFormBuilder from @angular/forms):
     private fb = inject(NonNullableFormBuilder);
     protected formGroup = this.fb.group({
       name: '',
       city: '',
     });
     async fillForm(value: string) {}
  60. Form Field (LAB #13, 2/2). Add the following code to form.html:
     <input type="text" #form>
     <button (click)="fillForm(form.value)">Fill form</button>
     <form [formGroup]="formGroup">
       <input placeholder="Name" formControlName="name">
       <input placeholder="City" formControlName="city">
     </form>
  61. Async Clipboard API. Allows reading from/writing to the clipboard in an asynchronous manner. Reading from the clipboard requires user consent first (privacy!). Supported by Chrome, Edge, Safari, and Firefox.
  62. Async Clipboard API (LAB #14, 1/2). Add the following code to form.ts:
     async paste() {
       const content = await navigator.clipboard.readText();
       await this.fillForm(content);
     }
  63. Async Clipboard API (LAB #14, 2/2). Add the following code to form.html (after the “Fill form” button):
     <button (click)="paste()">Paste</button>
  64. Prompt generator: flow
     System message: The form has the following setup: { "name": "", "city": "" }
     User message: I am Peter from Berlin
     Assistant message: { "name": "Peter", "city": "Berlin" }
  65. Prompt generator (LAB #15). Add the following code to the fillForm() method:
     const languageModel = await LanguageModel.create({
       initialPrompts: [{
         role: 'system',
         content: `Extract the information to a JSON object of this shape: ${JSON.stringify(this.formGroup.value)}`,
       }],
     });
     const result = await languageModel.prompt(value);
     console.log(result);
  66. Prompt generator, structured output (LAB #16). Add the following code to form.ts (fillForm() method):
     const result = await languageModel.prompt(value, {
       responseConstraint: {
         type: 'object',
         properties: {
           name: { type: 'string' },
           city: { type: 'string' },
         },
       },
     });
  67. Prompt parser. Assistant message: { "name": "Peter", "city": "Berlin" }
  68. Prompt parser (LAB #17). Add the following code to form.ts (fillForm() method):
     this.formGroup.setValue(JSON.parse(result));
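     In production code you would guard this step, since LLM output is not guaranteed to parse or to match the form model. A sketch (not part of the lab):

         try {
           this.formGroup.setValue(JSON.parse(result)); // setValue() throws if keys don't match the group
         } catch {
           console.warn('Model did not return usable JSON:', result);
         }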
  69. Prompt parser: getting structured data out of the model
     Assistant message: parsing the assistant message as text/JSON/…
     JSON mode/structured output: specifying a well-defined interface via a JSON schema (safer, growing support)
     Tool calling: well-defined functions called by the LLM
  70. Summary: pros & cons
     + Data does not leave the browser (privacy)
     + High availability (offline support)
     + Low latency
     + Stability (no external API changes)
     + Low cost
     – Lower quality
     – High system (RAM, GPU) and bandwidth requirements
     – Large model size; models cannot always be shared
     – Model initialization and inference are relatively slow
     – APIs are experimental
  71. Summary
     – Cloud-based models remain the most powerful models.
     – Due to their size and high system requirements, local generative AI models are currently mostly interesting for special scenarios (e.g., high privacy demands, offline availability).
     – Small, specialized models are an interesting alternative (if available).
     – Large language models are becoming more compact and efficient.
     – Vendors are shipping AI models with their devices.
     – Devices are becoming more powerful for running AI workloads.
     – Experiment with the AI APIs and make your Angular app smarter!