Christian Liebel
@christianliebel
Consultant
Making Angular apps smarter with Generative AI:
local and offline-capable
Slide 2
Slide 2 text
Hello, it’s me.
Christian Liebel
X: @christianliebel
Email: christian.liebel@thinktecture.com
Angular & PWA
Slides: thinktecture.com/christian-liebel
Slide 3
Slide 3 text
09:00–10:30 Block 1
10:30–11:00 Coffee Break
11:00–12:30 Block 2
Timetable
Slide 4
Slide 4 text
What to expect
Focus on web app development
Focus on Generative AI
Up-to-date insights: the ML/AI field is evolving fast
Live demos on real hardware
Hands-on labs
What not to expect
Deep dive into AI specifics, RAG, model fine-tuning or training
Stable libraries or specifications
WebSD in Angular
1:1 Support
Expectations
Slide 5
Slide 5 text
DEMO
Slide 6
Slide 6 text
(Workshop Edition)
Demo Use Case
DEMO
Slide 7
Slide 7 text
Setup complete?
(Node.js, Google Chrome, Editor, Git, macOS/Windows, 20 GB free disk space, 6 GB VRAM)
Setup (1/2)
LAB #0
Slide 8
Slide 8 text
git clone https://github.com/thinktecture/angular-days-2024-fall-genai.git
cd angular-days-2024-fall-genai
npm i
npm start -- --open
Setup (2/2)
LAB #0
Slide 9
Slide 9 text
Generative AI everywhere
Source: https://www.apple.com/chde/apple-intelligence/
Slide 10
Slide 10 text
Run locally on the user’s system
Single-Page Applications
[Diagram: single-page application architecture. Server side: web server (HTML, JS, CSS, assets), server logic, web API, databases, push service. Client side: web browser running the SPA with client logic and several views (HTML/CSS). Connections: HTTPS, WebSockets.]
Slide 11
Slide 11 text
Make SPAs offline-capable
Progressive Web Apps
[Diagram: a service worker sits between the website (HTML/JS) and the internet, intercepts fetch requests and can answer them from a cache.]
Slide 12
Slide 12 text
Overview
Generative AI
Text: OpenAI GPT, Mistral, …
Speech: OpenAI Whisper, tortoise-tts, …
Images: DALL·E, Stable Diffusion, …
Audio/Music: Musico, Soundraw, …
Slide 14
Slide 14 text
Examples
Generative AI Cloud Providers
Slide 15
Slide 15 text
Drawbacks
Generative AI Cloud Providers
Require a (stable) internet connection
Subject to network latency and server availability
Data is transferred to the cloud service
Require a subscription
Slide 16
Slide 16 text
Can we run GenAI models locally?
Slide 17
Slide 17 text
Large: Trained on lots of data
Language: Process and generate text
Models: Programs/neural networks
Examples:
– GPT (ChatGPT, Bing Chat, …)
– Gemini, Gemma (Google)
– LLaMa (Meta AI)
Large Language Models
Slide 18
Slide 18 text
Token
A meaningful unit of text (e.g., a word, a part of a word, a character).
Context Window
The maximum number of tokens the model can process at once.
Parameters/weights
Internal variables learned during training, used to make predictions.
Large Language Models
Slide 19
Slide 19 text
Prompts serve as the universal interface
Unstructured text conveying specific semantics
Paradigm shift in software architecture
Natural language becomes a first-class citizen
Caveats
Non-determinism and hallucination, prompt injections
Large Language Models
Slide 20
Slide 20 text
Size Comparison
Model (name:parameters)   Size
phi3:3b                   2.2 GB
mistral:7b                4.1 GB
llama3:8b                 4.7 GB
gemma2:9b                 5.4 GB
gemma2:27b                16 GB
llama3:70b                40 GB
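The sizes in this comparison roughly follow from parameter count times bits per weight. The sketch below is only a back-of-the-envelope estimate: `estimateModelSizeGB` is a hypothetical helper, and the 4-bit quantization is an assumption (the slide does not state which quantization the listed downloads use). File-format overhead is ignored.

```typescript
// Hypothetical helper: rough model size from parameter count and bits per weight.
// Assumption: every weight uses the same quantization; metadata overhead is ignored.
function estimateModelSizeGB(parameters: number, bitsPerWeight: number): number {
  const bytes = (parameters * bitsPerWeight) / 8;
  return bytes / 1e9; // decimal gigabytes
}

// 7 billion parameters at 4 bits per weight: about 3.5 GB before overhead,
// the same ballpark as the 4.1 GB listed for mistral:7b.
console.log(estimateModelSizeGB(7e9, 4).toFixed(1)); // "3.5"

// The same parameter count with 16-bit weights would need about 14 GB.
console.log(estimateModelSizeGB(7e9, 16).toFixed(1)); // "14.0"
```

The estimate mainly illustrates why quantized models are the practical choice for browsers with a few gigabytes of VRAM.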
Large Language Models
Slide 21
Slide 21 text
https://webllm.mlc.ai/
WebLLM
DEMO
Slide 22
Slide 22 text
On NPM
WebLLM
Slide 23
Slide 23 text
npm i @mlc-ai/web-llm
LAB #1
Slide 24
Slide 24 text
(1/3)
In app.component.ts, add the following lines:
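// Imports needed: signal from '@angular/core', MLCEngine from '@mlc-ai/web-llm'
protected readonly progress = signal(0);   // download progress
protected readonly ready = signal(false);  // true once the model is loaded
protected engine?: MLCEngine;              // assigned in ngOnInit()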
Downloading a model LAB #2
Slide 25
Slide 25 text
(2/3)
In app.component.ts (ngOnInit()), add the following lines:
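// Imports needed: CreateMLCEngine from '@mlc-ai/web-llm'; ngOnInit() must be declared async
const model = 'Llama-3.2-3B-Instruct-q4f32_1-MLC';
this.engine = await CreateMLCEngine(model, {
  // Forward the download/initialization progress to the signal
  initProgressCallback: ({ progress }) =>
    this.progress.set(progress)
});
this.ready.set(true);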
Downloading a model LAB #2
Slide 26
Slide 26 text
(3/3)
In app.component.html, add the following lines:
Ask
Launch the app via npm start. The progress bar should begin to move.
Downloading a model LAB #2
Slide 27
Slide 27 text
Storing model files locally
Cache API
[Diagram: the website (HTML/JS) downloads the model files from Hugging Face over the internet and keeps them in a cache.]
Note: Due to the Same-Origin Policy, models cannot be shared across origins.
Slide 28
Slide 28 text
Parameter cache
Cache API
Slide 29
Slide 29 text
WebAssembly (Wasm)
– Bytecode for the web
– Compile target for arbitrary languages
– Can be faster than JavaScript
– WebLLM uses a model-specific Wasm library to accelerate model computations
Slide 30
Slide 30 text
WebGPU
– Grants low-level access to the Graphics Processing Unit (GPU)
– Near-native performance for machine learning applications
– Supported by Chromium-based browsers on Windows and macOS since version 113
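Because WebGPU is not available everywhere, an app would typically check for it before downloading a multi-gigabyte model. A minimal sketch, assuming the standard `navigator.gpu.requestAdapter()` entry point; `hasWebGpu` and `GpuNavigatorLike` are hypothetical names introduced here so the example compiles without WebGPU type definitions:

```typescript
// Minimal structural type so the sketch compiles without WebGPU type definitions.
interface GpuNavigatorLike {
  gpu?: { requestAdapter(): Promise<object | null> };
}

// Resolves to true only if the browser exposes WebGPU and returns an adapter.
async function hasWebGpu(nav: GpuNavigatorLike): Promise<boolean> {
  if (!nav.gpu) {
    return false; // API not exposed (unsupported browser or platform)
  }
  const adapter = await nav.gpu.requestAdapter();
  return adapter !== null; // null means no suitable GPU adapter was found
}

// In the browser, one would pass the real navigator object:
// hasWebGpu(navigator as GpuNavigatorLike).then(ok => console.log('WebGPU:', ok));
```

Passing the navigator in as a parameter keeps the check easy to test with a stub object.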
Slide 31
Slide 31 text
– Grants web apps access to the device’s CPU, GPU and Neural Processing Unit (NPU)
– Being specified by the WebML Working Group at W3C
– Implementation in progress in Chromium (behind a flag)
– Even better performance compared to WebGPU
WebNN
Source: https://webmachinelearning.github.io/webnn-intro/
DEMO
Slide 32
Slide 32 text
WebNN: near-native inference performance
Source: Intel. Browser: Chrome Canary 118.0.5943.0, DUT: Dell/Linux/i7-1260P, single p-core, Workloads: MediaPipe solution models (FP32, batch=1)
Slide 33
Slide 33 text
(1/3)
In app.component.ts, add the following lines at the top of the class:
protected readonly reply = signal('');
Model inference LAB #3
Slide 34
Slide 34 text
(2/3)
In the runPrompt() method, add the following code:
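// Imports needed: ChatCompletionMessageParam from '@mlc-ai/web-llm'
await this.engine!.resetChat();  // start each prompt without previous history
this.reply.set('…');             // placeholder while the model is generating
const messages: ChatCompletionMessageParam[] = [
  { role: "user", content: userPrompt }
];
const reply = await this.engine!.chat.completions.create({ messages });
this.reply.set(reply.choices[0].message.content ?? '');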
Model inference LAB #3
Slide 35
Slide 35 text
(3/3)
In app.component.html, add the following line:
{{ reply() }}
You should now be able to send prompts to the model and see the responses in the
template.
Model inference LAB #3
Slide 36
Slide 36 text
Slide 37
Slide 37 text
npm run build
LAB #4
Slide 38
Slide 38 text
1. In angular.json, raise the bundle size budget of the Angular project (property architect.build.configurations.production.budgets[0].maximumError) to at least 5MB.
2. Then, run npm run build again. This time, the build should succeed.
3. If you stopped the development server, don’t forget to bring it back up again (npm
start).
Build issues LAB #4
Slide 39
Slide 39 text
(1/2)
In app.component.ts, add the following signal at the top:
protected readonly todos = signal<{ done: boolean; text: string }[]>([]);
Add the following line to the addTodo() method:
this.todos.update(todos => [...todos, { done: false, text }]);
Todo management LAB #5
Slide 40
Slide 40 text
(2/2)
In app.component.html, add the following lines to add todos from the UI:
Add
@for(todo of todos(); track $index) {
{{ todo.text }}
}
Todo management LAB #5
Slide 41
Slide 41 text
In app.component.ts, add the following lines to toggleTodo():
this.todos.update(todos => todos.map((todo, todoIndex) =>
todoIndex === index ? { ...todo, done: !todo.done } : todo));
In app.component.html, add the following content to the
node:
You should now be able to toggle the checkboxes.
Todo management (extended) LAB #6
Slide 42
Slide 42 text
Concept and limitations
The todo data has to be converted into natural language.
For the sake of simplicity, we will add all todos to the prompt.
Remember: LLMs have a context window (Mistral-7B: 8K).
If you need to chat with larger sets of text, refer to Retrieval Augmented Generation (RAG).
These are the todos:
* Wash clothes
* Pet the dog
* Take out the trash
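The conversion into natural language can be sketched as a pure function, together with a crude check against the context window. Everything here is illustrative: `Todo`, `todosToPrompt`, `estimateTokens` and `fitsContextWindow` are hypothetical names, and the figure of roughly 4 characters per token is only a rule of thumb for English text, not the model's real tokenizer.

```typescript
interface Todo {
  text: string;
  done: boolean;
}

// Turn the todo list into a plain-text bullet list the model can read.
function todosToPrompt(todos: Todo[]): string {
  if (todos.length === 0) {
    return 'The list is empty, there are no todos.';
  }
  const lines = todos.map(todo => `* ${todo.text} (${todo.done ? 'done' : 'not done'})`);
  return `These are the todos:\n${lines.join('\n')}`;
}

// Very rough token estimate. Assumption: about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Checks that the prompt plus some room for the reply fits into the context window.
function fitsContextWindow(text: string, contextWindow: number, reservedForReply = 512): boolean {
  return estimateTokens(text) + reservedForReply <= contextWindow;
}

const todoPrompt = todosToPrompt([
  { text: 'Wash clothes', done: false },
  { text: 'Pet the dog', done: true },
]);
console.log(todoPrompt);
console.log(fitsContextWindow(todoPrompt, 8192)); // true for this short list
```

For lists that no longer fit, the prompt would have to be shortened or replaced by a retrieval step, which is where RAG comes in.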
Chat with data
Slide 43
Slide 43 text
System prompt
Metaprompt that defines…
– character
– capabilities/limitations
– output format
– behavior
– grounding data
Hallucinations and prompt injections cannot be eliminated.
You are a helpful assistant.
Answer user questions on todos.
Generate a valid JSON object.
Avoid negative content.
These are the user’s todos: …
Chat with data
Slide 44
Slide 44 text
Flow
System message: The user has these todos: 1. … 2. … 3. …
User message: How many todos do I have?
Assistant message: You have three todos.
Chat with data
Slide 45
Slide 45 text
Using a system & user prompt
Adjust the implementation in runPrompt() to include the system prompt:
const systemPrompt = `Here's the user's todo list:
${this.todos().map(todo => `* ${todo.text} (${todo.done ?
'done' : 'not done'})`).join('\n')}`;
const messages: ChatCompletionMessageParam[] = [
{ role: "system", content: systemPrompt },
{ role: "user", content: userPrompt }
];
Chat with data LAB #7
Slide 46
Slide 46 text
Techniques
– Providing examples (single shot, few shot, …)
– Priming outputs
– Specifying the output structure
– Repeating instructions
– Chain of thought
– …
Success also depends on the model.
Prompt Engineering
https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/advanced-prompt-engineering
Slide 47
Slide 47 text
const systemPrompt = `You are a helpful assistant.
The user will ask questions about their todo list.
Briefly answer the questions.
Don't try to make up an answer if you don't know it.
Here's the user's todo list:
${this.todos().map(todo => `* ${todo.text} (this todo is ${todo.done ? 'done' : 'not done'})`).join('\n')}
${this.todos().length === 0 ? 'The list is empty, there are no todos.' : ''}`;
Prompt Engineering LAB #8
Slide 48
Slide 48 text
Alternatives, ordered by increasing effort:
– Prompt Engineering
– Retrieval Augmented Generation
– Fine-tuning
– Custom model
Prompt Engineering
Slide 49
Slide 49 text
Add the following line to the runPrompt() method:
console.log(reply.usage);
Ask a new question and check your console for performance statistics.
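To turn the logged usage into a single comparable number, the token counts can be combined with a wall-clock measurement. A minimal sketch, assuming the usage object carries the OpenAI-style fields `prompt_tokens`, `completion_tokens` and `total_tokens`; `tokensPerSecond` is a hypothetical helper, and the elapsed time would have to be measured by the caller (for example with `performance.now()` around the request).

```typescript
// OpenAI-style usage shape; only the standard token counts are assumed here.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Hypothetical helper: generated tokens per second for one request.
// The elapsed time includes prompt processing, so this underestimates pure decoding speed.
function tokensPerSecond(usage: Usage, elapsedMs: number): number {
  if (elapsedMs <= 0) {
    return 0;
  }
  return usage.completion_tokens / (elapsedMs / 1000);
}

// Example with made-up numbers: 120 generated tokens in 4 seconds.
const exampleUsage: Usage = { prompt_tokens: 80, completion_tokens: 120, total_tokens: 200 };
console.log(tokensPerSecond(exampleUsage, 4000)); // 30
```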
Performance LAB #9
– Open-source text-to-image model
– Generates 512×512 px images from a prompt
– WebSD: special version of Stable Diffusion for the web (2 GB in size)
– No npm package this time
Stable Diffusion
Prompt: A guinea pig eating a watermelon
Slide 52
Slide 52 text
https://websd.mlc.ai/
Web Stable Diffusion
DEMO
Slide 53
Slide 53 text
Live Demo
Retrofitting AI image generation into an
existing drawing application
(https://paint.js.org)
Web Stable Diffusion
DEMO
Slide 54
Slide 54 text
Pros & Cons
+ Data does not leave the browser (privacy)
+ High availability (offline support)
+ Low latency
+ Stability (no external API changes)
+ Low cost
– Lower quality
– High system (RAM, GPU) and bandwidth requirements
– Large model size, models cannot always be shared
– Model initialization and inference are relatively slow
– APIs are experimental
Local AI Models
Slide 55
Slide 55 text
Mitigations
Download model in the background if the user is not on a metered connection
Helpful APIs:
– Network Information API to estimate the network quality and detect data saver mode (negative standards position by Apple and Mozilla)
– Storage Manager API to estimate the available free disk space
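The two APIs can be combined into a single decision before a background download starts. A minimal sketch: `shouldDownloadModel` is a hypothetical helper, the input shapes mirror `navigator.connection` (`saveData`, `effectiveType`) and the result of `navigator.storage.estimate()` (`quota`, `usage`), and treating "slow-2g" and "2g" as too slow is an assumption rather than a recommendation from the talk.

```typescript
// Subset of the Network Information API; undefined in browsers without support.
interface ConnectionInfo {
  saveData?: boolean;
  effectiveType?: string; // 'slow-2g' | '2g' | '3g' | '4g'
}

// Result shape of navigator.storage.estimate().
interface StorageEstimateLike {
  quota?: number;
  usage?: number;
}

function shouldDownloadModel(
  connection: ConnectionInfo | undefined,
  storage: StorageEstimateLike,
  modelBytes: number,
): boolean {
  // Respect the user's data saver setting.
  if (connection?.saveData) {
    return false;
  }
  // Assumption: very slow connections are not worth a multi-gigabyte download.
  if (connection?.effectiveType === 'slow-2g' || connection?.effectiveType === '2g') {
    return false;
  }
  // Only download if the estimated free space can hold the model.
  const freeBytes = (storage.quota ?? 0) - (storage.usage ?? 0);
  return freeBytes >= modelBytes;
}

// In the browser (connection is missing in Safari and Firefox):
// const ok = shouldDownloadModel((navigator as any).connection,
//   await navigator.storage.estimate(), 2.2e9);
```

Where the Network Information API is missing, the sketch falls back to the storage check alone.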
Local AI Models
Slide 56
Slide 56 text
Mitigations
Hybrid modes:
– Allow the user to switch between cloud/local execution (availability, system requirements)
– Deploy OSS model on internal/enterprise infrastructure (privacy)
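A hybrid mode boils down to a small decision function. The sketch below is one possible policy, not the approach used in the workshop: `chooseExecutionMode`, `ExecutionMode` and the fields of `Environment` are hypothetical names, and the inputs would come from checks such as `navigator.onLine`, a WebGPU probe and a cache lookup.

```typescript
type ExecutionMode = 'local' | 'cloud' | 'unavailable';

interface Environment {
  online: boolean;            // e.g. navigator.onLine
  webGpuAvailable: boolean;   // result of a WebGPU feature check
  modelCached: boolean;       // model files are already stored locally
  userPrefersLocal: boolean;  // user setting, e.g. for privacy reasons
}

function chooseExecutionMode(env: Environment): ExecutionMode {
  // Local execution needs a GPU and either cached files or a way to download them.
  const localPossible = env.webGpuAvailable && (env.modelCached || env.online);
  if (localPossible && (env.userPrefersLocal || !env.online)) {
    return 'local';
  }
  if (env.online) {
    return 'cloud';
  }
  return 'unavailable';
}

// Offline with a cached model: falls back to local execution.
console.log(chooseExecutionMode({
  online: false, webGpuAvailable: true, modelCached: true, userPrefersLocal: false,
})); // "local"
```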
Local AI Models
Slide 57
Slide 57 text
Alternatives: Prompt API
Local AI Models
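[Diagram: a website (HTML/JS) running in the browser on top of the operating system, with the internet outside. Built-in models: Gemini Nano (browser) and Apple Intelligence (operating system).]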
Slide 58
Slide 58 text
Alternatives: Prompt API
– Exploratory API for local experiments and use case determination
– Downloads Gemini Nano into Google Chrome
– Model is shared across origins
– Uses native APIs directly
– Related APIs: Translation API, Writing Assistance APIs
Local AI Models
https://developer.chrome.com/docs/ai/built-in
Slide 59
Slide 59 text
Alternatives: Ollama
– Local runner for AI models
– Offers a local server a website can
connect to → allows sharing models
across origins
– Supported on macOS and Linux
(Windows in Preview)
https://webml-demo.vercel.app/
https://ollama.ai/
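Connecting a website to the local Ollama server is an ordinary HTTP call. A minimal sketch of the request, assuming Ollama listens on its default port 11434, the chat endpoint `/api/chat` is used, and a model named "llama3" has already been pulled; `buildOllamaChatRequest` is a hypothetical helper.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Builds URL and fetch options for a non-streaming chat request.
// Assumption: default Ollama address; adjust baseUrl for other setups.
function buildOllamaChatRequest(
  model: string,
  messages: ChatMessage[],
  baseUrl = 'http://localhost:11434',
) {
  return {
    url: `${baseUrl}/api/chat`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}

const ollamaRequest = buildOllamaChatRequest('llama3', [
  { role: 'user', content: 'How many todos do I have?' },
]);
console.log(ollamaRequest.url); // http://localhost:11434/api/chat

// In the browser:
// const response = await fetch(ollamaRequest.url, ollamaRequest.init);
// const data = await response.json(); // reply text in data.message.content
```

Since the request goes to another origin, the Ollama server has to allow the website's origin for this to work from a browser.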
Local AI Models
Slide 60
Slide 60 text
Alternatives: Hugging Face Transformers
Pre-trained, specialized, significantly smaller models beyond GenAI
Examples:
– Text generation
– Image classification
– Translation
– Speech recognition
– Image-to-text
Local AI Models
Slide 61
Slide 61 text
Alternatives: Transformers.js
– Pre-trained, specialized, significantly smaller models beyond GenAI
– JavaScript library to run Hugging Face transformers in the browser
– Supports most of the models
https://xenova.github.io/transformers.js/
Local AI Models
Slide 62
Slide 62 text
– Cloud-based models remain the most powerful models
– Due to their size and high system requirements, local generative AI models are currently of interest mainly for very specific scenarios (e.g., high privacy demands, offline availability)
– Small, specialized models are an interesting alternative (if available)
– Large language models are becoming more compact and efficient
– Vendors are starting to ship AI models with their devices
– Devices are becoming more powerful for running AI tasks
– Experiment with the AI APIs and make your Angular app smarter!
Summary
Slide 63
Slide 63 text
Thank you
for your kind attention!
Christian Liebel
@christianliebel
christian.liebel@thinktecture.com