BASTA! Spring 2025: Semantic AI in Action: Architektur-Patterns für LLMs & Embeddings

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Christian
Weyer | Co-Founder & CTO | Thinktecture AG | [email protected]

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings LLM-
ALL-THE-THINGS? 2

Language Models understand and generate semantically rich human language, transforming
it into text or structured data for both humans and machines. ⚠ Non-deterministic: same input can lead to different outputs. Embedding Models capture semantic meaning by encoding human language into numerical vector representations, facilitating understanding, comparison, and retrieval for both humans and machines. ✅ Deterministic: same input always results in the same embedding. Semantic AI in Action Architektur-Patterns für LLMs & Embeddings 3 🫱 🫲 Semantic AI Generative AI

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings MODELS
FOR OUR SOFTWARE 4

§ Language models (LLM, SLM) & embedding models (EM) in
end-to-end architectures § EM enable semantic search & comparison § LM enable human language understanding via API calls § System prompt § User query Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Model integrations 5

§ We rely on Language Model’s Language understanding NOT on
its world knowledge❗ Semantic AI in Action Architektur-Patterns für LLMs & Embeddings ⚠ Important shoutout 6

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Classical
applications & UIs 7

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Language-enabled
“UIs” 8

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings 9
Sample solution - C4 system context diagram

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Sample
solution - Technology stack 10 Services § Python as the go-to-platform for ML/AI/Gen-AI § Esp. for local model execution § But: Most of the logic could be implemented in any language/platform Clients

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings PATTERN
LOCAL RAG [RETRIEVAL-AUGMENTED GENERATION] 11

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings “Talk
to your data” Cleanup & Split Text Embedding Question Text Embedding Save Query Relevant Results Question Answ er LLM 12 Embedding Model Embedding Model 💡 Indexing / Embedding Question Answering .md, .docx, .pdf etc. “Lorem ipsum…?” 💡 Vector DB

§ Frameworks § LangChain § Fastembed § Lightweight & efﬁcient
for generating text embeddings § Embedding model § jinaai/jina-embeddings-v2-base-de (local) § Vector store § PostgreSql (pgvector) vector store § LLM/SLM § Llama 3.3 70B on Cerebras (very fast) Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Technical implementation – Local RAG 13

STRUCTURED OUTPUT 14

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Structured
data from unstructured input – e.g. for API calling 15 “OK, when is my colleague CW available for a two- days workshop?” System Prompt (with employee data) + Schema / Function Calling (for structured output) Web API Availability business logic

§ Frameworks § Pydantic § Instructor § Methodology § Schema
with JSON Mode (not Function Calling) § SLM/LLM § Llama 3.3 70B on Cerebras (very fast) Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Technical implementation – Structured Output 16

SEMANTIC GUARDING & ROUTING 17

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Semantics-based
decisions for user interactions 18 Guarding (e.g. prompt injection) Routing (selecting correct target) “Lorem ipsum…?” Semantic Engine (Fine-tuned Language Model, Embedding Model) Target RAG 1 Target Structured Output & API Call Target … something else …

Guarding § Frameworks § llm-guard § HuggingFace Transformers § Model
§ deepset/deberta-v3-base- injection Routing § Frameworks § semantic-routing § Fastembed § Embedding model § BAAI/bge-small-en-v1.5 - local § Vector store § PostgreSql (pgvector) Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Technical implementation – Semantic Guarding & Routing 19

/ SOLUTION OBSERVABILITY 20

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Things
can get… overwhelming 21

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings End-to-end
tracing 22

§ Methodology § Open Telemetry (OTel) § Frameworks § OTel
Python packages § LogFire SDK § Tools § LogFire, LangFuse § Any OTel-enabled system Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Technical implementation - Observability 23

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings END-TO-END
SOLUTION ILLUSTRATED 24

Semantic routing Semantic AI in Action Architektur-Patterns für LLMs &
Embeddings "Talk to your systems"(for Availability info) 25 Web App / Watch App Speech-to-Text Internal Gateway (Python FastAPI) LLM / SLM Text-to-Speech Transcribe spoken text Transcribed text Check for experts availability with text Extract { experts, booking times } from text Structured JSON data (Function calling) Generate response with availability Response Response with experts availability 🔉 Speech-to-text for response Response audio Internal Business API (node.js – veeeery old) Query Availability API Availability When is CL…? CL will be…

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Recap:
Top Semantic AI patterns & solutions – in end-to-end software engineering 26 Local RAG Structured Output Semantic Guarding & Routing Insightful Observability 💡 Fun Fact: Large parts been built with AI-assisted Coding

Thank you! Christian Weyer https://thinktecture.com/christian-weyer [email protected] 27

§ Technology catalyst § AI-powered solutions § Pragmatic end-to-end architectures
§ Microsoft Regional Director § Microsoft MVP for AI § Google GDE for Web AI [email protected] https://www.thinktecture.com Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Christian Weyer Co-Founder & CTO @ Thinktecture AG

BASTA! Spring 2025: Semantic AI in Action: Arch...

BASTA! Spring 2025: Semantic AI in Action: Architektur-Patterns für LLMs & Embeddings

Christian Weyer
PRO

More Decks by Christian Weyer

Other Decks in Programming

Featured

Transcript

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Christian

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings LLM-

Language Models understand and generate semantically rich human language, transforming

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings MODELS

§ Language models (LLM, SLM) & embedding models (EM) in

§ We rely on Language Model’s Language understanding NOT on

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Classical

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Language-enabled

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings 9

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Sample

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings PATTERN

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings “Talk

§ Frameworks § LangChain § Fastembed § Lightweight & efﬁcient

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings PATTERN

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Structured

§ Frameworks § Pydantic § Instructor § Methodology § Schema

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings PATTERN

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Semantics-based

Guarding § Frameworks § llm-guard § HuggingFace Transformers § Model

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings PATTERN

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Things

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings End-to-end

§ Methodology § Open Telemetry (OTel) § Frameworks § OTel

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings END-TO-END

Semantic routing Semantic AI in Action Architektur-Patterns für LLMs &

Semantic AI in Action Architektur-Patterns für LLMs & Embeddings Recap:

Thank you! Christian Weyer https://thinktecture.com/christian-weyer [email protected] 27

§ Technology catalyst § AI-powered solutions § Pragmatic end-to-end architectures