programmier.con 2025: Semantic AI: A Language Model & an Embedding Model walk into a bar...

Semantic AI, as an extension of Generative AI, can be the key to integrating AI into your own solutions. In this talk, Christian Weyer shows practical architecture patterns and approaches for using large and small language models such as GPT or Llama, as well as embedding models, in modern software architectures. Key concepts such as Semantic Routing, lightweight RAG, Structured Output, and Observability are demonstrated with an end-to-end system consisting of several services and client applications. Developers and architects get a pragmatic overview of how these concepts could be implemented in their own projects.

Christian Weyer

October 30, 2025

Transcript

  1. Semantic AI: A Language Model & an Embedding Model Walk Into a Bar...

    Christian Weyer | Co-Founder & CTO | Thinktecture AG
  2. Our journey

    § Models for our software
    § Lightweight RAG
    § Semantic Routing
    § Structured Output / Tool Calling
  3. MODELS FOR OUR SOFTWARE
  4. Language Models understand and generate semantically rich human language, transforming it into text or structured data for both humans and machines.

    ⚠ Non-deterministic: same input can lead to different outputs.

    Embedding Models capture semantic meaning by encoding human language into numerical vector representations, facilitating understanding, comparison, and retrieval for both humans and machines.

    ✅ Deterministic: same input always results in the same embedding.

    Semantic AI 🫱 🫲 Generative AI
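
The "comparison" idea on this slide can be illustrated with cosine similarity, a common way to compare embedding vectors. This is only a minimal sketch: the three-dimensional vectors are hand-written, hypothetical values rather than the output of a real embedding model, and the variable names are chosen for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (made up for illustration, not model output).
workshop = [0.9, 0.1, 0.0]
training = [0.8, 0.2, 0.1]
invoice = [0.0, 0.2, 0.9]

# Semantically close texts should score higher than unrelated ones.
print(cosine_similarity(workshop, training))
print(cosine_similarity(workshop, invoice))
```

Because an embedding model is deterministic, such scores are reproducible for the same pair of inputs, which is what makes them usable for retrieval and routing decisions.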
  5. API-based AI model integrations

    § Language & embedding models part of end-to-end architectures
    § Embedding models can be run locally
      § Optimized for CPU
    § Language models still hard to run locally
      § High GPU power
      § High VRAM
      § High memory bandwidth
  6. Classical applications & UIs

    § API-based data
    § Document-based data
  7. Language-enabled “UIs” – Talk-to-TT
  8. C4 system context diagram

    § Various tech stacks
    § Docker-based distributed system
  9. PATTERN: STRUCTURED OUTPUT
  10. Talking to APIs (Function / Tool calling)

    § Tools integration (and more) is being standardized with MCP

    Diagram labels: “When is CW available for a two-days workshop?”; System Prompt (+ employee data) + Schema (for structured output); Web API; Availability business logic
  11. Technical implementation – Structured Output

    § Frameworks
      § Pydantic
      § Instructor
    § Methodology
      § Schema with JSON Mode (not Function Calling)
    § SLM/LLM
      § Llama 3.3 70B on Cerebras (very fast)
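
The "schema with JSON mode" approach can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: it uses only the Python standard library as a stand-in for Pydantic and Instructor, the `AvailabilityRequest` schema and its field names are hypothetical, and the model reply is a hard-coded string instead of a real model call.

```python
import json
from dataclasses import dataclass


@dataclass
class AvailabilityRequest:
    """Hypothetical target schema for the availability question."""
    person_initials: str
    duration_days: int


def parse_structured_output(raw_json: str) -> AvailabilityRequest:
    """Parse a JSON-mode reply and validate it against the schema."""
    data = json.loads(raw_json)
    expected = {"person_initials": str, "duration_days": int}
    if not isinstance(data, dict) or set(data) != set(expected):
        raise ValueError("reply does not match the expected fields")
    for name, expected_type in expected.items():
        if not isinstance(data[name], expected_type):
            raise ValueError(f"{name} must be of type {expected_type.__name__}")
    return AvailabilityRequest(**data)


# Stand-in for what a language model in JSON mode might return (no model call).
model_reply = '{"person_initials": "CW", "duration_days": 2}'
request = parse_structured_output(model_reply)
print(request)
```

The validated object, rather than free text, is what the application would hand to the Web API and its availability business logic. A reply that does not match the schema raises an error that the caller can handle, for example by retrying.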
  12. PATTERN: LIGHTWEIGHT RAG
  13. Talking to documents (Retrieval-augmented generation)

    💡 Indexing / Embedding: .md, .docx, .pdf etc.; Cleanup & Split; Text; Embedding Model; Embedding; Save; Vector DB

    💡 Question Answering: “Lorem ipsum…?”; Question; Text; Embedding Model; Embedding; Query; Relevant Results; Question; LLM; Answer w/ sources
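
The two phases on this slide, indexing and question answering, can be sketched in a few lines. This is a toy illustration, not the system shown in the talk: `toy_embed` is a word-count stand-in for a real embedding model, a plain Python list replaces the vector DB, the final LLM call is omitted, and all function names and the sample document are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str) -> list[str]:
    """Cleanup & split: one chunk per non-empty paragraph."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def toy_embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lower-cased word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(document: str) -> list[tuple[str, Counter]]:
    """Indexing: embed every chunk and store it next to its text."""
    return [(chunk, toy_embed(chunk)) for chunk in split_into_chunks(document)]


def retrieve(index: list[tuple[str, Counter]], question: str, top_k: int = 1) -> list[str]:
    """Retrieval: embed the question and return the most similar chunks."""
    q = toy_embed(question)
    ranked = sorted(index, key=lambda item: similarity(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


document = """Workshops are booked per day and held on site or remotely.

Invoices are sent at the end of each month."""

index = build_index(document)
context = retrieve(index, "How are workshops booked?")
# In a full pipeline, the retrieved chunks and the question go to the LLM.
print(context)
```

Swapping `toy_embed` for a real embedding model and the list for a vector store changes the quality of the results, but not the shape of the pipeline.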
  14. Technical implementation – Lightweight RAG

    § Frameworks
      § LangChain
      § FastEmbed
        § Lightweight & efficient for generating text embeddings
    § Embedding model
      § jinaai/jina-embeddings-v2-base-de (local) – 768 dims
    § Vector store
      § PostgreSQL (pgvector)
    § LLM/SLM
      § Llama 3.3 70B on Cerebras (very fast)
  15. PATTERN: SEMANTIC ROUTING
  16. Semantics-based decisions for user interactions

    Diagram labels: “Lorem ipsum…?”; Guarding (e.g. prompt injection); Fine-tuned NLP Model; Routing (selecting correct target); Embedding Model; Target RAG; Target API Call; Target … something else …
  17. Technical implementation – Semantic Guarding & Routing

    Guarding
    § Frameworks
      § llm-guard
      § HuggingFace Transformers
    § NLP model
      § deepset/deberta-v3-base-injection (local)

    Routing
    § Frameworks
      § semantic-routing
      § FastEmbed
    § Embedding model
      § intfloat/multilingual-e5-large (local) – 1024 dims
    § Vector store
      § PostgreSQL (pgvector)
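
The routing half of this pattern can be sketched as "embed the query, compare it with example utterances per target, pick the best match above a threshold". The sketch below is an assumption-laden illustration: it does not use the libraries named on the slide, `toy_embed` is a word-count stand-in for a real embedding model, and the route names, example utterances, and threshold value are hypothetical. Guarding with a fine-tuned classifier is not shown.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lower-cased word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical routes, each described by a few example utterances.
ROUTES = {
    "rag": [
        "what does the documentation say about",
        "explain the concept of",
    ],
    "api_call": [
        "when is someone available for a workshop",
        "book a workshop date",
    ],
}


def route(query: str, threshold: float = 0.3) -> str:
    """Return the best-matching route, or 'fallback' below the threshold."""
    q = toy_embed(query)
    best_route, best_score = "fallback", 0.0
    for name, examples in ROUTES.items():
        score = max(similarity(q, toy_embed(example)) for example in examples)
        if score > best_score:
            best_route, best_score = name, score
    return best_route if best_score >= threshold else "fallback"


print(route("When is CW available for a two-days workshop?"))
print(route("Explain the concept of semantic routing"))
print(route("Good morning"))
```

Since no language model is involved in this decision, routing stays deterministic and cheap. The threshold gives unmatched queries a defined fallback path, the "something else" target in the diagram.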
  18. Typical Semantic AI patterns & solutions – in end-to-end software engineering

    § Lightweight RAG
    § Structured Output
    § Semantic Guarding & Routing
    § Insightful Observability
  19. AI-based solutions are ≅10% AI and 100% software engineering.