Semantic AI, as an evolution of Generative AI, can be the key to integrating AI into your own solutions. In this talk, Christian Weyer presents practical architecture patterns and approaches for using large and small language models such as GPT or LLaMA, together with embedding models, in modern software architectures. Key concepts such as Semantic Routing, Semantic Search and lightweight RAG, and Structured Output are demonstrated using an end-to-end system with multiple services and client applications. Developers and architects will gain a pragmatic overview of how to implement these patterns in their own projects.