
Java-Powered AI on Kubernetes: From Development to Deployment with Ease #JavaCro25

Are you a Java developer eager to explore AI but unsure where to begin? In this session, we’ll guide you through building and deploying AI applications in Java, leveraging familiar tools and running them effortlessly on Kubernetes. We’ll start by showcasing how Quarkus and LangChain4J simplify the development of AI-driven applications, making advanced use cases more accessible than ever.
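To make this concrete, here is a minimal sketch of what a declarative AI service can look like with the quarkus-langchain4j extension; the interface name, prompt text, and REST endpoint are illustrative assumptions, not code from the talk.

// Minimal sketch (assumptions: interface name, prompt, endpoint path).
// The quarkus-langchain4j extension generates the implementation at build time.
import io.quarkiverse.langchain4j.RegisterAiService;
import dev.langchain4j.service.SystemMessage;
import jakarta.inject.Inject;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.QueryParam;

@RegisterAiService
interface BeerSommelier {
    @SystemMessage("You are a friendly beer sommelier. Answer briefly.")
    String chat(String question);
}

@Path("/chat")
public class ChatResource {

    @Inject
    BeerSommelier sommelier;

    @GET
    public String chat(@QueryParam("q") String question) {
        return sommelier.chat(question);   // the extension wires in the configured model
    }
}

The actual model behind the service (an OpenAI endpoint or a local Ollama instance) is then selected via configuration properties rather than code.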

Next, we'll address the challenge of running AI models for inference at scale using the Ollama operator, a lightweight yet robust solution for managing model serving on Kubernetes. In addition, we’ll dive into why vector databases are critical for many AI workloads and how these can serve as a high-performance storage and retrieval system for your data.
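As a rough illustration of how these pieces are reached from Java, the following sketch uses the plain LangChain4J APIs; the in-cluster service URL, the model names, and the in-memory store (standing in for a dedicated vector database such as pgvector or Qdrant) are assumptions for the example.

// Rough sketch (assumptions: Ollama service URL, model names; the in-memory
// store stands in for a real vector database).
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.ollama.OllamaChatModel;
import dev.langchain4j.model.ollama.OllamaEmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;

public class RagSketch {

    public static void main(String[] args) {
        String ollamaUrl = "http://ollama:11434";   // assumed in-cluster service name

        // Chat and embedding models served by Ollama inside the cluster
        OllamaChatModel chat = OllamaChatModel.builder()
                .baseUrl(ollamaUrl).modelName("llama3.1").build();
        OllamaEmbeddingModel embeddings = OllamaEmbeddingModel.builder()
                .baseUrl(ollamaUrl).modelName("nomic-embed-text").build();

        // Index one document (a real setup would ingest many and persist them)
        InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();
        TextSegment doc = TextSegment.from("Pale ales pair well with spicy food.");
        store.add(embeddings.embed(doc).content(), doc);

        // Retrieve the best match and feed it into the prompt (the RAG pattern)
        Embedding query = embeddings.embed("beer for a curry").content();
        String context = store.search(EmbeddingSearchRequest.builder()
                        .queryEmbedding(query).maxResults(1).build())
                .matches().get(0).embedded().text();

        // chat(...) in current LangChain4J releases; older versions use generate(...)
        System.out.println(chat.chat("Context: " + context
                + "\nQuestion: Which beer goes with a curry?"));
    }
}

In the deck itself, Easy RAG and dedicated chat services take over most of this plumbing; the sketch only shows the underlying embed, store, retrieve, and prompt cycle.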

Join us for a live demo where we’ll walk through building and running a real-world AI use case on Kubernetes, demonstrating the tools and best practices you need to succeed. Whether you’re just starting with AI or looking to optimise your current workflows, this talk will equip you with practical insights to accelerate your AI journey in Java.

M.-Leander Reimer

October 13, 2025

Transcript

  1. qaware.de Java-Powered AI on Kubernetes: From Development to Deployment with Ease. Mario-Leander Reimer, [email protected], @LeanderReimer, @qaware, #CloudNativeNerd #gerneperdude
  2. qaware.de ... and we have the perfect surfboard! The logical continuation: (a) from applications to microservices to AI agents, (b) from on-prem to cloud platforms to AI platforms.
  3. Micro-Agent: GenAI usage, prompts, flow control, tools (MCP); the response contains calls to the OpenAI API. Characteristics: clear responsibility, vertical in terms of expertise, manageably large, potentially reusable. AI agents will be implemented according to the microservice architecture paradigm. (Diagram labels: Micro-Agent, A2A, Tool Server, Business Logic, LLM/LAM/SLM, domain-specific foundation models, MCP; see the tool sketch after the transcript.)
  4. "According to Gartner, 80% of AI PoCs fail on their

    way into productive use." https://www.qaware.de/ki-vom-proof-of-concept-poc-zur-entwicklung/
  5. The 80% Fallacy of AI projects. Source: Juan Pablo Bottaro, LinkedIn Engineering Blog.
  6. The 60% Fallacy of production-ready AI projects: important quality attributes and architectural drivers are either postponed or neglected.
  7. Chatbots and AI assistants: the more specific the use case, the more complex it becomes. The spectrum ranges from ChatGPT (or comparable) with world knowledge, via ChatGPT with organisational context knowledge, to a specialised AI assistant built with Retrieval Augmented Generation, transfer learning, specially trained models, or hyper automation. The simpler variants are easy to realise and relatively cost-efficient but require data protection and compliance guidelines; complexity and benefit grow together.
  8. Key challenges: technology, models and tools, scaling. Source: https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-in-2023-generative-ais-breakout-year ▪ Different challenges are seen depending on the maturity of the group ▪ AI newcomers often underestimate the complexity of technologies, models and tools ▪ Production and scaling challenges often hinder production readiness ▪ High cognitive load and lack of expertise are also drivers for failing projects.
  9. vs

  10. Conceptual demo showcase architecture (diagram labels): Chatbot Web UI (WebSockets), OpenAI Chat Service, Ollama Chat Service, OpenAI Proxy, Beer Service (REST/gRPC), Easy RAG, Ollama model Llama 3.1, ADK Time Agent; components are connected via REST.
  11. The Kubernetes cluster topology requires precise planning, otherwise the costs will go through the roof! ▪ There are different GPU machine types ▪ Not all types are available in all regions ▪ Prices vary drastically, accurate research is recommended ▪ Additional local SSDs are recommended ▪ To be decided: all nodes with GPU, or different nodes optimised for normal as well as GPU workloads. https://cloud.google.com/compute/gpus-pricing?hl=de#other-gpu-models
  12. AI platform planes (diagram labels): Platform Plane (Observability, Operability), Resource Plane (Compute, Data), Integration & Delivery Plane (Integration, Security, Delivery, FinOps), Quality Plane, Data Plane, Model Plane, Compliance Plane, Service Plane, User Serving Plane, Access Plane / APIs, Orchestration Plane, Data Modelling Plane.
  13. The planes in detail (diagram labels): Compliance Plane, Integration & Delivery Plane (Integration, Security, Delivery, FinOps), Service Plane, Platform Plane (Operability), Resource Plane (Compute, Data: local SSD), Quality Plane, Data Plane, Model Plane, User Serving Plane, Access Plane, Data Modelling Plane.
  14. QAware GmbH | Aschauer Straße 30 | 81549 München | Managing Directors: Dr. Josef Adersberger, Michael Stehnken, Michael Rohleder, Mario-Leander Reimer. Offices in Munich, Mainz, Rosenheim, Darmstadt | +49 89 232315-0 | [email protected]. Thank you! The next step? Let's talk. Mario-Leander Reimer, Managing Director, CTO, [email protected], +49 151 61314748
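
To round off the micro-agent idea from slide 3 (business logic exposed as tools the agent can call), here is a small LangChain4J sketch; the class, method, and tool description are invented for illustration and loosely inspired by the Time Agent in the demo architecture.

// Illustrative sketch (assumed names): a piece of business logic exposed as a
// tool; the model decides when to call it based on the description.
import dev.langchain4j.agent.tool.Tool;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class TimeTool {

    @Tool("Returns the current date and time for the given IANA time zone id")
    public String currentTime(String zoneId) {
        return ZonedDateTime.now(ZoneId.of(zoneId)).toString();
    }
}

With quarkus-langchain4j such a class is typically attached to an AI service via the tools attribute of @RegisterAiService, or, following slide 3, published through an MCP tool server so that other agents can reach it as well.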