Slide 1

Slide 1 text

Turbocharging AI Innovation: How AI Platforms Enable the Bulletproof Deployment of GenAI Use Cases. Mario-Leander Reimer, qaware.de. © 2024 QAware

Slide 2

Slide 2 text

Mario-Leander Reimer, Managing Director | CTO. @LeanderReimer #cloudnativenerd #qaware #gernperDude

Slide 3

Slide 3 text

Platform engineering is the discipline of designing and building toolchains and workflows that enable self-service capabilities for software engineering organizations in the cloud-native era. Platform engineers provide an integrated product most often referred to as an “Internal Developer Platform” covering the operational necessities of the entire lifecycle of an application. https://platformengineering.org/blog/what-is-platform-engineering

Slide 4

Slide 4 text

“Too much cognitive load will become a bottleneck for fast flow and high productivity for many teams.”
■ Intrinsic Cognitive Load: relates to fundamental aspects and knowledge in the problem space (e.g. languages, APIs, frameworks)
■ Extraneous Cognitive Load: relates to the environment (e.g. console commands, deployment, configuration)
■ Germane Cognitive Load: relates to specific aspects of the business domain (aka “value-added” thinking)

Slide 5

Slide 5 text

An IDP and your platform engineers are key enablers for high productivity of the stream-aligned DevOps teams.
■ Platform engineers are responsible for building and operating a platform that enables and supports the teams in their day-to-day development work.
■ The platform aims to hide the inherent complexity to reduce the cognitive load for the other teams.
– Standardization (Compliance, Security, …)
– Developer Self-Service
■ Fully automated software delivery is the goal!
https://hennyportman.wordpress.com/2020/05/25/review-team-topologies/

Slide 6

Slide 6 text

AI platform engineering is the discipline of designing and building toolchains and workflows to provide self-service capabilities for data- and AI-driven organizations. Business experts, data engineers and software engineers work together on an integrated platform, referred to from now on as an “Enterprise AI Platform”, covering the operational necessities of the entire lifecycle of AI use cases. © 2024, M.-Leander Reimer

Slide 7

Slide 7 text

Endless Possibilities and Use Cases: Chatbots, CWYD (Chat With Your Data), Content Creation

Slide 8

Slide 8 text

The most common uses for GenAI tools are in marketing, sales, product development and service operations. Source: https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-in-2023-generative-ais-breakout-year

Slide 9

Slide 9 text

Use Case: Customer Support. Support Resources Research, Support Flow, Customer Data. RAG, Automation, Intent Recognition, Text2Speech, Speech2Text, Anomaly Detection, Similarity Matching. AI Assistant, Call, Chat, Multi-Agent Workflow. (Images and icons were generated with the assistance of AI.)

Slide 10

Slide 10 text

RAG in a Nutshell.
Ingestion Phase: Documents → Indexing (Chunking & Embedding) → Index, e.g. Vector DB
Retrieval Phase: Query → Encoding → Context → Prompt → LLM with world knowledge → Response
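The two RAG phases can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation behind the slide: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and all function names and sample documents are hypothetical. The final LLM call is left out; the sketch stops at the assembled prompt.

```python
import math
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split a document into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a real embedding model."""
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents: list[str]) -> list[tuple[str, Counter]]:
    """Ingestion phase: chunk every document and store (chunk, embedding) pairs."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(index: list[tuple[str, Counter]], query: str, k: int = 1) -> list[str]:
    """Retrieval phase: encode the query and return the k most similar chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine retrieved context and the question into one prompt for the LLM."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical sample documents for illustration only.
docs = [
    "Password resets are handled in the account settings page.",
    "Invoices are sent by email on the first day of each month.",
]
index = build_index(docs)
question = "When are invoices sent?"
prompt = build_prompt(question, retrieve(index, question))
print(prompt)
```

In a real platform the `embed` and index parts would be replaced by an embedding model and a vector database, and `prompt` would be passed to an LLM.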

Slide 11

Slide 11 text

From input to embedding: this is how a high-performance semantic search using vector databases works. Input → Embedding Model → { 23.567, 45.899, 76.345, … } (Images were generated with the assistance of AI.)
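Once inputs are embedded as vectors, semantic search reduces to finding the stored vectors closest to the query vector. The following sketch shows this with cosine similarity over a tiny in-memory index. The three-dimensional vectors and their labels are made up for illustration; real embedding models produce vectors with hundreds or thousands of dimensions, and vector databases typically use approximate nearest-neighbour indexes rather than the exhaustive scan shown here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the labels of the k stored vectors most similar to the query."""
    ranked = sorted(index, key=lambda label: cosine_similarity(query, index[label]), reverse=True)
    return ranked[:k]


# Hypothetical embeddings: similar meanings get similar vectors.
index = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}
query_vector = [0.85, 0.15, 0.05]  # assumed embedding of a cat-related query
top = nearest(query_vector, index, k=2)
print(top)
```

The two animal-related entries rank above "invoice" because their vectors point in nearly the same direction as the query vector.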

Slide 12

Slide 12 text

Chatbots and AI Assistants: The more specific the use case, the more complex it becomes. ChatGPT or comparable with world knowledge. ChatGPT with organisational context knowledge. Specialized AI Assistant ■ Retrieval-Augmented Generation ■ Transfer Learning ■ Specially trained model ■ Process automation Complexity Benefit ■ Easy to use and cost-efficient ■ Requires guidelines on data protection and compliance

Slide 13

Slide 13 text

Why do we need an AI platform?

Slide 14

Slide 14 text

The 80% Fallacy Juan Pablo Bottaro, LinkedIn Engineering Blog

Slide 15

Slide 15 text

Key challenges: models, tools and skills. Source: https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-in-2023-generative-ais-breakout-year

Slide 16

Slide 16 text

Each stakeholder involved has a different expertise and thus a different focus. Domain Expert | Software Engineers and Architects | Data Scientists, AI Experts | Platform Engineers (This image was generated with the assistance of AI.)

Slide 17

Slide 17 text

It all starts with an understanding of data and use cases. This is crucial for a structured start in any AI project. Business Understanding → Data Understanding → Data Preparation → Modelling → Evaluation → Deployment

Slide 18

Slide 18 text

"But we are already doing this!" Really? MLOps only covers part of the tasks related to GenAI. Source: https://neptune.ai/blog/mlops

Slide 19

Slide 19 text

Our proposal for an AI platform architecture

Slide 20

Slide 20 text

Integration & Delivery Plane Service Plane Platform Plane Resource Plane Quality Plane Compliance Plane Foundation Foundation Interfaces Domain Service Domain Service Domain Service

Slide 21

Slide 21 text

Compliance Plane Integration & Delivery Plane Service Plane Data Plane Platform Plane Observability Operability Resource Plane User Serving Plane Access Plane / APIs Orchestration Plane Data Modelling Plane Model Plane Compute Data Integration Security Delivery FinOps Quality Plane

Slide 22

Slide 22 text

Quality Plane Integration & Delivery Plane Service Plane Access Plane/APIs User Serving Plane Orchestration Plane Data Modelling Plane Data Plane Model Plane Compliance Plane Platform Plane Observability: Monitoring, Logging, Tracing Security: Secrets, IAM, Encryption, Certs, … Scale, Backups, Recovery, … Delivery: CI/CD, Registry, Pipelines, Orchestrator, … FinOps Resource Plane Compute: CPU and GPU Data: Vector DBaaS, other Storage, … Integration: Self-hosted LLMs, Public LLMs, Managed AI Services

Slide 23

Slide 23 text

Quality Plane Integration & Delivery Plane Service Plane Access Plane/APIs User Serving Plane “Convenience UIs”, Self Service, RAG per Drag and Drop, … (a) LLM, Embedding, (b) RAG, Chatbot, … (c) Data Access, … Orchestration Plane Data Modelling Plane Playground, Prompt Engineering, Configuration Runtime, Instantiation, Orchestration, Scaling, Configuration Data Plane Model Plane Compliance Plane Platform Plane Observability: Monitoring, Logging, Tracing Security: Secrets, IAM, Encryption, Certs, … Scale, Backups, Recovery, … Delivery: CI/CD, Registry, Pipelines, Orchestrator, … FinOps Resource Plane Compute: CPU and GPU Data: Vector DBaaS, other Storage, … Integration: Self-hosted LLMs, Public LLMs, Managed AI Services

Slide 24

Slide 24 text

Quality Plane Integration & Delivery Plane Service Plane Access Plane/APIs User Serving Plane Technical and Business Metrics like Accuracy, Harmfulness, … Test Automation for LLMs “Convenience UIs”, Self Service, RAG per Drag and Drop, … (a) LLM, Embedding, (b) RAG, Chatbot, … (c) Data Access, … Orchestration Plane Data Modelling Plane Playground, Prompt Engineering, Configuration Runtime, Instantiation, Orchestration, Scaling, Configuration Data Plane Ingestion Pipelines, Data Versioning, Embeddings & Vectorization Model Plane MLOps: Model Registry, Model Management, Experiment Tracking, Model Serving Compliance Plane Tonality, Bias, Security, Data Protection Platform Plane Observability: Monitoring, Logging, Tracing Security: Secrets, IAM, Encryption, Certs, … Scale, Backups, Recovery, … Delivery: CI/CD, Registry, Pipelines, Orchestrator, … FinOps Resource Plane Compute: CPU and GPU Data: Vector DBaaS, other Storage, … Integration: Self-hosted LLMs, Public LLMs, Managed AI Services
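The Quality Plane items "Technical and Business Metrics like Accuracy" and "Test Automation for LLMs" can be pictured as an automated regression suite that runs prompts against a model and gates on a score. The sketch below is an assumption about what such a check might look like, not a description of any specific tool: `fake_model` is a hypothetical stand-in for a real LLM endpoint, and the test cases and the 0.6 threshold are invented for illustration.

```python
from typing import Callable


def accuracy(model: Callable[[str], str], cases: list[tuple[str, str]]) -> float:
    """Fraction of cases whose expected keyword appears in the model's answer."""
    hits = sum(1 for prompt, expected in cases if expected.lower() in model(prompt).lower())
    return hits / len(cases)


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns canned answers."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is 2 + 2?": "2 + 2 equals 4.",
    }
    return canned.get(prompt, "I don't know.")


# Invented test cases: (prompt, keyword expected in the answer).
cases = [
    ("What is the capital of France?", "Paris"),
    ("What is 2 + 2?", "4"),
    ("Who wrote Faust?", "Goethe"),
]

score = accuracy(fake_model, cases)
THRESHOLD = 0.6  # assumed quality gate for a delivery pipeline
passed = score >= THRESHOLD
print(f"accuracy={score:.2f} passed={passed}")
```

Keyword matching is a deliberately crude metric; metrics such as harmfulness or tonality would need dedicated evaluators, which is where tools from the Quality Plane come in.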

Slide 25

Slide 25 text

From concept to realisation: possible variants

Slide 26

Slide 26 text

All roads lead to Rome. Depending on the context, one variant or another makes sense.
■ Buy an AI platform solution
■ Combination of cloud provider building blocks
■ Custom platform with open source components

Slide 27

Slide 27 text

Azure AI Studio (Preview) Azure AI Content Safety Quality Plane Integration & Delivery Plane Service Plane Azure API Management Access Plane Azure AI Studio (Preview) User Serving Plane Azure AI Studio (Preview) Semantic Kernel Orchestration Plane Azure AI Document Intelligence Data Modelling Pl. Azure AI Search with Indexers, Indices incl. Vector DBs. OneLake, Fabric Data Plane Azure OpenAI Azure Machine Learning Model Plane Azure AI Content Safety Compliance Plane Platform Plane Observability Security Scale, Backups, Recovery, … Delivery FinOps Resource Plane Compute Data Azure OpenAI Azure AI Language Speech Service Azure AI Translator Integration Overview on Azure AI Services: https://learn.microsoft.com/en-us/azure/ai-services/what-are-ai-services

Slide 28

Slide 28 text

Just give it a try. Or ask Azure experts.

Slide 29

Slide 29 text

mlflow, Evidently AI, RAGAS (for RAG), DeepEval (for LLM) Quality Plane Integration & Delivery Plane Service Plane API Gateways Access Plane Build your own User Serving Plane Kubeflow Orchestration Plane Jupyter Kubeflow Data Modelling Pl. Weaviate, neo4J, … Custom Pipelines Data Plane mlflow (Registry) BentoML (Serving) Kubeflow (Serving) Model Plane Build your own Compliance Plane Platform Plane Observability Security Scale, Backups, Recovery, … Delivery FinOps Resource Plane Compute Data LLMs: Llama, Mistral, … mlflow BentoML Integration

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

Use at your own risk! Or ask an AI platform expert.

Slide 32

Slide 32 text

Which one is right for me?

Slide 33

Slide 33 text

Start lean and agile! Tailored to the domain and problem instead of ‘One size fits all’. Use Case Identification Business Understanding Skill, Resource & Requirements Analysis Building Block Mapping & Prioritization Implementation Evaluation Commoditization

Slide 34

Slide 34 text

Thank you! QAware GmbH | Aschauer Straße 30 | 81549 München | Managing Directors: Dr. Josef Adersberger, Michael Stehnken, Michael Rohleder, Mario-Leander Reimer | Offices in München, Mainz, Rosenheim, Darmstadt | +49 89 232315-0 | [email protected]