Turbocharging AI Innovation: How AI Platforms Enable The Bulletproof Deployment of GenAI Use Cases. #SAG2024

Generative AI is the talk of the town. Anyone who spends just five minutes thinking about AI can surely come up with several useful business use cases. However, all too often, we find ourselves facing the following dilemma: we want to quickly launch our chatbots and assistant systems and bring our ideas to market readiness. Yet at the same time, important, complex, cross-functional aspects such as data protection, compliance, operational readiness, or model fine-tuning often slow down rapid development and deployment.

Furthermore, enterprise-scale AI projects often involve many different stakeholders: data engineers, AI specialists, software engineers, operational experts, and business departments. The result is too much talking and no progress at all.

AI platforms to the rescue! We believe that established platform engineering approaches and technologies, combined with LLM Ops practices, can tackle this dilemma. Only a robust, scalable, and flexible platform enables our teams to efficiently develop, operate, and manage their data, models, and applications. The platform hides the inherent technical complexity, while allowing users to fully focus on the use case and the creation of value and innovation.

We will explore what a corporate AI platform can look like and the components and services it requires. We discuss how a company-wide platform strategy not only simplifies technical implementation but also creates an ecosystem for innovation, fosters collaboration, increases reusability, and ultimately drastically shortens the time to market.

M.-Leander Reimer

November 12, 2024

Transcript

  1. qaware.de Turbocharging AI Innovation How AI Platforms Enable The Bulletproof

    Deployment of GenAI Use Cases Mario-Leander Reimer © 2024 QAware
  2. Platform engineering is the discipline of designing and building toolchains

    and workflows that enable self-service capabilities for software engineering organizations in the cloud-native era. Platform engineers provide an integrated product most often referred to as an “Internal Developer Platform” covering the operational necessities of the entire lifecycle of an application. https://platformengineering.org/blog/what-is-platform-engineering
  3. “Too much cognitive load will become a bottleneck for fast

    flow and high productivity for many teams.” ▪ Intrinsic Cognitive Load: relates to fundamental aspects and knowledge in the problem space (e.g. languages, APIs, frameworks) ▪ Extraneous Cognitive Load: relates to the environment (e.g. console commands, deployment, configuration) ▪ Germane Cognitive Load: relates to specific aspects of the business domain (a.k.a. “value-added” thinking)
  4. An IDP and your platform engineers are key enablers for

    high productivity of the stream-aligned DevOps teams. ▪ Responsible for building and operating a platform that enables and supports the teams in their day-to-day development work. ▪ The platform aims to hide the inherent complexity to reduce the cognitive load for the other teams. – Standardization (Compliance, Security, …) – Developer Self-Service ▪ Fully automated software delivery is the goal! https://hennyportman.wordpress.com/2020/05/25/review-team-topologies/
  5. AI platform engineering is the discipline of designing and building

    toolchains and workflows to provide self-service capabilities for data- and AI-driven organizations. Business experts, data engineers, and software engineers work together on an integrated platform, from now on referred to as an “Enterprise AI Platform”, covering the operational necessities of the entire lifecycle of AI use cases. © 2024, M.-Leander Reimer
  6. The most common uses for GenAI tools are in marketing,

    sales, product development and service operations. Source: https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-in-2023-generative-ais-breakout-year
  7. Use Case: Customer Support Support Resource Research Support Flow Customer

    Data Images and Icons were generated with the assistance of AI RAG Automation Intent Recognition Text2Speech Speech2Text Anomaly Detection Similarity Matching AI Assistant Call Chat Multi Agent Workflow
  8. RAG in a Nutshell.

    Ingestion Phase: Documents → Indexing (Chunking & Embedding) → Index, e.g. Vector DB. Retrieval Phase: Query → Encoding → Context → Prompt → LLM with world knowledge → Response
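The two phases on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the deck's implementation: the helper names (`chunk`, `embed`, `overlap`, `ingest`, `retrieve`, `build_prompt`) are hypothetical, the word-count "embedding" is a toy stand-in for a real embedding model, and a plain in-memory list stands in for the vector DB. The final call to the LLM is left out.

```python
import re
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split a document into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy 'embedding': lower-cased word counts (assumption: a real model goes here)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(a: Counter, b: Counter) -> int:
    """Crude relevance score: number of shared word occurrences."""
    return sum((a & b).values())


def ingest(documents: list[str]) -> list[tuple[str, Counter]]:
    """Ingestion phase: chunk every document and store (chunk, embedding) pairs."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(index: list[tuple[str, Counter]], query: str, k: int = 1) -> list[str]:
    """Retrieval phase: encode the query and return the k best-scoring chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda entry: overlap(q, entry[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved context and the user query into one prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


documents = [
    "Passwords can be reset in the account settings page under security.",
    "Invoices are sent by email on the first day of each month.",
]
index = ingest(documents)
question = "How do I reset my password?"
context = retrieve(index, question, k=1)
prompt = build_prompt(question, context)
print(prompt)
```

In a real setup the prompt would then be sent to the LLM, and the quality of the answer depends mostly on chunk size and on how well the embedding model captures meaning rather than exact words.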
  9. From input to embedding: this is how a high-performance semantic

    search using vector databases works. Embedding Model Images were generated with the assistance of AI { 23.567, 45.899, 76.345, …}
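A rough sketch of the search step behind a vector database, assuming the embeddings already exist as plain lists of floats like the one shown on the slide. The `VectorIndex` class and the three-dimensional example vectors are hypothetical and only for illustration; real vector databases use approximate nearest-neighbour indexes instead of the brute-force scan shown here.

```python
import math


def normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit length so that a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


class VectorIndex:
    """Brute-force in-memory index (hypothetical stand-in for a vector DB)."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, item_id: str, vector: list[float]) -> None:
        self._items.append((item_id, normalize(vector)))

    def search(self, query: list[float], k: int = 3) -> list[tuple[str, float]]:
        """Return the k items whose embeddings are most similar to the query."""
        q = normalize(query)
        scored = [
            (item_id, sum(a * b for a, b in zip(q, vec)))
            for item_id, vec in self._items
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


vector_index = VectorIndex()
vector_index.add("cat", [0.9, 0.1, 0.0])
vector_index.add("dog", [0.8, 0.2, 0.1])
vector_index.add("invoice", [0.0, 0.1, 0.9])

hits = vector_index.search([1.0, 0.0, 0.0], k=2)
print([item_id for item_id, _ in hits])
```

Normalizing at insertion time keeps the query path to one dot product per stored vector, which is the same idea production systems apply before adding an approximate index on top.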
  10. Chatbots and AI Assistants The more specific the use case,

    the more complex it becomes. ChatGPT or comparable with world knowledge ChatGPT with organisational context knowledge Specialized AI Assistant ▪ Retrieval Augmented Generation ▪ Transfer Learning ▪ Specially trained model ▪ Process automation Complexity Benefit ▪ Easy to use and cost efficient ▪ Requires guidelines on data protection and compliance
  11. Each stakeholder involved has a different expertise and thus a

    different focus. Domain Expert Software Engineers and Architects This image was generated with the assistance of AI Data Scientists, AI Experts Platform Engineers
  12. It all starts with an understanding of data and use

    cases. This is crucial for a structured start in any AI project. Business Understanding Data Understanding Data Preparation Modelling Evaluation Deployment
  13. "But we are already doing this!" Really? MLOps only covers

    part of the tasks related to GenAI. Source: https://neptune.ai/blog/mlops
  14. Integration & Delivery Plane Service Plane Platform Plane Resource Plane

    Quality Plane Compliance Plane Foundation Foundation Interfaces Domain Service Domain Service Domain Service
  15. Compliance Plane Integration & Delivery Plane Service Plane Data Plane

    Platform Plane Observability Operability Resource Plane User Serving Plane Access Plane / APIs Orchestration Plane Data Modelling Plane Model Plane Compute Data Integration Security Delivery FinOps Quality Plane
  16. Quality Plane Integration & Delivery Plane Service Plane Access Plane/APIs

    User Serving Plane Orchestration Plane Data Modelling Pl. Data Plane Model Plane Compliance Plane Platform Plane Observability: Monitoring, Logging, Tracing Security: Secrets, IAM, Encryption, Certs, … Scale, Backups, Recovery, … Delivery: CI/CD, Registry, Pipelines, Orchestrator, … FinOps Resource Plane Compute: CPU and GPU Data: Vector DBaaS, other Storage, … Integration: Self-hosted LLMs, Public LLMs, Managed AI Services
  17. Quality Plane Integration & Delivery Plane Service Plane Access Plane/APIs

    User Serving Plane “Convenience UIs”, Self Service, RAG via Drag and Drop, … (a) LLM, Embedding, (b) RAG, Chatbot, … (c) Data Access, … Orchestration Plane Data Modelling Pl. Playground Prompt Engineering Configuration Runtime, Instantiation, Orchestration, Scaling, Configuration Data Plane Model Plane Compliance Plane Platform Plane Observability: Monitoring, Logging, Tracing Security: Secrets, IAM, Encryption, Certs, … Scale, Backups, Recovery, … Delivery: CI/CD, Registry, Pipelines, Orchestrator, … FinOps Resource Plane Compute: CPU and GPU Data: Vector DBaaS, other Storage, … Integration: Self-hosted LLMs, Public LLMs, Managed AI Services
  18. Quality Plane Integration & Delivery Plane Service Plane Access Plane/APIs

    User Serving Plane Technical and Business Metrics like Accuracy, Harmfulness, … Test Automation for LLMs “Convenience UIs”, Self Service, RAG via Drag and Drop, … (a) LLM, Embedding, (b) RAG, Chatbot, … (c) Data Access, … Orchestration Plane Data Modelling Pl. Playground Prompt Engineering Configuration Runtime, Instantiation, Orchestration, Scaling, Configuration Data Plane Ingestion Pipelines Data Versioning Embeddings & Vectorization Model Plane MLOps: Model Registry Model Management Experiment Tracking Model Serving Compliance Plane Tonality, Bias Security, Data Protection Platform Plane Observability: Monitoring, Logging, Tracing Security: Secrets, IAM, Encryption, Certs, … Scale, Backups, Recovery, … Delivery: CI/CD, Registry, Pipelines, Orchestrator, … FinOps Resource Plane Compute: CPU and GPU Data: Vector DBaaS, other Storage, … Integration: Self-hosted LLMs, Public LLMs, Managed AI Services
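As one possible reading of "Test Automation for LLMs" in the Quality Plane, the sketch below runs a fixed set of prompts against a model and reports two metrics named on the slide. Everything here is an assumption for illustration: `EvalCase`, `evaluate`, `stub_model`, and `BLOCKLIST` are hypothetical names, keyword matching is a crude proxy for accuracy, and a word blocklist is a crude proxy for harmfulness. Dedicated tools such as those listed later in the deck go much further.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    expected_keywords: tuple[str, ...]


# Hypothetical blocklist; a real setup would use a proper harmfulness classifier.
BLOCKLIST = ("stupid", "idiot")


def evaluate(model: Callable[[str], str], cases: list[EvalCase]) -> dict[str, float]:
    """Run every case through the model and return accuracy and harmfulness rates."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    correct = harmful = 0
    for case in cases:
        answer = model(case.prompt).lower()
        # Counted as correct only if all expected keywords appear in the answer.
        if all(keyword.lower() in answer for keyword in case.expected_keywords):
            correct += 1
        if any(term in answer for term in BLOCKLIST):
            harmful += 1
    return {"accuracy": correct / len(cases), "harmfulness": harmful / len(cases)}


def stub_model(prompt: str) -> str:
    """Canned answers standing in for a real LLM call (assumption)."""
    canned = {
        "How do I reset my password?": "Open the account settings and choose 'Reset password'.",
        "When are invoices sent?": "That is a stupid question.",
    }
    return canned.get(prompt, "I don't know.")


cases = [
    EvalCase("How do I reset my password?", ("settings", "reset")),
    EvalCase("When are invoices sent?", ("first day", "month")),
]
report = evaluate(stub_model, cases)
print(report)
```

In a delivery pipeline, a check like this could fail the build when accuracy drops below, or harmfulness rises above, a threshold agreed with the business side.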
  19. All roads lead to Rome. Depending on the context, one

    or the other variant makes sense. Buy an AI platform solution Combination of cloud provider building blocks Custom platform with open source components
  20. Azure AI Studio (Preview) Azure AI Content Safety Quality Plane

    Integration & Delivery Plane Service Plane Azure API Management Access Plane Azure AI Studio (Preview) User Serving Plane Azure AI Studio (Preview) Semantic Kernel Orchestration Plane Azure AI Document Intelligence Data Modelling Pl. Azure AI Search with Indexers, Indices incl. Vector DBs. OneLake, Fabric Data Plane Azure OpenAI Azure Machine Learning Model Plane Azure AI Content Safety Compliance Plane Platform Plane Observability Security Scale, Backups, Recovery, … Delivery FinOps Resource Plane Compute Data Azure OpenAI Azure AI Language Speech Service Azure AI Translator Integration Overview on Azure AI Services: https://learn.microsoft.com/en-us/azure/ai-services/what-are-ai-services
  21. Azure AI Studio (Preview) Azure AI Content Safety Quality Plane

    Integration & Delivery Plane Service Plane Azure API Management Access Plane Azure AI Studio (Preview) User Serving Plane Azure AI Studio (Preview) Semantic Kernel Orchestration Plane Azure AI Document Intelligence Data Modelling Pl. Azure AI Search with Indexers, Indices incl. Vector DBs. OneLake, Fabric Data Plane Azure OpenAI Azure Machine Learning Model Plane Azure AI Content Safety Compliance Plane Platform Plane Observability Security Scale, Backups, Recovery, … Delivery FinOps Resource Plane Compute Data Azure OpenAI Azure AI Language Speech Service Azure AI Translator Integration Overview on Azure AI Services: https://learn.microsoft.com/en-us/azure/ai-services/what-are-ai-services Just give it a try. Or ask Azure experts.
  22. MLflow, Evidently AI, RAGAS (for RAG), DeepEval (for LLM) Quality

    Plane Integration & Delivery Plane Service Plane API Gateways Access Plane Build your own User Serving Plane Kubeflow Orchestration Plane Jupyter Kubeflow Data Modelling Pl. Weaviate, Neo4j, … Custom Pipelines Data Plane MLflow (Registry) BentoML (Serving) Kubeflow (Serving) Model Plane Build your own Compliance Plane Platform Plane Observability Security Scale, Backups, Recovery, … Delivery FinOps Resource Plane Compute Data LLMs: Llama, Mistral, … MLflow BentoML Integration
  23. MLflow, Evidently AI, RAGAS (for RAG), DeepEval (for LLM) Quality

    Plane Integration & Delivery Plane Service Plane API Gateways Access Plane Build your own User Serving Plane Kubeflow Orchestration Plane Jupyter Kubeflow Data Modelling Pl. Weaviate, Neo4j, … Custom Pipelines Data Plane MLflow (Registry) BentoML (Serving) Kubeflow (Serving) Model Plane Build your own Compliance Plane Platform Plane Observability Security Scale, Backups, Recovery, … Delivery FinOps Resource Plane Compute Data LLMs: Llama, Mistral, … MLflow BentoML Integration Use at your own risk! Or ask an AI platform expert.
  24. Start lean and agile! Tailored to the domain and problem

    instead of ‘One size fits all’. Use Case Identification Business Understanding Skill, Resource & Requirements Analysis Building Block Mapping & Prioritization Implementation Evaluation Commoditization
  25. QAware GmbH | Aschauer Straße 30 | 81549 München |

    Managing Directors: Dr. Josef Adersberger, Michael Stehnken, Michael Rohleder, Mario-Leander Reimer | Offices in München, Mainz, Rosenheim, Darmstadt | +49 89 232315-0 | [email protected] Thank you!