Architecture for Generative AI

What are the key architectural concerns around generative AI?
How can generative AI be designed and implemented from an architect's point of view?

William El Kaim

December 31, 2024

Transcript

  1. William El Kaim, PhD

    • IT Architecture Director, BCG Platinion, Paris
    • SAP Community enterprise architecture group volunteer
    • Leader of the EMESA Platinion Ent. Arch. practice, framework and tools community
    • BCG Platinion Blog: https://bcgplatinion.com/insights
    • Personal Blog: https://el-kaim.com/

    Copyright © 2023 by Boston Consulting Group. All rights reserved.
  2. Agenda

    • Agenda for today: Architecture for Gen AI. Deploying Gen AI at scale raises requirements on architecture management and governance.
    • Agenda for tomorrow: Gen AI for Architects. New Gen AI capabilities that could impact the architect's discipline.

    Image created by Midjourney, prompt: future tech version of mind-blowing chatbot, ultra-realistic, photo, --ar 3:4
  3. Paradigm shift due to a combination of 5 major architecture evolutions

    • Data Driven: Becoming a data-driven organization is one of the top strategic goals of most companies today. But despite increasing effort and investment in building data factories and data platforms, companies find the results unsatisfying (facing challenges with monolithic legacy data systems and a change-resistant culture). A new enterprise data architecture concept, "Data Mesh", is gaining traction, making it easier to build and operate distributed data architecture at scale.
    • Cloud Native: Companies face multiple barriers when adopting cloud: mental barriers on business risks and compliance, lack of experienced in-house skills, and lack of knowledge of the existing IT and data landscape. Effective and value-driven cloud usage requires programmatic automation (everything as code), affecting the entire software development and deployment lifecycle and sometimes the organization as well.
    • AI empowered: Applications can be "augmented" by intelligent capabilities, built outside and integrated, or directly embedded. Building or selecting the right AI model requires new skills (Data Scientist), and implementing it at scale has proven complex. AI model performance can drift when black swan events (like COVID-19) occur, and requires active guardrails. The generative AI tidal wave has complicated this further, bringing new concepts, technologies and security risks not mastered by most Enterprise Architects.
    • Composable: Composable architecture is a design pattern that allows an application to be built by seamlessly assembling reusable components. The objectives are to reduce risk and cost and to react quickly to business changes. To fight against the monolith, the MACH architecture (Microservices, API-first, Cloud native, Headless) emerged. SAP recommends adopting a clean core, with satellites and a common integration layer to cement the whole (SAP BTP).
    • Frugal: Dr. Werner Vogels, the CTO at AWS, dedicated the first part of his re:Invent keynote to discussing the laws of frugal architectures: cloud-native architectures that aim to deliver cost-aware, sustainable, and maintainable solutions. Vogels laid out seven simple laws based on his and AWS's experience building and evolving cloud platform services, with cost implications as one of the primary drivers.

    Source: BCG Platinion
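
The "active guardrails" against model drift mentioned above can be illustrated with a minimal sketch. It assumes that a quality score (for example accuracy) is measured regularly for the model; the function name `has_drifted` and the 0.05 tolerance are hypothetical choices for illustration, not something the slides prescribe.

```python
from statistics import mean
from typing import Sequence


def has_drifted(baseline: Sequence[float],
                recent: Sequence[float],
                tolerance: float = 0.05) -> bool:
    """Return True when the mean recent score is more than `tolerance`
    below the mean baseline score (hypothetical drift rule)."""
    return mean(baseline) - mean(recent) > tolerance


# Scores measured before and after a sudden change in the input data.
baseline_scores = [0.90, 0.92, 0.91]
recent_scores = [0.80, 0.78, 0.82]

if has_drifted(baseline_scores, recent_scores):
    print("Drift detected: trigger review or retraining")
```

A production guardrail would typically use a statistical test over larger windows and monitor input data as well as outputs; the fixed threshold here only shows where such a check sits.
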
  4. From Application Driven to Data Driven …

    From an "application-driven" approach…
    • Apps created to deliver a process through a seamless user journey
    • Fast-paced iterations (days to weeks) to add & improve features
    • Evolves quickly over time to adapt to changing Business priorities
    • Key area of focus: Transactional data (Core systems, servitization (API), Event-driven)

    …Towards a "data-driven" approach
    • Focus on generating right quality data for applications and algorithms to refine into valuable Business insights
    • Automated treatment pipelines to refresh insights from minutes to real time
    • Stable in time (automatically adapting when Business data content evolves)
    • Key area of focus: Master Data & Analytics (Data Mesh, Modern Data Platform, Stream processing)

    Source: BCG Platinion
  5. How Winners Stand Apart?

    Leaders in scaling and generating value from AI do three things better than other companies:
    1. They prioritize the highest-impact use cases and scale them quickly to maximize value
    2. They make data and technology accessible across the organization, avoiding siloed and incompatible tech stacks and standalone databases that impede scaling
    3. They recognize the importance of aligned leadership and employees who build and leverage AI, and they support staff who promote collaboration and end-to-end agile product delivery

    BCG Benchmark proved that scaling AI pays off, no matter the investment …
    • Leaders scale AI projects two to three times faster
    • Leaders move from idea to execution at scale in a matter of months (typically just 5 to 7) while other companies take an average of 15 to 17 months
    • As a result of this speed advantage, leaders can scale up 44% of the use cases in their portfolios, more than twice the 19% of other companies

    Source: BCG
  6. (Gen) AI System is a new generation of application enterprise architects will have to deal with

    • Legacy applications are more "static", since evolving data or software requires building a new release
    • Diagram elements: Data, Models, Code, Guardrail

    Inspired by the Martin Fowler blog post "Continuous Delivery for Machine Learning"
    Image created by Midjourney, prompt: future tech version of robot looking through a magnifier, --ar 3:4
  7. Start from your existing enterprise architecture playbook, and "extend" or rebuild its key artefacts

    • IT strategy: Define long-term vision and strategic direction of IT
    • Architecture Principles: Define general rules and guidelines for use and deployment of all IT resources and assets across the enterprise. Form the basis for making future IT decisions
    • Reference Architecture: Define ("North Star") architecture vision for industry or business domains
    • Architecture Patterns: Define standardized and best-practice architectural solutions
    • Architecture Blueprints: Capture as-is and target state of architecture and make it accessible to the architect community
    • Tech Standards: Define which IT products and services are allowed for building arch. solutions
    • Procedures and Guardrails: Provide and define lower-level architecture design guidelines and guardrails
    • Architecture implementation: Design target, as-is, solution architecture, and architecture roadmap. Implement solution architectures

    Diagram labels: Direction, Operationalization, Run; Focus for today

    Source: BCG Platinion
  8. Patterns guide the design of the "AI empowered" architecture principle

    AI empowered

    Enterprise LLM available via API switchboard
    • LLMs will be considered an enterprise IS service, accessible on demand
    • LLMs will be provided as a service with strong AI guardrails and security
    • Usage of the LLMs will be facilitated through design patterns and guardrails
    • APIs will be the preferred way to consume the service (and prompt size limitations will have to be overcome)

    Embedded (local) LLM for tactical need
    • LLMs will be used in a local context, in both data-centric and application-centric use cases
    • Local LLMs will be open source and trained or fine-tuned specifically to solve a particular issue

    Architecture patterns
    • Model as a Service: Provide access to models executed in inference engines via APIs
    • Intelligence At Scale: Leverage Algorithm and Data As A Service to build reusable intelligent capabilities at scale
    • LLM empowered: Leverage pre-trained models (LLM) to bring intelligence into applications or business processes. Two modes: decoupled or embedded

    Source: BCG Platinion
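
As a minimal sketch of the "Enterprise LLM available via API switchboard" idea, the following shows one entry point that routes prompts to registered model backends and applies simple guardrails first. Everything here is an assumption for illustration: the `Switchboard` class, the blocked-term list, the character limit, and the stub backend are hypothetical and do not come from the slides or from any vendor API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical backend signature: takes a prompt, returns a completion.
Backend = Callable[[str], str]


@dataclass
class Switchboard:
    """Single enterprise entry point routing prompts to LLM backends."""
    max_prompt_chars: int = 200
    blocked_terms: List[str] = field(
        default_factory=lambda: ["password", "secret"])
    backends: Dict[str, Backend] = field(default_factory=dict)

    def register(self, name: str, backend: Backend) -> None:
        self.backends[name] = backend

    def complete(self, model: str, prompt: str) -> str:
        if model not in self.backends:
            raise KeyError(f"unknown model: {model}")
        # Guardrail: reject prompts containing blocked terms.
        lowered = prompt.lower()
        for term in self.blocked_terms:
            if term in lowered:
                raise ValueError(f"prompt rejected by guardrail: '{term}'")
        # Prompt size limit: a real service might chunk or summarize instead.
        if len(prompt) > self.max_prompt_chars:
            raise ValueError("prompt exceeds size limit")
        return self.backends[model](prompt)


def echo_backend(prompt: str) -> str:
    """Stub standing in for a real model call."""
    return f"[stub completion for] {prompt}"


board = Switchboard()
board.register("general", echo_backend)
print(board.complete("general", "Summarize our cloud strategy"))
```

Keeping the guardrails in the switchboard rather than in each application is the design choice the pattern implies: consumers only see one API, and policy changes happen in one place.
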
  9. Example: LLM Empowered pattern

    Problem
    Intelligent capabilities are needed in an application, but they are too complex to build from scratch or with standard AI techniques

    Solution
    Provide access to a pre-trained foundation model (LLM) from the market

    How-To
    • Select the appropriate LLM, if several are available
    • Define how company data will be used: to fine-tune the model or through embeddings
    • Define the prompts needed
    • Select the appropriate language and library to call the LLM
    • Ensure security and AI guardrails are in place

    Technology / booster
    Azure OpenAI, Google Gemini, Anthropic Claude, etc.

    Before: Client Request → Application → Algorithm
    After: Client Request → Cognitive Application → LLM

    Source: BCG Platinion
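
The How-To steps above can be sketched as a small "cognitive application" that builds a prompt from a template, calls an injected LLM client, and checks the answer before returning it. This is a sketch under stated assumptions: `cognitive_app`, `passes_guardrail`, the prompt wording, the company name "ExampleCorp" and the stub `fake_llm` are all hypothetical; a real implementation would call the chosen provider's SDK in place of the stub.

```python
from string import Template
from typing import Callable

# Hypothetical client signature: takes a prompt, returns the model's text.
LLMClient = Callable[[str], str]

# "Define the prompts needed": one reusable template.
PROMPT = Template(
    "You are an assistant for $company.\n"
    "Context: $context\n"
    "Question: $question\n"
)


def build_prompt(company: str, context: str, question: str) -> str:
    return PROMPT.substitute(
        company=company, context=context, question=question)


def passes_guardrail(answer: str, max_chars: int = 500) -> bool:
    """Toy output check: non-empty and not excessively long."""
    return 0 < len(answer) <= max_chars


def cognitive_app(question: str, context: str, llm: LLMClient,
                  company: str = "ExampleCorp") -> str:
    prompt = build_prompt(company, context, question)
    answer = llm(prompt)
    if not passes_guardrail(answer):
        return "Sorry, no reliable answer is available."
    return answer


def fake_llm(prompt: str) -> str:
    """Stub model: echoes the last line of the prompt."""
    return "stub answer to: " + prompt.splitlines()[-1]


print(cognitive_app("What is our leave policy?", "HR handbook", fake_llm))
```

Passing the LLM client in as a parameter keeps the application decoupled from the model choice, which matches the "select the appropriate LLM" step: swapping providers does not touch the application logic.
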
  10. Reference Architecture for (Gen)AI shows new layers …

    Layers (legend: extend current capabilities / new capabilities; new architecture elements are numbered on the slide)
    • Smart Business Layer (Systems of engagement): Cognitive Apps Builder, Cognitive services, Cognitive Apps, Conversational Apps
    • (Gen)AI Layer: Model garden (Foundation models, Other small models), Model platform, Orchestration, Guardrails & monitoring, Ops and monitoring, E2E app vendors
    • Data Layer: Repository & Storage, Distribution & Integration, Operational Data Services, Data Products, Data Governance, Master Data Management, Analytical Data Catalog, Data Ops
    • Core Transaction Layer
    • Infra and Cloud Layer: Public Cloud, Private Cloud, Specialized Hardware (GPU & TPU)
    • Also shown: Security, Integration

    Key considerations
    • Model platforms: Support multiple models, privacy controls, performance (1-2 preferred platform(s) in the short run)
    • Foundation models: Model capabilities required for use cases may impact near-term platform selection. Open source for build use cases. (Expect multiple)
    • Orchestration: New capabilities expected to coordinate different models and calls to internal and external APIs
    • Data management: Leverage existing investment in data management, including data mesh, and add new required tech services (vector, prompt storage, etc.)
    • Guardrails: Capabilities to ensure correct behavior of (Gen)AI (e.g., RLHF, red-teaming, constitutional AI)
    • Ops and monitoring: New capabilities to ensure correct operation of (Gen)AI use cases (including models, pipelines and data)
    • Cybersecurity: Extend capabilities required to detect and respond to new threats (e.g., prompt injection, sponge attacks)
    • Infra and cloud: Ensure (Gen)AI choices align with overall hosting strategy (multi/hybrid cloud); plan for higher infra consumption
    • Integration: Integrate new (Gen)AI use cases to existing enterprise systems; leverage e…

    Source: BCG Platinion
  11. GenAI also brings new components and technologies …

    Legend: gray boxes show key components of the stack, with leading tools/systems listed; arrows show the flow of data through the stack.

    Components
    • Data Pipelines (Databricks, Airflow, Unstructured, …)
    • Embedding Model (OpenAI, Cohere, Hugging Face)
    • Vector Database (Pinecone, Weaviate, Chroma, pgvector)
    • Playground (OpenAI, nat.dev, Humanloop)
    • Orchestration (Python/DIY, LangChain, LlamaIndex, ChatGPT)
    • APIs, Plugins (Serp, Wolfram, Zapier, …)
    • LLM Cache (Redis, SQLite, GPTCache)
    • Logging/LLMops (Weights & Biases, MLflow, PromptLayer, Helicone)
    • Validation (Guardrails, Rebuff, Guidance, LMQL)
    • App Hosting (Vercel, Steamship, Streamlit, Modal)
    • LLM APIs and Hosting: Proprietary API (OpenAI, Anthropic), Open API (Hugging Face, Replicate), Cloud Provider (AWS, GCP, Azure, CoreWeave), Opinionated Cloud (Databricks, Anyscale, Mosaic, Modal, Runpod, …)

    Data flows
    • Contextual data provided by app developers to condition LLM output
    • Prompts and few-shot examples that are sent to the LLM
    • Queries submitted by users
    • Output returned to users

    Source: Emerging Architectures for LLM Applications by Matt Bornstein and Rajko Radovanovic
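
The data flow through this stack (contextual data embedded into a vector store, a user query matched against it, a prompt assembled and sent to the LLM, and the output cached) can be sketched with the standard library alone. The word-count "embedding", the in-memory `VectorStore`, the dictionary cache and the `answer` function are deliberately toy stand-ins and assumptions for illustration; they are not the APIs of any of the tools listed above.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (a real stack uses a model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class VectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.items: List[Tuple[Counter, str]] = []

    def add(self, doc: str) -> None:
        self.items.append((embed(doc), doc))

    def search(self, query: str, k: int = 1) -> List[str]:
        q = embed(query)
        ranked = sorted(self.items,
                        key=lambda item: cosine(q, item[0]),
                        reverse=True)
        return [doc for _, doc in ranked[:k]]


def answer(query: str, store: VectorStore,
           llm: Callable[[str], str], cache: Dict[str, str]) -> str:
    if query in cache:                           # LLM cache hit
        return cache[query]
    context = "\n".join(store.search(query))     # contextual data
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    output = llm(prompt)                         # call to the hosted LLM
    cache[query] = output
    return output


store = VectorStore()
store.add("invoices are paid within 30 days")
store.add("vacation requests go through the HR portal")
print(store.search("how are invoices paid"))
```

Validation, logging and orchestration frameworks would wrap the `answer` function in a fuller stack; the sketch only fixes the order in which the components are touched.
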
  12. … which requires further decomposing the LLM Empowered pattern …

    General purpose LLM
    • Commercial pre-trained LLM offered as a service via API or dedicated UI
    • Prompts as the primary interaction mechanism
    • Trained on public data, can be extended with client data
    • Ex: OpenAI, Cohere, Anthropic

    Purpose-Built LLM
    • Enterprise-made LLM offered as a service via API or dedicated UI
    • Use an existing LLM or create your own
    • Integrated in the client's business value chain
    • Trained or fine-tuned specifically for a purpose/business domain
    • Ex: BloombergGPT, Carrefour Hopla, Hippocratic AI

    Agent
    • Agent used to execute complex tasks leveraging multiple LLMs
    • Use an existing LLM or a client-specific one
    • Ex: BCG X AgentX, AutoGPT, AgentGPT, babyAGI, LangChain

    Copilot
    • LLM offered as enterprise applications (SaaS)
    • Available in an app marketplace; can be independent or integrated in an existing app (copilot, plug-in)
    • Works on client data
    • Ex: GitHub Copilot, petal, Careerflow AI, Qmed Copilot

    Source: BCG Platinion
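
To make the "Agent" category concrete, here is a deliberately simplified sketch of one task being executed through several LLMs in sequence. It assumes a fixed plan supplied by the caller, whereas the agent frameworks named above usually let a model decide the next step; `run_agent` and the two stub "models" are hypothetical names for illustration only.

```python
from typing import Callable, Dict, List, Tuple

LLM = Callable[[str], str]


def run_agent(task: str, plan: List[Tuple[str, str]],
              llms: Dict[str, LLM]) -> str:
    """Run each (model_name, instruction) step, feeding each output
    into the next step's prompt."""
    result = task
    for model_name, instruction in plan:
        result = llms[model_name](f"{instruction}: {result}")
    return result


def summarizer(prompt: str) -> str:
    """Stub model: keeps the first 20 characters of the payload."""
    return prompt.split(": ", 1)[1][:20]


def shouter(prompt: str) -> str:
    """Stub model: upper-cases the payload."""
    return prompt.split(": ", 1)[1].upper()


llms = {"summarizer": summarizer, "shouter": shouter}
plan = [("summarizer", "Summarize"), ("shouter", "Emphasize")]
print(run_agent("quarterly sales grew in all regions", plan, llms))
```
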
  13. Including Platform and Model selection

    1. Select Platform & Model provider (selection driven by company context)
       • Select the platform that will provide the IT infrastructure & environment to build, train & deploy Gen AI foundation models
       • The foundation models can be made available via APIs (integrated SaaS/Public cloud providers) or on the client's private/on-prem environment
    2. Select model(s) (selection driven by specific use case)
       • Select the specific model(s) that perform the task to support a specific use case
       • A specific use case can be deployed by a model or a group of models that are supported by the chosen platform/model providers (e.g., Google Vertex supports PaLM, Claude, etc.)

    Platforms and providers shown (non-exhaustive): Google Vertex, AWS Bedrock, open-source models

    Source: BCG Platinion
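
The two-step selection above (platform first, driven by company context; model second, driven by the use case) can be sketched as two filters. The catalog below is entirely hypothetical: the platform names, hosting labels, model names and capability tags are invented for illustration and say nothing about what any real platform offers.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Model:
    name: str
    capabilities: FrozenSet[str]


@dataclass(frozen=True)
class Platform:
    name: str
    hosting: str                  # e.g. "public-cloud" or "on-prem"
    models: Tuple[Model, ...]


def select_platforms(platforms: List[Platform],
                     hosting: str) -> List[Platform]:
    """Step 1: keep platforms compatible with the company context."""
    return [p for p in platforms if p.hosting == hosting]


def select_model(platform: Platform,
                 needed: FrozenSet[str]) -> Optional[Model]:
    """Step 2: first model on the platform covering the use case."""
    for model in platform.models:
        if needed <= model.capabilities:
            return model
    return None


# Hypothetical catalog for illustration only.
CATALOG = [
    Platform("platform-a", "public-cloud", (
        Model("chat-small", frozenset({"chat"})),
        Model("chat-code", frozenset({"chat", "code"})),
    )),
    Platform("platform-b", "on-prem", (
        Model("open-chat", frozenset({"chat"})),
    )),
]

candidates = select_platforms(CATALOG, "public-cloud")
chosen = select_model(candidates[0], frozenset({"chat", "code"}))
print(candidates[0].name, chosen.name if chosen else None)
```
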
  14. GenAI Platform stacks

    Layers
    • Smart Bus. Layer: Conversational and Cognitive App, Cognitive App Builder, Cognitive services
    • AI Layer: Model Garden, Guardrails & monitoring, Orchestration, Model Platform, AI ops
    • Data Layer: Repository & Storage, Ingestion & distribution, Data Product
    • Core Transactions Layer
    • Infrastructure Layer

    Vendors
    • Palantir: New AIP platform. Provides a full environment to create apps. Uses GPT-4 and BERT
    • AWS: Covers all layers. Offers platform and business-centric solutions. Bedrock still in Preview
    • Google: Covers all layers. Offers platform and business-centric solutions. Most GenAI products GA
    • Microsoft: Covers all layers. Offers platform and business-centric solutions enhanced with copilot. Most GenAI products GA
    • Databricks: Data and AI platform. Offers platform and business-centric solutions. Most GenAI products GA
    • C3: AI platform. Provides specific solutions for industries. Uses GPT-3
    • Hugging Face: Data and AI platform. Offers both open-source and commercial models. SaaS platform and business-centric solutions
    • Nvidia: Covers all layers, including hardware. Offers platform and business-centric solutions. Leverages open-source solutions
    • Salesforce: Integration of GPT in Einstein to extend its product offer, starting with commerce
    • Snowflake: Offers platform and business-centric solutions on multiple clouds. Partnership with Microsoft to connect natively with its data and AI tools (Purview, Power Apps, Cognitive services)
    • SAP: Covers all layers. Joule, a new GenAI-powered digital assistant embedded across its suite of business applications. Prompt engineering experience in SAP AI Launchpad. Offers a generative AI hub as part of SAP AI Core; supports LangChain natively

    Source: BCG Experience
  15. (Gen) AI System can be described in LeanIX

    Layers mapped to the LeanIX metamodel V4
    • Smart Business Layer: Cognitive App Builder, Conversational App, Cognitive Services, Cognitive App
    • (Gen)AI layer: AI Ops, Commercial E2E app, Orchestration, Model Garden, Guardrails & monitoring, Model Platform
    • Data Layer: Ingestion & distribution, Repository & storage, Operational Data Services, Data products
    • Core Transaction Layer
    • Infrastructure/Cloud Layer

    [Figure: each element is tagged with a LeanIX metamodel marker (P, A, I, T or D)]

    The "AI model" concept seems to be the only one missing as a first-class object of the LeanIX metamodel.

    Source: BCG Data and Digital Platform
  16. Can I describe my AI systems using the new LeanIX Editor?

    Source: BCG Platinion
  17. The new Enterprise Architect has a role to play in AI adoption at scale

    Build Composable architecture
    • Applications are built using modular cloud native components (including Software as a Service) that can be assembled to create new applications or functionalities
    • Cloud native, API, (Micro)services, PaaS, CI/CD, DevOps, Developer Portal, Event-driven Architecture

    Ensure Intelligence everywhere
    • A data-driven (vs. app-driven) approach ensures data can be leveraged to build AI services and systems at scale
    • New AI tech standards to be defined and deployed as a foundation
    • Modern data platform, Generative AI, Advanced analytics and BI, Data Mesh

    Make Cost a Non-functional Requirement
    • Cloud native architectures are reactive by nature and can be automated by code, enabling cost observability (through resource management) and visibility of sustainability impact
    • App. Portfolio Mgt. and rationalization, move to cloud, Infrastructure as code, Sustainability reporting

    Adapt Organization continuously
    • Domain-driven design applied to the organization
    • Platform / product separation using the 4 "Team Topologies"
    • New roles and responsibilities for architects
    • Platform-based organization, Domain Driven Design, Agile at scale

    Source: BCG Platinion
  18. The services and materials provided by Boston Consulting Group

    (BCG) are subject to BCG's Standard Terms (a copy of which is available upon request) or such other agreement as may have been previously executed by BCG. BCG does not provide legal, accounting, or tax advice. The Client is responsible for obtaining independent advice concerning these matters. This advice may affect the guidance given by BCG. Further, BCG has made no undertaking to update these materials after the date hereof, notwithstanding that such information may become outdated or inaccurate. The materials contained in this presentation are designed for the sole use by the board of directors or senior management of the Client and solely for the limited purposes described in the presentation. The materials shall not be copied or given to any person or entity other than the Client (“Third Party”) without the prior written consent of BCG. These materials serve only as the focus for discussion; they are incomplete without the accompanying oral commentary and may not be relied on as a stand-alone document. Further, Third Parties may not, and it is unreasonable for any Third Party to, rely on these materials for any purpose whatsoever. To the fullest extent permitted by law (and except to the extent otherwise agreed in a signed writing by BCG), BCG shall have no liability whatsoever to any Third Party, and any Third Party hereby waives any rights and claims it may have at any time against BCG with regard to the services, this presentation, or other materials, including the accuracy or completeness thereof. Receipt and review of this document shall be deemed agreement with and consideration for the foregoing. BCG does not provide fairness opinions or valuations of market transactions, and these materials should not be relied on or construed as such. 
Further, the financial evaluations, projected market and financial information, and conclusions contained in these materials are based upon standard valuation methodologies, are not definitive forecasts, and are not guaranteed by BCG. BCG has used public and/or confidential data and assumptions provided to BCG by the Client. BCG has not independently verified the data and assumptions used in these analyses. Changes in the underlying data or operating assumptions will clearly impact the analyses and conclusions.