Vector Databases for generative AI

Abhishek Gupta
Principal Developer Advocate, Amazon Web Services
@abhi_tweeter
linkedin/in/abhirockzz

Generative AI is powered by foundation models

• Pre-trained on vast amounts of unstructured data
• Contain a large number of parameters, which makes them capable of learning complex concepts
• Can be applied in a wide range of contexts
• Customize FMs using your data for domain-specific tasks

Broad choice of models

• Amazon Titan: Titan Text Embeddings, Titan Multimodal Embeddings, Titan Text Lite, Titan Text Express, Titan Image Generator
• Claude: Claude Opus, Claude Haiku, Claude Sonnet, Claude Instant, Claude 2.x
• Llama 2: Llama 2 13B, Llama 2 70B, Llama 2 Chat 13B, Llama 2 Chat 70B
• Cohere: Command, Command Light, Embed English, Embed Multilingual
• Stable Diffusion: Stable Diffusion XL 1.0
• Jurassic-2: Jurassic-2 Ultra, Jurassic-2 Mid
• Mistral: Mistral Large, Mistral 7B, Mixtral 8x7B (text summarization, Q&A, text classification, text completion, code generation)

Use cases listed on the slide for the other model families:
• Summarization, complex reasoning, writing, coding
• Contextual answers, summarization, paraphrasing
• High-quality images and art
• Text generation, search, classification
• Q&A and reading comprehension
• Text summarization, generation, Q&A, search, image generation

LLM limitations: (lack of) access to external or domain-specific data

• Generic generative AI
• Generative AI that knows your business and your customer

Common solutions

• Prompt Engineering
• Retrieval Augmented Generation (RAG)
• Fine Tuning (FT)
• Continued Pre-Training (CPT)

Dimensions compared on the slide: complexity, quality, cost, time

What are Vectors (embeddings)?

Human text → Embedding model → Vector embeddings

New York    0.027  -0.011  -0.023  …
Paris       0.025  -0.009  -0.025  …
Animal     -0.011   0.021   0.013  …
Horse      -0.009   0.019   0.015  …

• Numerical representation of text (vectors) that captures semantics and relationships between words.
• Embedding models capture features and nuances of the text.

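To make "captures semantics and relationships" concrete, here is a minimal sketch that compares the example vectors from the slide using cosine similarity. It assumes only the first three components printed on the slide (real embeddings have many more dimensions), so the numbers are illustrative; the helper names `cosine_similarity` and `most_similar` are hypothetical.

```python
import math

# First three components of each vector as printed on the slide.
# Real embeddings are far longer; these values are illustrative only.
EMBEDDINGS = {
    "New York": [0.027, -0.011, -0.023],
    "Paris": [0.025, -0.009, -0.025],
    "Animal": [-0.011, 0.021, 0.013],
    "Horse": [-0.009, 0.019, 0.015],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word):
    """Return the other word whose vector points in the closest direction."""
    candidates = [w for w in EMBEDDINGS if w != word]
    return max(
        candidates,
        key=lambda w: cosine_similarity(EMBEDDINGS[word], EMBEDDINGS[w]),
    )


if __name__ == "__main__":
    for word in EMBEDDINGS:
        print(word, "is closest to", most_similar(word))
```

With these values the two cities pair up and the two animal terms pair up, which is the intuition behind semantic search: nearby vectors tend to mean related things.
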
RAG in Action

Data Ingestion Workflow:
Data source → Document chunks → Embeddings model → Embedding → Vector store

Text Generation Workflow:
User → User input → Embeddings model → Embedding → Retrieval (semantic search) against the vector store → Context → Prompt augmentation → Large Language Model → Response generation

Example embedding values shown on the slide: -0.02 0.89 -0.38 -0.53 0.95 0.17

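The two workflows on this slide can be sketched end to end in a few lines. This is a toy under stated assumptions, not a production design: `embed` is a bag-of-words stand-in for a real embeddings model, `VectorStore` is an in-memory list rather than a vector database, and the final LLM call is left out, so the sketch stops at the augmented prompt. All names and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embeddings model: word counts as a sparse vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def chunk(document, size=12):
    """Split a document into chunks of at most `size` words."""
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


class VectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (embedding, chunk text) pairs

    def add(self, text):
        self.items.append((embed(text), text))

    def search(self, query, k=2):
        """Semantic-search step: return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def ingest(store, documents):
    """Data ingestion workflow: chunk each document, embed it, store it."""
    for document in documents:
        for piece in chunk(document):
            store.add(piece)


def build_prompt(question, context_chunks):
    """Prompt augmentation: place retrieved context ahead of the user input."""
    context = "\n".join("- " + c for c in context_chunks)
    return "Use only this context:\n" + context + "\n\nQuestion: " + question


DOCUMENTS = [
    "Employees receive 25 vacation days per year. Unused vacation days expire in March.",
    "The office cafeteria serves lunch from noon until two in the afternoon.",
]

if __name__ == "__main__":
    store = VectorStore()
    ingest(store, DOCUMENTS)
    question = "How many vacation days do employees get?"
    print(build_prompt(question, store.search(question, k=1)))
```

The printed prompt is what would be sent to the large language model in the response generation step.
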
However, when it comes to implementing RAG, there are challenges…

• Creating vector embeddings for large volumes of data
• Orchestration
• Managing multiple data sources
• Scaling retrieval mechanism
• Coding effort
• Incremental updates to vector store

Knowledge Bases for Amazon Bedrock
Fully managed native support for retrieval augmented generation

• Fully managed support for end-to-end RAG workflow
• Securely connects FMs and agents to data sources
• Automatically converts text documents into embeddings
• Stores embeddings in your vector database
• Retrieves embeddings and augments prompts
• Provides source attribution

Knowledge Bases: RetrieveAndGenerate API

User → User query → RetrieveAndGenerate API (fully managed RAG) → Generated response → User

Steps handled by the RetrieveAndGenerate API:
• Generate query embedding
• Retrieve similar documents from knowledge base
• Augment query with retrieved documents
• Generate response from LLM

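A minimal sketch of calling this API from Python with boto3's `bedrock-agent-runtime` client. The knowledge base ID and model ARN in the usage section are placeholders, not real resources, and the helper names are hypothetical. boto3 is imported lazily so the request shape can be inspected without the AWS SDK or credentials.

```python
def build_retrieve_and_generate_request(query, knowledge_base_id, model_arn):
    """Build keyword arguments for the RetrieveAndGenerate call."""
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }


def ask_knowledge_base(query, knowledge_base_id, model_arn, client=None):
    """Send the query through the managed RAG flow and return the answer text."""
    if client is None:
        import boto3  # lazy import: only needed for a real AWS call

        client = boto3.client("bedrock-agent-runtime")
    request = build_retrieve_and_generate_request(query, knowledge_base_id, model_arn)
    response = client.retrieve_and_generate(**request)
    return response["output"]["text"]


if __name__ == "__main__":
    # Placeholder values: substitute your own knowledge base ID and model ARN.
    answer = ask_knowledge_base(
        "What is our vacation policy?",
        knowledge_base_id="YOUR_KB_ID",
        model_arn="arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2",
    )
    print(answer)
```

One call covers all four steps listed on the slide; the caller never handles embeddings or retrieved documents directly.
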
Customize RAG workflows using Retrieve API

User → User query → Retrieve API → Retrieved documents → Prompt augmentation (user input + context) → Large Language Model → Response → User

Steps handled by the Retrieve API:
• Generate query embedding
• Retrieve similar documents from knowledge base

Customized RAG workflow: prompt augmentation and response generation stay under your control.

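A minimal sketch of the customized flow: retrieval is delegated to the Retrieve API through boto3's `bedrock-agent-runtime` client, while prompt augmentation is plain application code. The knowledge base ID is a placeholder, the prompt template and helper names are hypothetical, and the final LLM call is left to whichever model you choose.

```python
def retrieve_chunks(query, knowledge_base_id, top_k=3, client=None):
    """Call the Retrieve API and return the text of the matching chunks."""
    if client is None:
        import boto3  # lazy import: only needed for a real AWS call

        client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return [result["content"]["text"] for result in response["retrievalResults"]]


def augment_prompt(query, chunks):
    """Custom prompt augmentation: number each chunk so answers can cite it."""
    context = "\n".join(
        "[" + str(i) + "] " + text for i, text in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the numbered context below.\n\n"
        + context
        + "\n\nQuestion: "
        + query
    )


if __name__ == "__main__":
    # Placeholder value: substitute your own knowledge base ID.
    question = "What is our vacation policy?"
    chunks = retrieve_chunks(question, knowledge_base_id="YOUR_KB_ID")
    print(augment_prompt(question, chunks))  # send this prompt to your chosen LLM
```

Splitting retrieval from generation like this is what lets you change the prompt template, filter or re-rank chunks, or swap the model without touching the knowledge base.
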
Generative AI course on AWS Skill Builder
Build along with the generative AI community!

Abhishek Gupta
@abhi_tweeter
linkedin/in/abhirockzz

All rights reserved. Amazon Web Services, AWS, the Powered by AWS logo, and all AWS service names used in this slide deck are trademarks of Amazon.com, Inc. or its affiliates.