Vector Database for generative AI

© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Vector Databases for generative AI
Abhishek Gupta, Principal Developer Advocate, Amazon Web Services
@abhi_tweeter | linkedin/in/abhirockzz


Agenda
• Intro: Large Language Models, Vector, Vector stores (demo)
• Vector Databases: Landscape and options
• Retrieval Augmented Generation (RAG) (more demos…)

Generative AI is powered by foundation models
• Pre-trained on vast amounts of unstructured data
• Contain a large number of parameters, which makes them capable of learning complex concepts
• Can be applied in a wide range of contexts
• Customize FMs using your data for domain-specific tasks

Broad choice of models
• Amazon Titan (Titan Text Embeddings, Titan Multimodal Embeddings, Titan Text Lite, Titan Text Express, Titan Image Generator): text summarization, generation, Q&A, search, image generation
• Claude (Claude Opus, Claude Haiku, Claude Sonnet, Claude Instant, Claude 2.x): summarization, complex reasoning, writing, coding
• Llama 2 (Llama 2 13B, Llama 2 70B, Llama 2 Chat 13B, Llama 2 Chat 70B): Q&A and reading comprehension
• Cohere (Command, Command Light, Embed English, Embed Multilingual): text generation, search, classification
• Stable Diffusion XL 1.0: high-quality images and art
• Jurassic-2 (Jurassic-2 Ultra, Jurassic-2 Mid): contextual answers, summarization, paraphrasing
• Mistral (Mistral Large, Mistral 7B, Mixtral 8x7B): text summarization, Q&A, text classification, text completion, code generation

LLM limitations: Knowledge cut-off and Hallucination

LLM limitations: (Lack of) Access to external or domain-specific data
• Generic generative AI
• Generative AI that knows your business and your customer

Common solutions
• Prompt Engineering
• Retrieval Augmented Generation (RAG)
• Fine Tuning (FT)
• Continued Pre-Training (CPT)
[Figure axis: Complexity, Quality, Cost, Time]

Vector Databases

What are Vectors (embeddings)?
• Numerical representation of text (vectors) that captures semantics and relationships between words.
• Embedding models capture features and nuances of the text.
[Figure: human text passes through an embedding model to produce vector embeddings]
New York → 0.027 -0.011 -0.023 …
Paris → 0.025 -0.009 -0.025 …
Animal → -0.011 0.021 0.013 …
Horse → -0.009 0.019 0.015 …
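One way to make "captures semantics and relationships" concrete is to compare embeddings with cosine similarity, a common (though not the only) closeness measure. The sketch below is a minimal illustration, not the deck's demo code: the three-dimensional vectors are hypothetical values chosen by hand, since the slide's real vectors are truncated and a real embedding model returns hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
embeddings = {
    "New York": [0.9, 0.1, 0.0],
    "Paris": [0.8, 0.2, 0.1],
    "Horse": [0.0, 0.1, 0.9],
}

city_pair = cosine_similarity(embeddings["New York"], embeddings["Paris"])
mixed_pair = cosine_similarity(embeddings["New York"], embeddings["Horse"])
print(f"New York vs Paris: {city_pair:.3f}")
print(f"New York vs Horse: {mixed_pair:.3f}")
```

With these toy values the two cities score much closer to each other than a city and an animal do, which is the property a vector store relies on for semantic search.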

Vector Databases - options
• Vector support in existing databases
• Specialized Vector stores
Services listed: Amazon Aurora (PostgreSQL), Amazon DocumentDB (with MongoDB compatibility), Amazon MemoryDB for Redis, Amazon OpenSearch

The big picture
[Figure: raw data (images, documents, audio) passes through a machine learning model (embedding) into a vector embedding space held in a vector database, used to build ML-powered search and analytics applications]
• Stages: Raw data → Vector embedding space → Dev-ready and operationalized → Consumable
• Dense vector encodings: retrieve content most similar to some content (question context, image, music clip, …)
• Sparse vector encodings (automatic metadata extraction): content classification, salient terms and topics, …; retrieve most relevant content by key terms (metadata), …
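The "retrieve content most similar to some content" step can be sketched as a brute-force nearest-neighbour search over dense vectors. This is only an illustration under stated assumptions: the item ids and vectors are hypothetical, and a production vector database would typically use an approximate index rather than scanning every entry.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, store, k=2):
    """Return the k (item_id, score) pairs most similar to the query vector."""
    scored = (
        (cosine_similarity(query_vector, vector), item_id)
        for item_id, vector in store.items()
    )
    return [(item_id, score) for score, item_id in heapq.nlargest(k, scored)]


# Hypothetical in-memory "vector database": item id -> dense embedding.
store = {
    "image-1": [0.9, 0.1, 0.0],
    "document-1": [0.1, 0.9, 0.1],
    "audio-1": [0.0, 0.2, 0.9],
}

results = top_k([0.2, 0.8, 0.0], store, k=2)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```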

Demo: Semantic Search

RAG – Retrieval Augmented Generation

RAG in Action
• Data Ingestion Workflow: Data source → Document chunks → Embeddings model → Vector store
• Text Generation Workflow: User input → Embeddings model → Embedding (-0.02 0.89 -0.38 -0.53 0.95 0.17) → Retrieval (semantic search) against the vector store → Context → Prompt augmentation → Large Language Model → Response generation → User
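The two workflows on this slide can be sketched end to end in a few lines. Everything here is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for a real embeddings model, the vector store is a plain list, the sample documents are invented, and the final LLM call is left out so the sketch stops at the augmented prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embeddings model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def chunk(document, max_words=12):
    """Split a document into chunks of at most max_words words."""
    words = document.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def ingest(documents):
    """Data ingestion workflow: chunk each document, embed each chunk, store both."""
    return [(piece, embed(piece)) for doc in documents for piece in chunk(doc)]


def retrieve(query, vector_store, k=1):
    """Retrieval (semantic search): return the k chunks closest to the query."""
    query_vector = embed(query)
    ranked = sorted(
        vector_store,
        key=lambda entry: cosine_similarity(query_vector, entry[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def augment_prompt(query, context_chunks):
    """Prompt augmentation: prepend the retrieved context to the user input."""
    context = "\n".join(f"- {piece}" for piece in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical sample documents for the ingestion step.
documents = [
    "Amazon Aurora is a relational database that supports PostgreSQL.",
    "Amazon OpenSearch Service supports vector search over embeddings.",
]
vector_store = ingest(documents)
question = "Which service supports vector search?"
prompt = augment_prompt(question, retrieve(question, vector_store))
print(prompt)  # this string is what would be sent to the large language model
```

Swapping `embed` for a real embeddings model and the list for a vector database leaves the shape of the workflow unchanged.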

Demo: RAG

However, when it comes to implementing RAG, there are challenges…
• Creating vector embeddings for large volumes of data
• Orchestration
• Managing multiple data sources
• Scaling retrieval mechanism
• Coding effort
• Incremental updates to vector store
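To give a sense of the coding effort behind just one of these items, here is a minimal sketch of planning an incremental update to a vector store. It assumes, purely for illustration, that the ingestion job keeps a content hash per document id; the function name and the sample data are hypothetical and not taken from the talk.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_incremental_update(indexed_hashes, current_documents):
    """Decide which documents need re-embedding and which entries are stale.

    indexed_hashes: doc_id -> hash recorded at the last ingestion run.
    current_documents: doc_id -> current text in the data source.
    Returns (to_upsert, to_delete) as sorted lists of document ids.
    """
    to_upsert = [
        doc_id
        for doc_id, text in current_documents.items()
        if indexed_hashes.get(doc_id) != content_hash(text)
    ]
    to_delete = [doc_id for doc_id in indexed_hashes if doc_id not in current_documents]
    return sorted(to_upsert), sorted(to_delete)


# Hypothetical state: what was indexed last time vs. what the source holds now.
indexed = {
    "a": content_hash("alpha"),
    "b": content_hash("beta"),
    "c": content_hash("gamma"),
}
current = {"a": "alpha", "b": "beta v2", "d": "delta"}

to_upsert, to_delete = plan_incremental_update(indexed, current)
print("re-embed:", to_upsert)
print("remove:", to_delete)
```

Only changed or new documents are re-embedded, and entries whose source document disappeared are flagged for removal.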

Fully-managed RAG experience

Knowledge Bases for Amazon Bedrock
Fully-managed native support for retrieval augmented generation
• Fully managed support for end-to-end RAG workflow
• Securely connect FMs and agents to data sources
• Automatically converts text documents into embeddings
• Stores embeddings in your vector database
• Retrieves embeddings and augments prompts
• Provides source attribution

Demo: Knowledge bases for Amazon Bedrock

Knowledge Bases: RetrieveAndGenerate API
[Figure: the user sends a user query to the RetrieveAndGenerate API (fully managed RAG) and receives the generated response]
• Generate query embedding
• Retrieve similar documents from knowledge base
• Augment query with retrieved documents
• Generate response from LLM
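A call to RetrieveAndGenerate through boto3 might look roughly like the sketch below. It uses the `retrieve_and_generate` operation of the `bedrock-agent-runtime` client; the helper names, the sample query, and the knowledge base id and model ARN placeholders are assumptions for illustration and must be replaced with real values. `main()` is deliberately not invoked here because it needs AWS credentials and an existing knowledge base.

```python
def build_retrieve_and_generate_request(query, knowledge_base_id, model_arn):
    """Build keyword arguments for the RetrieveAndGenerate operation."""
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }


def ask_knowledge_base(client, query, knowledge_base_id, model_arn):
    """Send the user query and return (generated text, citations)."""
    request = build_retrieve_and_generate_request(query, knowledge_base_id, model_arn)
    response = client.retrieve_and_generate(**request)
    return response["output"]["text"], response.get("citations", [])


def main():
    # Imported here so the helpers above can be exercised without boto3 installed.
    import boto3

    client = boto3.client("bedrock-agent-runtime")
    answer, citations = ask_knowledge_base(
        client,
        "What is a vector database?",
        "KNOWLEDGE_BASE_ID_PLACEHOLDER",  # hypothetical placeholder
        "MODEL_ARN_PLACEHOLDER",  # hypothetical placeholder
    )
    print(answer)
    print(f"{len(citations)} citation(s) returned")
```

The embedding, retrieval, prompt augmentation, and generation steps all happen behind this single call, which is what distinguishes it from the hand-built workflow.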

Customize RAG workflows using Retrieve API
[Figure: customized RAG workflow in which the user query goes to the Retrieve API and the retrieved documents come back as context]
• Retrieve API: generate query embedding, retrieve similar documents from knowledge base
• Your code: prompt augmentation with user input and context → Large Language Model → Response → User
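With the Retrieve API, only the retrieval half is managed and the prompt is assembled in your own code. The sketch below uses the `retrieve` operation of the boto3 `bedrock-agent-runtime` client; the helper names and the prompt template are assumptions for illustration, and the final model invocation is left to the caller. The client is passed in, so it would be created with `boto3.client("bedrock-agent-runtime")` in real use.

```python
def retrieve_chunks(client, query, knowledge_base_id, number_of_results=3):
    """Call the Retrieve API and return the text of each retrieved chunk."""
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": number_of_results}
        },
    )
    return [result["content"]["text"] for result in response["retrievalResults"]]


def build_augmented_prompt(query, chunks):
    """Custom prompt augmentation: the template here is purely illustrative."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


def customized_rag_prompt(client, query, knowledge_base_id):
    """Retrieve context, then build the prompt to send to an LLM of your choice."""
    chunks = retrieve_chunks(client, query, knowledge_base_id)
    return build_augmented_prompt(query, chunks)
```

Because the prompt is built locally, this is the place to add re-ranking, filtering, or a different model, which the single-call API does not expose in the same way.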

Thank you!
Abhishek Gupta
@abhi_tweeter | linkedin/in/abhirockzz
• Generative AI course on AWS Skill Builder
• Build along with the generative AI community!
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. Amazon Web Services, AWS, the Powered by AWS logo, and all AWS service names used in this slide deck are trademarks of Amazon.com, Inc. or its affiliates.
