Vector Database for generative AI

Abhishek Gupta
April 22, 2024

Transcript

  1. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.

    Abhishek Gupta, Principal Developer Advocate, Amazon Web Services. Vector Databases for generative AI. @abhi_tweeter, linkedin/in/abhirockzz
  2. Agenda

    • Intro: Large Language Models, Vectors, Vector stores (demo)
    • Vector Databases: Landscape and options
    • Retrieval Augmented Generation (RAG) (more demos…)
  3. Generative AI is powered by foundation models

    • Pre-trained on vast amounts of unstructured data
    • Contain a large number of parameters, which makes them capable of learning complex concepts
    • Can be applied in a wide range of contexts
    • Customize FMs using your data for domain-specific tasks
  4. Broad choice of models

    • Amazon Titan: Titan Text Embeddings, Titan Multimodal Embeddings, Titan Text Lite, Titan Text Express, Titan Image Generator
    • Claude: Claude Opus, Claude Haiku, Claude Sonnet, Claude Instant, Claude 2.x
    • Llama 2: Llama 2 13B, Llama 2 70B, Llama 2 Chat 13B, Llama 2 Chat 70B
    • Cohere: Command, Command Light, Embed English, Embed Multilingual
    • Stable Diffusion XL1.0
    • Jurassic-2: Jurassic-2 Ultra, Jurassic-2 Mid
    • Mistral: Mistral Large, Mistral 7B, Mixtral 8x7B (text summarization, Q&A, text classification, text completion, code generation)

    Use cases listed on the slide for the other model families: summarization, complex reasoning, writing, coding; contextual answers, summarization, paraphrasing; high-quality images and art; text generation, search, classification; Q&A and reading comprehension; text summarization, generation, Q&A, search, image generation.
  5. LLM limitations: Knowledge cut-off and Hallucination
  6. LLM limitations: (Lack of) Access to external or domain-specific data

    • Generic generative AI
    • Generative AI that knows your business and your customer
  7. Common solutions

    • Prompt Engineering
    • Retrieval Augmented Generation (RAG)
    • Fine Tuning (FT)
    • Continued Pre-Training (CPT)

    Axis label: Complexity, Quality, Cost, Time
  8. What are Vectors (embeddings)?

    • Numerical representation of text (vectors) that captures semantics and relationships between words.
    • Embedding models capture features and nuances of the text.

    Human text → embedding model → vector embeddings:

    • New York: 0.027 -0.011 -0.023 …
    • Paris: 0.025 -0.009 -0.025 …
    • Animal: -0.011 0.021 0.013 …
    • Horse: -0.009 0.019 0.015 …
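
One common way to compare embeddings like these is cosine similarity. The sketch below is only illustrative: it assumes the first three components shown on the slide stand in for the full vectors (real embeddings have far more dimensions), and `cosine_similarity` is a helper written for this example, not something from the talk.

```python
import math

# Assumption: only the first three components shown on the slide are used.
# The real vectors are much longer, so these scores are illustrative only.
EMBEDDINGS = {
    "New York": [0.027, -0.011, -0.023],
    "Paris": [0.025, -0.009, -0.025],
    "Animal": [-0.011, 0.021, 0.013],
    "Horse": [-0.009, 0.019, 0.015],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


city_pair = cosine_similarity(EMBEDDINGS["New York"], EMBEDDINGS["Paris"])
animal_pair = cosine_similarity(EMBEDDINGS["Animal"], EMBEDDINGS["Horse"])
cross_pair = cosine_similarity(EMBEDDINGS["New York"], EMBEDDINGS["Animal"])

print(f"New York vs Paris:  {city_pair:.3f}")
print(f"Animal vs Horse:    {animal_pair:.3f}")
print(f"New York vs Animal: {cross_pair:.3f}")
```

With these truncated values, the two related pairs score close to 1 while the unrelated pair scores below 0, which is the property semantic search relies on.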
  9. Vector Databases - options

    • Categories: Vector support in existing databases; Specialized Vector stores
    • Services shown: Amazon Aurora (PostgreSQL), Amazon DocumentDB (with MongoDB compatibility), Amazon MemoryDB for Redis, Amazon OpenSearch
  10. The big picture

    [Diagram: Raw data (image, documents, audio) → Machine learning model (embedding) → Vector embedding space → Vector Database (dev-ready and operationalized) → Consumable: build ML-powered search and analytics applications]

    • Dense vector encodings: retrieve content most similar to some content (question context, image, music clip, ...)
    • Sparse vector encodings (automatic metadata extraction): content classification; salient terms and topics; retrieve most relevant content by key terms (metadata)
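
The "retrieve content most similar to some content" step can be sketched as a brute-force nearest-neighbour search over stored vectors. Everything here is hypothetical: `InMemoryVectorStore`, the item ids, and the three-dimensional vectors are made up for illustration, and a real vector database would typically use an approximate index rather than scanning every item.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


@dataclass
class InMemoryVectorStore:
    """Hypothetical toy store: keeps (item_id, vector) pairs in a list."""

    items: list = field(default_factory=list)

    def add(self, item_id, vector):
        self.items.append((item_id, vector))

    def search(self, query_vector, k=2):
        """Score every stored vector against the query and return the top-k ids."""
        scored = [
            (cosine_similarity(query_vector, vector), item_id)
            for item_id, vector in self.items
        ]
        scored.sort(reverse=True)
        return [item_id for _, item_id in scored[:k]]


store = InMemoryVectorStore()
# Made-up dense vectors standing in for embedded documents, images, and audio.
store.add("doc-cities", [0.9, 0.1, 0.0])
store.add("img-horse", [0.1, 0.9, 0.1])
store.add("audio-jazz", [0.0, 0.1, 0.9])

results = store.search([0.8, 0.2, 0.0], k=2)
print(results)
```

The linear scan keeps the example short; it would not scale to large collections, which is one reason dedicated vector indexes exist.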
  11. Demo: Semantic Search
  12. RAG – Retrieval Augmented Generation
  13. RAG in Action

    • Data Ingestion Workflow: Data source → Document chunks → Embeddings model → Embedding (e.g. -0.02 0.89 -0.38 -0.53 0.95 0.17) → Vector store
    • Text Generation Workflow: User → User Input → Embeddings model → Embedding → Retrieval (Semantic search) against the Vector store → Context → Prompt Augmentation → Large Language Model → Response Generation
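
The two workflows on this slide can be sketched end to end in a few functions. This is a minimal sketch under loud assumptions: `embed` is a bag-of-words stand-in for a real embeddings model (so it matches words, not meaning), `generate` is a placeholder for the LLM call, and all function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embeddings model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def chunk(document, max_words=8):
    """Split a document into fixed-size word windows."""
    words = document.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def ingest(documents):
    """Data ingestion workflow: chunk, embed, and store each piece."""
    return [(embed(piece), piece) for doc in documents for piece in chunk(doc)]


def retrieve(store, question, k=2):
    """Retrieval: embed the question and return the k most similar chunks."""
    query = embed(question)
    ranked = sorted(store, key=lambda entry: cosine_similarity(query, entry[0]), reverse=True)
    return [text for _, text in ranked[:k]]


def augment(question, context_chunks):
    """Prompt augmentation: put the retrieved context in front of the question."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context.\n\nContext:\n{context}\n\nQuestion: {question}"


def generate(prompt):
    """Placeholder for the LLM call; a real system would send the prompt to a model."""
    return f"[model response to a {len(prompt)}-character prompt]"


documents = [
    "Knowledge bases convert text documents into embeddings and store them in a vector database.",
    "Horses are large animals that have been domesticated for thousands of years.",
]
store = ingest(documents)
question = "Which database stores the embeddings?"
context = retrieve(store, question, k=2)
prompt = augment(question, context)
answer = generate(prompt)
print(prompt)
```

Swapping `embed` for a real embeddings model and `generate` for a real LLM call would leave the overall shape of the pipeline unchanged.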
  14. Demo: RAG
  15. However, when it comes to implementing RAG, there are challenges…

    • Creating vector embeddings for large volumes of data
    • Orchestration
    • Managing multiple data sources
    • Scaling retrieval mechanism
    • Coding effort
    • Incremental updates to vector store
  16. Fully-managed RAG experience
  17. Knowledge Bases for Amazon Bedrock: fully managed native support for retrieval augmented generation

    • Fully managed support for end-to-end RAG workflow
    • Securely connect FMs and agents to data sources
    • Automatically converts text documents into embeddings
    • Stores embeddings in your vector database
    • Retrieves embeddings and augments prompts
    • Provides source attribution
  18. Demo: Knowledge bases for Amazon Bedrock
  19. Knowledge Bases: RetrieveAndGenerate API

    User → User query → RetrieveAndGenerate API (fully managed RAG) → Generated response → User

    Steps handled by the RetrieveAndGenerate API:

    • Generate query embedding
    • Retrieve similar documents from knowledge base
    • Augment query with retrieved documents
    • Generate response from LLM
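
A call to this API from Python might look like the sketch below. It assumes the boto3 `bedrock-agent-runtime` client and its `retrieve_and_generate` operation; the request and response field names reflect my understanding of that SDK and should be checked against the current documentation. `ask_knowledge_base`, `StubClient`, and the placeholder ids are hypothetical, and the stub exists only so the example runs without AWS credentials.

```python
def ask_knowledge_base(client, knowledge_base_id, model_arn, question):
    """Send one question through RetrieveAndGenerate and return the answer text."""
    response = client.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    )
    return response["output"]["text"]


class StubClient:
    """Offline stand-in for the real client; records the request it receives."""

    def retrieve_and_generate(self, **kwargs):
        self.last_request = kwargs
        return {"output": {"text": "stubbed answer"}, "citations": []}


stub = StubClient()
answer = ask_knowledge_base(
    stub,
    knowledge_base_id="KB_ID_PLACEHOLDER",  # hypothetical placeholder
    model_arn="MODEL_ARN_PLACEHOLDER",      # hypothetical placeholder
    question="What is RAG?",
)
print(answer)
```

To call the real service, the stub would be replaced with `boto3.client("bedrock-agent-runtime")` and the placeholders with an actual knowledge base id and model ARN.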
  20. Customize RAG workflows using Retrieve API

    • Retrieve API: User query → Generate query embedding → Retrieve similar documents from knowledge base → Retrieved documents
    • Customized RAG workflow: User → User Input + Context → Prompt augmentation → Large Language Model → Response
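
A customized workflow built on the Retrieve API could be sketched as follows: fetch the matching chunks, then assemble the prompt yourself before calling a model of your choice. It assumes the boto3 `bedrock-agent-runtime` client's `retrieve` operation, with field names as I understand that SDK. `retrieve_chunks`, `build_prompt`, `StubClient`, and the placeholder id are hypothetical, and the stub returns canned data so the example runs offline.

```python
def retrieve_chunks(client, knowledge_base_id, question, top_k=3):
    """Ask the knowledge base for the chunks most similar to the question."""
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": question},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return [result["content"]["text"] for result in response["retrievalResults"]]


def build_prompt(question, chunks):
    """Custom prompt augmentation: the part of the workflow you now control."""
    context = "\n".join(f"- {text}" for text in chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer concisely."


class StubClient:
    """Offline stand-in for the real client, returning one canned result."""

    def retrieve(self, **kwargs):
        self.last_request = kwargs
        return {
            "retrievalResults": [
                {"content": {"text": "Embeddings are stored in your vector database."}, "score": 0.9}
            ]
        }


stub = StubClient()
question = "Where are embeddings stored?"
chunks = retrieve_chunks(stub, "KB_ID_PLACEHOLDER", question, top_k=1)
prompt = build_prompt(question, chunks)
print(prompt)
```

The resulting prompt can then be sent to any LLM, which is the flexibility this slide contrasts with the fully managed RetrieveAndGenerate path.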
  21. Thank you!

    Abhishek Gupta, @abhi_tweeter, linkedin/in/abhirockzz. Generative AI course on AWS Skill Builder. Build along with the generative AI community!

    © 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. Amazon Web Services, AWS, the Powered by AWS logo, and all AWS service names used in this slide deck are trademarks of Amazon.com, Inc. or its affiliates.