
Supercharge Conversational AI with RAG, Agents, and Memory

Bilge Yücel
November 02, 2023

The slide deck of a webinar hosted by Pinecone: Supercharge Conversational AI with RAG, Agents, and Memory

Recording: https://www.youtube.com/watch?v=TYuoQjWur8w&t=1s

In this workshop, we will delve into RAG, agents, and memory, and how to integrate them all using Pinecone and Haystack to create high-performance, up-to-date conversational AI.

One of the greatest challenges of developing with LLMs is their lack of up-to-date knowledge of the world or niche that they operate in. Retrieval Augmented Generation (RAG) allows us to tackle this issue by giving LLMs direct access to external information.

By pairing this with AI Agents, we can build systems with incredible accuracy and flexibility. The superpower of LLMs is their ability to converse in natural language, so to make the most of RAG and Agents, we incorporate short-term memory, a fundamental component of conversational systems such as chatbots and personal assistants.

Transcript

  1. Webinar: Supercharge Conversational AI with RAG, Agents, and Memory (November 2023)
  2. Not only a problem with up-to-date info:
     🔒 Internal data
     Niche domain
     🥚 Freshness
     🔮 Hallucinations
  3. The Solutions: 🏋 Fine-tuning, 📚 Retrieval Augmented Generation (RAG)
  4. Fine-tuning:
     ✅ Chatbot behavior
     ✅ Roughly accurate knowledge of data that doesn't update often and is not too sensitive to errors
     ❌ Trust in results (as we cannot provide citations)
     ❌ Data that updates quickly (like a company's stock catalogue)
  5. RAG:
     ✅ Fast-changing knowledge that must be managed
     ✅ Adding citations to improve user trust
     ✅ Learning and storing facts about users
     ❌ Chatbot behavior
  6. How Do We Feed External Knowledge to the LLM?
  7. Could we stuff everything into the context window? (No)
     Arxiv paper: N. Liu et al., "Lost in the Middle: How Language Models Use Long Contexts" (2023), Stanford University
     Our article: A. Catav, "Less is More: Why Use Retrieval Instead of Larger Context Windows" (2023), Pinecone
  8. (Same references as slide 7.)
  9. Is There a Better Way?
     We need to selectively feed highly relevant info into our context.
     LLMs work in natural language, so ideally our search should too.
     Using a vector DB, we can retrieve relevant docs with natural language: we can do a semantic search.
  10. Semantic Meaning:
      🏦 Where is the Bank of England?
      🌱 Where is the grassy bank?
      🛩 How does a plane bank?
      🐝 "the bees decided to have a mutiny against their queen"
      🐝 "flying stinging insects rebelled in opposition to the matriarch"
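The bee examples above share almost no keywords yet mean the same thing, which is exactly what embedding similarity captures. Here is a minimal sketch using cosine similarity; the tiny vectors are hand-made stand-ins for real embeddings (in practice an embedding model produces them), chosen so the two bee sentences land close together.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 = same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hand-crafted toy "embeddings" (a real model outputs hundreds of dims).
embeddings = {
    "the bees decided to have a mutiny against their queen": [0.9, 0.8, 0.1],
    "flying stinging insects rebelled in opposition to the matriarch": [0.85, 0.9, 0.15],
    "Where is the Bank of England?": [0.1, 0.2, 0.95],
}

bee1, bee2, bank = embeddings.values()
print(cosine_similarity(bee1, bee2))  # high: same meaning, different words
print(cosine_similarity(bee1, bank))  # low: unrelated meaning
```

Keyword search would score the two bee sentences as unrelated; in embedding space they are near neighbors, which is what makes semantic search work.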
  11. Retrieval Augmented Generation (RAG):
      ✅ Retrieval of highly relevant docs using natural language search
      ✅ Scalable to billions of records
      ✅ Data management like a traditional DB
      🚀 No context stuffing required!
  12. Vector DB: given a 'query' vector, Pinecone returns the most similar 'context' vectors
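Conceptually, the query on the slide above looks like the brute-force sketch below: score every stored 'context' vector against the query vector and return the top-k ids. This is an illustration, not Pinecone's actual implementation; at billions of records a vector DB uses approximate nearest-neighbor indexes instead of a linear scan.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def query(index, query_vector, top_k=2):
    # Score every stored vector, then keep the ids of the top_k best.
    scored = [(cosine(query_vector, vec), doc_id) for doc_id, vec in index.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True)[:top_k]]

# Toy index: id -> embedding (hypothetical values for illustration).
index = {
    "doc-1": [0.9, 0.1, 0.0],
    "doc-2": [0.8, 0.2, 0.1],
    "doc-3": [0.0, 0.1, 0.9],
}
print(query(index, [1.0, 0.0, 0.0]))  # -> ['doc-1', 'doc-2']
```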
  13. Pinecone:
      • Purpose-built, cloud-native vector database for SotA vector search
      • Scales to 10B+ records
      • Integrations with ML frameworks and LLMs, like Haystack!
      • Provides completely managed infrastructure with automatic scaling, load balancing, and fault tolerance
  14. What is Haystack?
      • Fully open-source framework built in Python for custom LLM applications
      • Provides the tools developers need to build state-of-the-art NLP systems
      • Building blocks: Pipelines & Components
  15. RAG Pipeline:
      • Connection to LLMs
      • Receives input
      • Sends a prompt
      • Returns a response
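The steps on the slide above can be sketched end to end in plain Python. Both `retrieve` and `call_llm` are stand-ins: in a real Haystack pipeline a retriever queries a document store (e.g. Pinecone) with embeddings, and a prompt node calls an actual LLM.

```python
import re

def tokens(text):
    # Lowercase word set, ignoring punctuation.
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, documents, top_k=2):
    # Toy keyword-overlap retriever; a real one would use vector search.
    q = tokens(question)
    return sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)[:top_k]

def call_llm(prompt):
    # Stub standing in for a real LLM API call.
    return f"(answer generated from a {len(prompt)}-char prompt)"

def rag_answer(question, documents):
    # Receive input -> retrieve context -> send a prompt -> return a response.
    context = "\n".join(retrieve(question, documents))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)

docs = [
    "Pinecone is a managed vector database.",
    "Haystack is an open-source LLM framework in Python.",
    "Bananas are yellow.",
]
print(rag_answer("What is Pinecone?", docs))
```

The key point is the shape of the flow: the retrieved context is pasted into the prompt, so the LLM answers from fresh external knowledge rather than from its training data alone.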
  16. Haystack Agent:
      • Uses LLMs
      • Understands and processes the information retrieved from external sources
  18. Features of an Agent:
      • An Agent is, in essence, a PromptNode that connects to an LLM and has a very clever initial prompt
      • An Agent has access to a variety of Tools
      • Tools may be: Haystack Pipelines, data connectors, the web…
      • Each Tool may be useful for one specific task, or it may simply provide the Agent access to a certain knowledge base
      • An Agent can create an action plan to resolve complex queries, invoking and orchestrating the Tools it has at its disposal
      • Memory: the whole conversation or a summary
  19. Q&A. Follow 👇
      • Haystack on Twitter: Haystack_AI
      • What's coming in Haystack 2.0
      • Haystack Community on Discord
      • Pinecone Community
      • Pinecone Twitter