
Supercharge Conversational AI with RAG, Agents, and Memory

Bilge Yücel
November 02, 2023

The slide deck of a webinar hosted by Pinecone: Supercharge Conversational AI with RAG, Agents, and Memory

Recording: https://www.youtube.com/watch?v=TYuoQjWur8w&t=1s

In this workshop, we will delve into RAG, agents, and memory, and how to integrate them all using Pinecone and Haystack to create high-performance, up-to-date conversational AI.

One of the greatest challenges of developing with LLMs is their lack of up-to-date knowledge of the world or niche that they operate in. Retrieval Augmented Generation (RAG) allows us to tackle this issue by giving LLMs direct access to external information.

By pairing this with AI Agents, we can build systems with incredible accuracy and flexibility. The superpower of LLMs is their ability to converse in natural language, so to make the most of RAG and Agents, we incorporate short-term memory, a fundamental component of conversational systems such as chatbots and personal assistants.

Transcript

  1. Webinar: Supercharge Conversational AI with RAG, Agents, and Memory (November 2023)
  2. Not only a problem with up-to-date info:
     🔒 Internal data
     Niche domain
     🥚 Freshness
     🔮 Hallucinations
  3. The Solutions: 🏋 Fine-tuning, 📚 Retrieval Augmented Generation (RAG)
  4. Fine-tuning:
     ✅ Chatbot behavior
     ✅ Roughly accurate knowledge of data that doesn't update often and is not too sensitive to errors
     ❌ Trust in results (as we cannot provide citations)
     ❌ Data that updates quickly (like a company's stock catalogue)
  5. RAG:
     ✅ Fast-changing knowledge that must be managed
     ✅ Adding citations to improve user trust
     ✅ Learning and storing facts about users
     ❌ Chatbot behavior
  6. How Do We Feed External Knowledge to the LLM?
  7. Could we stuff everything into the context window? (No)
     Arxiv paper: N. Liu et al., "Lost in the Middle: How Language Models Use Long Contexts" (2023), Stanford University
     Our article: A. Catav, "Less is More: Why Use Retrieval Instead of Larger Context Windows" (2023), Pinecone
  8. (Same references as slide 7.)
  9. Is There a Better Way?
     We need to selectively feed highly relevant info into our context.
     LLMs work in natural language, so ideally our search should too.
     Using a vector DB, we can retrieve relevant docs with natural language: we can do a semantic search.
  10. Semantic Meaning:
      🏦 Where is the Bank of England?
      🌱 Where is the grassy bank?
      🛩 How does a plane bank?
      🐝 "the bees decided to have a mutiny against their queen"
      🐝 "flying stinging insects rebelled in opposition to the matriarch"
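The bee examples above share almost no keywords yet mean the same thing, which is exactly what embedding similarity captures. Here is a minimal sketch using cosine similarity; the tiny vectors are hand-made stand-ins for real embeddings (in practice an embedding model produces them), chosen so the two bee sentences land close together.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 = same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hand-crafted toy "embeddings" (a real model outputs hundreds of dims).
embeddings = {
    "the bees decided to have a mutiny against their queen": [0.9, 0.8, 0.1],
    "flying stinging insects rebelled in opposition to the matriarch": [0.85, 0.9, 0.15],
    "Where is the Bank of England?": [0.1, 0.2, 0.95],
}

bee1, bee2, bank = embeddings.values()
print(cosine_similarity(bee1, bee2))  # high: same meaning, different words
print(cosine_similarity(bee1, bank))  # low: unrelated meaning
```

Keyword search would score the two bee sentences as unrelated; in embedding space they are near neighbors, which is what makes semantic search work.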
  11. Retrieval Augmented Generation (RAG):
      ✅ Retrieval of highly relevant docs using natural language search
      ✅ Scalable to billions of records
      ✅ Data management like a traditional DB
      🚀 No context stuffing required!
  12. Vector DB: given a 'query' vector, Pinecone returns the most similar 'context' vectors
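Conceptually, the query on the slide above looks like the brute-force sketch below: score every stored 'context' vector against the query vector and return the top-k ids. This is an illustration, not Pinecone's actual implementation; at billions of records a vector DB uses approximate nearest-neighbor indexes instead of a linear scan.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def query(index, query_vector, top_k=2):
    # Score every stored vector, then keep the ids of the top_k best.
    scored = [(cosine(query_vector, vec), doc_id) for doc_id, vec in index.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True)[:top_k]]

# Toy index: id -> embedding (hypothetical values for illustration).
index = {
    "doc-1": [0.9, 0.1, 0.0],
    "doc-2": [0.8, 0.2, 0.1],
    "doc-3": [0.0, 0.1, 0.9],
}
print(query(index, [1.0, 0.0, 0.0]))  # -> ['doc-1', 'doc-2']
```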
  13. Pinecone:
      • Purpose-built, cloud-native vector database for SotA vector search
      • Scales to 10B+ records
      • Integrations with ML frameworks and LLMs, like Haystack!
      • Provides completely managed infrastructure with automatic scaling, load balancing, and fault tolerance
  14. What is Haystack?
      • Fully open-source framework built in Python for custom LLM applications
      • Provides the tools developers need to build state-of-the-art NLP systems
      • Building blocks: Pipelines & Components
  15. RAG Pipeline:
      • Connection to LLMs
      • Receives input
      • Sends a prompt
      • Returns a response
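The steps on the slide above can be sketched end to end in plain Python. Both `retrieve` and `call_llm` are stand-ins: in a real Haystack pipeline a retriever queries a document store (e.g. Pinecone) with embeddings, and a prompt node calls an actual LLM.

```python
import re

def tokens(text):
    # Lowercase word set, ignoring punctuation.
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, documents, top_k=2):
    # Toy keyword-overlap retriever; a real one would use vector search.
    q = tokens(question)
    return sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)[:top_k]

def call_llm(prompt):
    # Stub standing in for a real LLM API call.
    return f"(answer generated from a {len(prompt)}-char prompt)"

def rag_answer(question, documents):
    # Receive input -> retrieve context -> send a prompt -> return a response.
    context = "\n".join(retrieve(question, documents))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)

docs = [
    "Pinecone is a managed vector database.",
    "Haystack is an open-source LLM framework in Python.",
    "Bananas are yellow.",
]
print(rag_answer("What is Pinecone?", docs))
```

The key point is the shape of the flow: the retrieved context is pasted into the prompt, so the LLM answers from fresh external knowledge rather than from its training data alone.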
  16. Haystack Agent:
      • Uses LLMs
      • Understands and processes the information retrieved from external sources
  18. Features of an Agent:
      • An Agent is, in essence, a PromptNode that connects to an LLM and has a very clever initial prompt
      • An Agent has access to a variety of Tools
      • Tools may be: Haystack Pipelines, data connectors, the web…
      • Each Tool may be useful for one specific task, or it may simply provide the Agent access to a certain knowledge base
      • An Agent can create an action plan to resolve complex queries, invoking and orchestrating the Tools it has at its disposal
      • Memory: the whole conversation or a summary
  19. Q&A. Follow 👇
      • Haystack on Twitter: Haystack_AI
      • What's coming in Haystack 2.0
      • Haystack Community on Discord
      • Pinecone Community
      • Pinecone Twitter