Slide 1

Webinar: Supercharge Conversational AI with RAG, Agents, and Memory (November 2023)

Slide 2

Bilge Yücel, Developer Advocate, Deepset
James Briggs, Staff Developer Advocate, Pinecone

Slide 3


Slide 4

Not only a problem with up-to-date info:
● 🔒 Internal data
● Niche domain
● 🥚 Freshness
● 🔮 Hallucinations

Slide 5

The Problem

Slide 6

The Solutions
● 🏋 Fine-tuning
● 📚 Retrieval Augmented Generation (RAG)

Slide 7

Fine-tuning
✅ Chatbot behavior
✅ Reasonably accurate knowledge of data that doesn’t update often and is not too sensitive to errors
❌ Trust in results (as we cannot provide citations)
❌ Data that updates quickly (like a company stock catalogue)

Slide 8

RAG
✅ Fast-changing knowledge that must be managed
✅ Adding citations to improve user trust
✅ Learning and storing facts about users
❌ Chatbot behavior

Slide 9

Need all of these things? Use both!

Slide 10

What is RAG?

Slide 11

RAG swaps this:

Slide 12

For this…

Slide 13

How do we Feed External Knowledge to the LLM?

Slide 14

Could we stuff everything into the context window? (No)
arXiv paper: N. Liu et al., Lost in the Middle: How Language Models Use Long Contexts (2023), Stanford University
Our article: A. Catav, Less is More: Why Use Retrieval Instead of Larger Context Windows (2023), Pinecone

Slide 15

arXiv paper: N. Liu et al., Lost in the Middle: How Language Models Use Long Contexts (2023), Stanford University
Our article: A. Catav, Less is More: Why Use Retrieval Instead of Larger Context Windows (2023), Pinecone

Slide 16

Is There a Better Way?
● We need to selectively feed highly relevant info into our context
● LLMs work in natural language — so ideally our search should too
● Using a vector DB, we can retrieve relevant docs with “natural language” — we can do a semantic search

Slide 17

Semantic Meaning
🏦 Where is the Bank of England?
🌱 Where is the grassy bank?
🛩 How does a plane bank?
🐝 “the bees decided to have a mutiny against their queen”
🐝 “flying stinging insects rebelled in opposition to the matriarch”
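To make the bee example concrete, here is a minimal sketch (not from the slides) that scores the two bee sentences with a sentence-transformers embedding model; the model name is an illustrative choice.

```python
# Minimal semantic-similarity sketch (not from the slides).
# Assumes the sentence-transformers package; the model name is illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "the bees decided to have a mutiny against their queen",
    "flying stinging insects rebelled in opposition to the matriarch",
]

# Encode both sentences into dense vectors and compare them.
embeddings = model.encode(sentences, convert_to_tensor=True)
score = util.cos_sim(embeddings[0], embeddings[1]).item()

# Despite sharing almost no words, the two sentences score as highly similar.
print(f"cosine similarity: {score:.3f}")
```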

Slide 18

Semantic Search

Slide 19

Retrieval Augmented Generation (RAG)
✅ Retrieval of highly relevant docs using natural language search
✅ Scalable to billions of records
✅ Data management like a traditional DB
🚀 No context stuffing required!

Slide 20

Retrieval Augmented Generation

Slide 21

Vector DB
Given a ‘query’ vector, Pinecone returns the most similar ‘context’ vectors

Slide 22

Pinecone
● Purpose-built, cloud-native vector database for SotA vector search
● Scales to 10B+ records
● Integrations with ML frameworks and LLMs like Haystack!
● Provides completely managed infrastructure with automatic scaling, load balancing, and fault tolerance
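As a rough illustration of the “query vector in, similar context vectors out” flow, here is a sketch using the Pinecone Python client that was current around the time of the webinar (newer client versions use a different entry point); the index name, dimension, environment, and vectors are placeholders.

```python
# Minimal Pinecone sketch (not from the slides), using the v2 Python client.
# Index name, dimension, environment, and vectors are illustrative.
import pinecone

pinecone.init(api_key="YOUR_PINECONE_API_KEY", environment="gcp-starter")

index_name = "webinar-demo"
if index_name not in pinecone.list_indexes():
    pinecone.create_index(index_name, dimension=384, metric="cosine")

index = pinecone.Index(index_name)

# Upsert a few (id, vector, metadata) records; the vectors would normally
# come from an embedding model like the one sketched above.
index.upsert(vectors=[
    ("doc-1", [0.1] * 384, {"text": "Bees rebelled against their queen."}),
    ("doc-2", [0.2] * 384, {"text": "Planes bank when they turn."}),
])

# Given a query vector, Pinecone returns the most similar context vectors.
results = index.query(vector=[0.1] * 384, top_k=2, include_metadata=True)
for match in results.matches:
    print(match.id, match.score, match.metadata["text"])
```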

Slide 23

What is Haystack?
● Fully open-source framework built in Python for custom LLM applications
● Provides tools that developers need to build state-of-the-art NLP systems
● Building blocks: Pipelines & Components

Slide 24

What is Haystack?
● Fully open-source framework built in Python for custom LLM applications

Slide 25

What is Haystack?
● Fully open-source framework built in Python for custom LLM applications

Slide 26

What is Haystack?
● Fully open-source framework built in Python for custom LLM applications

Slide 27

What is Haystack?
● Fully open-source framework built in Python for custom LLM applications

Slide 28

Indexing Pipeline

Slide 29

Indexing Pipeline

Slide 30

Indexing Pipeline
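The indexing pipeline appears on the slides as a diagram; below is a hedged Haystack 1.x sketch of one possible version, with the Pinecone credentials, file path, preprocessing settings, and embedding model chosen purely for illustration.

```python
# Haystack 1.x indexing pipeline sketch (not the exact code from the webinar).
from haystack import Pipeline
from haystack.document_stores import PineconeDocumentStore
from haystack.nodes import TextConverter, PreProcessor, EmbeddingRetriever

document_store = PineconeDocumentStore(
    api_key="YOUR_PINECONE_API_KEY",
    environment="gcp-starter",
    index="webinar-demo",
    similarity="cosine",
    embedding_dim=768,  # must match the embedding model chosen below
)

indexing_pipeline = Pipeline()
indexing_pipeline.add_node(component=TextConverter(), name="TextConverter", inputs=["File"])
indexing_pipeline.add_node(
    component=PreProcessor(split_by="word", split_length=200, split_overlap=20),
    name="PreProcessor",
    inputs=["TextConverter"],
)
indexing_pipeline.add_node(component=document_store, name="DocumentStore", inputs=["PreProcessor"])

# Convert, clean, split, and write the raw files into Pinecone ...
indexing_pipeline.run(file_paths=["data/my_doc.txt"])

# ... then compute and store embeddings so the documents can be searched semantically.
retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
)
document_store.update_embeddings(retriever)
```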

Slide 31

RAG Pipeline

Slide 32

RAG Pipeline
● Embeds query
● Retrieves relevant documents

Slide 33

RAG Pipeline
● Connection to LLMs
● Receives input
● Sends a prompt
● Returns a response (see the sketch below)
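These bullets describe the component Haystack 1.x calls a PromptNode; here is a minimal sketch of creating and calling one. The model name and API key are placeholders, not from the slides.

```python
# Minimal PromptNode sketch (Haystack 1.x); model name and API key are placeholders.
from haystack.nodes import PromptNode

prompt_node = PromptNode(
    model_name_or_path="gpt-3.5-turbo",
    api_key="YOUR_OPENAI_API_KEY",
    max_length=256,
)

# The PromptNode sends the prompt to the LLM and returns its response.
print(prompt_node("Explain Retrieval Augmented Generation in one sentence."))
```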

Slide 34

RAG Pipeline - Prompt

Slide 35

RAG Pipeline - Prompt
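The exact prompt shown on the slide is not reproduced in this export; the following is an illustrative Haystack 1.x PromptTemplate for grounded, RAG-style answering (the wording is an assumption).

```python
# Illustrative RAG prompt template (Haystack 1.x); the prompt wording is a sketch.
from haystack.nodes import PromptTemplate, AnswerParser

rag_prompt = PromptTemplate(
    prompt="""Answer the question truthfully based solely on the given documents.
If the documents do not contain the answer, say that answering is not possible.
Documents: {join(documents)}
Question: {query}
Answer:""",
    output_parser=AnswerParser(),  # turn the raw LLM text into Haystack Answer objects
)
```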

Slide 36

RAG Pipeline
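Putting the pieces together: a sketch of the query-time RAG pipeline, reusing the retriever, the rag_prompt template, and the placeholder API key from the previous sketches.

```python
# Query-time RAG pipeline sketch (Haystack 1.x), reusing objects defined above.
from haystack import Pipeline
from haystack.nodes import PromptNode

prompt_node = PromptNode(
    model_name_or_path="gpt-3.5-turbo",
    api_key="YOUR_OPENAI_API_KEY",
    default_prompt_template=rag_prompt,
)

rag_pipeline = Pipeline()
rag_pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
rag_pipeline.add_node(component=prompt_node, name="PromptNode", inputs=["Retriever"])

# Embed the query, retrieve relevant documents, and generate a grounded answer.
result = rag_pipeline.run(query="What does the knowledge base say about fine-tuning?")
print(result["answers"][0].answer)
```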

Slide 37

Haystack Agent
● Uses LLMs
● Understands and processes the information retrieved from external sources

Slide 38

Haystack Agent
● Uses LLMs
● Understands and processes the information retrieved from external sources

Slide 39

Features of an Agent
● An Agent is, in essence, a PromptNode that connects to an LLM and has a very clever initial prompt
● An Agent has access to a variety of Tools
● Tools may be: Haystack Pipelines, data connectors, the web…
● Each Tool may be useful for achieving one specific task, or it may simply give the Agent access to a certain knowledge base
● An Agent can create an action plan to resolve complex queries, invoking and orchestrating the Tools it has at its disposal (see the sketch below)
● Memory: whole conversation or summary
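A minimal sketch of the Agent-plus-Tools setup described in this list, using the Haystack 1.x agents API; the tool name, description, and model are illustrative, and the RAG pipeline is the one from the earlier sketch.

```python
# Haystack 1.x Agent sketch, wrapping the RAG pipeline from above as a Tool.
from haystack.agents import Agent, Tool
from haystack.nodes import PromptNode

agent_prompt_node = PromptNode(
    model_name_or_path="gpt-3.5-turbo",
    api_key="YOUR_OPENAI_API_KEY",
    stop_words=["Observation:"],  # the ReAct-style agent prompt uses this marker
)

agent = Agent(prompt_node=agent_prompt_node)

# Each Tool gives the Agent one capability or one knowledge base to draw on.
agent.add_tool(
    Tool(
        name="webinar_docs_qa",
        pipeline_or_node=rag_pipeline,
        description="Useful for answering questions about the indexed documents",
        output_variable="answers",
    )
)

# The Agent plans which Tools to invoke, step by step, to resolve the query.
result = agent.run("What does the knowledge base say about RAG vs fine-tuning?")
```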

Slide 40

Agent

Slide 41

Agent Prompt

Slide 42

Agent Prompt

Slide 43

Agent Prompt

Slide 44

Agent Prompt

Slide 45

Agent Tool - RAG Pipeline

Slide 46

Agent Prompt Node

Slide 47

Agent Prompt Node

Slide 48

Agent Reasoning

Slide 49

Agent -> Conversational AI Memory

Slide 50

Conversational Agent Prompt

Slide 51

Memory

Slide 52

Conversational Agent

Slide 53

Conversational Agent
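A sketch of a conversational agent that combines the earlier RAG Tool with summary memory, following the Haystack 1.x conversational-agent API as of late 2023; the parameter choices are assumptions rather than the webinar's exact code.

```python
# Conversational Agent with memory (Haystack 1.x sketch), reusing the
# agent_prompt_node and rag_pipeline from the earlier sketches.
from haystack.agents import Tool
from haystack.agents.conversational import ConversationalAgent
from haystack.agents.memory import ConversationSummaryMemory

rag_tool = Tool(
    name="webinar_docs_qa",
    pipeline_or_node=rag_pipeline,
    description="Useful for answering questions about the indexed documents",
    output_variable="answers",
)

conversational_agent = ConversationalAgent(
    prompt_node=agent_prompt_node,
    # Keep a running LLM-generated summary of the conversation rather than
    # storing the whole transcript.
    memory=ConversationSummaryMemory(agent_prompt_node),
    tools=[rag_tool],
)

# Follow-up questions can now draw on both the Tools and the memory.
print(conversational_agent.run("Who presented the webinar?"))
print(conversational_agent.run("Which companies do they work for?"))
```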

Slide 54

DEMO

Slide 55

Q&A
Follow 👇
● Haystack on Twitter: Haystack_AI
● What’s coming in Haystack 2.0
● Haystack Community on Discord
● Pinecone Community
● Pinecone Twitter