Build Your First LLM-based Application with Haystack

Slide 1

Slide 1 text

Bilge Yücel PyLadiesCon Build Your First LLM-based Application with Haystack

Slide 2

Slide 2 text

Bilge Yücel
● 🥑 Developer Advocate at deepset
● 🏗 Open source LLM Framework: Haystack
● 📍 Istanbul, Turkey
Twitter: @bilgeycl
LinkedIn: Bilge Yucel
GitHub: @bilgeyucel

Slide 3

Slide 3 text

Agenda
01 - Text Embeddings
02 - Vector Databases
03 - Retrieval
04 - LLMs
05 - Build a Generative QA App

Slide 4

Slide 4 text

01 Text Embeddings

Slide 5

Slide 5 text

Text Embeddings/Text Vectors
Example text: "to be or not to be"
● Numeric representations of text that computers can work with
● Different techniques:
○ Sparse: TF-IDF, BM25...
○ Dense: Trained models (Sentence Transformers, Cohere, OpenAI...)
● Dense embeddings often have hundreds of dimensions (768 is a common size)
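To make the sparse technique concrete, here is a minimal TF-IDF sketch in plain Python. It is an illustration only, not the implementation used in the talk: the helper name `tfidf_vectors`, the whitespace tokenizer, and the smoothed IDF formula are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn each document into a sparse-style TF-IDF vector (hypothetical helper)."""
    tokenized = [doc.lower().split() for doc in docs]
    # One vector dimension per distinct word across all documents.
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (an assumed, commonly used variant).
    idf = {term: math.log((1 + n_docs) / (1 + df[term])) + 1.0 for term in vocab}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        length = len(tokens) or 1  # avoid dividing by zero for empty documents
        vectors.append([counts[term] / length * idf[term] for term in vocab])
    return vocab, vectors


docs = ["to be or not to be", "to build or not to build"]
vocab, vectors = tfidf_vectors(docs)
for doc, vector in zip(docs, vectors):
    print(doc, [round(value, 3) for value in vector])
```

Most entries are zero for words a document does not contain, which is what makes such vectors "sparse". Dense embeddings from trained models instead fill every dimension with a learned value.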

Slide 6

Slide 6 text

02 Vector Databases

Slide 7

Slide 7 text

Vector Databases
● Databases that store high-dimensional vectors
● Optimized for vectors:
○ Vector search
○ CRUD operations
○ Metadata filtering
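The three capabilities listed on this slide can be sketched with a toy in-memory store. This is a teaching sketch under stated assumptions, not how a real vector database is built: the class `TinyVectorStore` and its method names are hypothetical, and the search is a brute-force cosine scan rather than an approximate nearest-neighbor index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical toy store: CRUD, metadata filtering, brute-force vector search."""

    def __init__(self):
        self._rows = {}  # doc_id -> (vector, metadata)

    def upsert(self, doc_id, vector, metadata=None):
        # Create or update a record.
        self._rows[doc_id] = (list(vector), dict(metadata or {}))

    def get(self, doc_id):
        # Read a record; returns None if the id is unknown.
        return self._rows.get(doc_id)

    def delete(self, doc_id):
        self._rows.pop(doc_id, None)

    def search(self, query, top_k=3, where=None):
        # Keep only rows whose metadata matches every key in `where`,
        # then rank the survivors by cosine similarity to the query.
        where = where or {}
        scored = []
        for doc_id, (vector, metadata) in self._rows.items():
            if any(metadata.get(key) != value for key, value in where.items()):
                continue
            scored.append((cosine(query, vector), doc_id))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(doc_id, score) for score, doc_id in scored[:top_k]]


store = TinyVectorStore()
store.upsert("a", [1.0, 0.0], {"lang": "en"})
store.upsert("b", [0.0, 1.0], {"lang": "tr"})
store.upsert("c", [0.9, 0.1], {"lang": "tr"})

hits = store.search([1.0, 0.0], top_k=2)
filtered = store.search([1.0, 0.0], where={"lang": "tr"})
print(hits)
print(filtered)
```

Real vector databases replace the linear scan with specialized indexes so that search stays fast over millions of vectors.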

Slide 8

Slide 8 text

03 Retrieval

Slide 9

Slide 9 text

Retrieval
● Getting the most relevant information for a query
● Used for semantic search, question answering and more
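As a rough illustration of retrieval, the sketch below ranks documents against a query with BM25, one of the sparse techniques named earlier in the deck. The function name `bm25_rank`, the whitespace tokenizer, the parameter defaults `k1=1.5` and `b=0.75`, and the sample documents are assumptions for this example.

```python
import math
from collections import Counter


def bm25_rank(query, docs, top_k=2, k1=1.5, b=0.75):
    """Return the top_k (document, score) pairs for the query (hypothetical helper)."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = (sum(len(tokens) for tokens in tokenized) / n_docs) or 1.0
    # Document frequency of each term across the collection.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scored = []
    for index, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scored.append((score, index))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(docs[index], score) for score, index in scored[:top_k]]


docs = [
    "Haystack is an open source LLM framework",
    "Istanbul is a city in Turkey",
    "Vector databases store high-dimensional vectors",
]
results = bm25_rank("open source framework", docs)
for doc, score in results:
    print(round(score, 3), doc)
```

Semantic search follows the same shape but scores the query against dense embeddings instead of counting shared words.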

Slide 10

Slide 10 text

04 LLMs

Slide 11

Slide 11 text

Large Language Models (LLMs)
● Big language models
● Prompt → Human-like output
● Text generation: summarization, generative QA, writing code, chat…

Slide 12

Slide 12 text

05 Build a Generative QA Application

Slide 13

Slide 13 text

Prompting

Slide 14

Slide 14 text

LLM: Limitations
● LLMs do not know the answer to everything
● But they are good at following instructions
● We can help them with their task by giving them the relevant context + instruction

Slide 15

Slide 15 text

Prompting

Slide 16

Slide 16 text

Prompting

Slide 17

Slide 17 text

Retrieval Augmented Generation (RAG)

Slide 18

Slide 18 text

Retrieval Augmentation Use Cases

Question Answering
Prompt:
Given the following context, answer the question. If the answer is not contained within the context, say ‘I don’t know’.
Context: {{context}}
Question: {{question}}
Answer:

Summarization
Prompt:
Summarize the following text.
Text: {{text}}
Summary:

Question Generation
Prompt:
Given the following document, generate some questions.
Document: {{document}}
Questions:
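The `{{context}}` and `{{question}}` placeholders in the question answering prompt are filled in at query time. A minimal sketch of that step follows, using a small regular-expression substitution rather than a full template engine; the helper names `render` and `build_qa_prompt` and the sample inputs are hypothetical.

```python
import re

QA_TEMPLATE = (
    "Given the following context, answer the question. "
    "If the answer is not contained within the context, say 'I don't know'.\n"
    "Context: {{context}}\n"
    "Question: {{question}}\n"
    "Answer:"
)


def render(template, **values):
    """Replace every {{name}} placeholder with the matching keyword argument."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing template variable: {name}")
        return str(values[name])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


def build_qa_prompt(question, documents):
    # Retrieved documents are joined into a single context block.
    context = "\n".join(documents)
    return render(QA_TEMPLATE, context=context, question=question)


prompt = build_qa_prompt(
    "Where is the speaker based?",
    ["The speaker is based in Istanbul, Turkey."],
)
print(prompt)
```

The summarization and question generation prompts work the same way with `{{text}}` and `{{document}}` as their single variable.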

Slide 19

Slide 19 text

Haystack
● Fully open-source framework built in Python for custom LLM applications
● Provides tools that developers need to build state-of-the-art NLP systems
● Building blocks: Pipelines & Components
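The "Pipelines & Components" idea can be pictured as small units that each do one job, chained so that the output of one feeds the next. The sketch below is a conceptual toy in plain Python and is not Haystack's API: `ToyPipeline`, `Lowercase`, and `SplitWords` are hypothetical names invented for this illustration.

```python
class Lowercase:
    """Toy component: normalizes the text field."""

    def run(self, data):
        return {**data, "text": data["text"].lower()}


class SplitWords:
    """Toy component: adds a list of words derived from the text field."""

    def run(self, data):
        return {**data, "words": data["text"].split()}


class ToyPipeline:
    """Runs components in the order they were added (linear pipelines only)."""

    def __init__(self):
        self._steps = []

    def add(self, name, component):
        self._steps.append((name, component))
        return self  # allow chaining calls

    def run(self, data):
        for _name, component in self._steps:
            data = component.run(data)
        return data


pipeline = ToyPipeline().add("lowercase", Lowercase()).add("split", SplitWords())
result = pipeline.run({"text": "Hello Haystack"})
print(result)
```

The appeal of this design is that components can be swapped independently, for example replacing one retriever or model with another without touching the rest of the pipeline.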

Slide 20

Slide 20 text

Indexing Pipeline (input: urls)

Slide 21

Slide 21 text

Indexing Pipeline Notebook

Slide 22

Slide 22 text

Indexing Pipeline Notebook

Slide 23

Slide 23 text

Generative QA Pipeline (RAG)
Query: What is happening at OpenAI?
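The overall flow of a generative QA (RAG) pipeline is: retrieve relevant documents, put them into a prompt with the question, and pass that prompt to an LLM. The sketch below walks through those three steps in plain Python under loud assumptions: the word-overlap retriever, the function names, the placeholder documents, and the `stub_llm` stand-in are all hypothetical, and no real model is called.

```python
import re


def words(text):
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query, documents, top_k=2):
    # Rank documents by how many query words they share (a stand-in retriever).
    query_terms = words(query)
    scored = [(len(query_terms & words(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, context_docs):
    context = "\n".join(context_docs)
    return (
        "Given the following context, answer the question. "
        "If the answer is not contained within the context, say 'I don't know'.\n"
        f"Context: {context}\n"
        f"Question: {query}\n"
        "Answer:"
    )


def answer(query, documents, generate):
    # `generate` is any callable that maps a prompt string to a reply string.
    context_docs = retrieve(query, documents)
    prompt = build_prompt(query, context_docs)
    return generate(prompt)


def stub_llm(prompt):
    # Placeholder for a real LLM call; it only reports the prompt size.
    return f"[stub LLM received a prompt of {len(prompt)} characters]"


documents = [
    "Placeholder news snippet about OpenAI.",
    "Placeholder note about vector databases.",
]
print(answer("What is happening at OpenAI?", documents, stub_llm))
```

Swapping `stub_llm` for a function that calls an actual model turns this skeleton into a working question answering loop, since the retrieval and prompting steps stay the same.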

Slide 24

Slide 24 text

Generative QA Pipeline (RAG) Notebook

Slide 25

Slide 25 text

Generative QA Pipeline (RAG) Notebook

Slide 26

Slide 26 text

https://haystack.deepset.ai/advent-of-haystack

Slide 27

Slide 27 text

Resources
● Join 👇 Advent of Haystack
● Check out 👇 Haystack
● Find 👇 Presentation
Bilge Yücel, @bilgeycl

Slide 28

Slide 28 text

Thank you!