Build NLP Apps with Python
Even If You're a Total Newbie!
Slide 2
• 🥑 Developer Advocate at deepset
• 🏗 Open source LLM Framework: Haystack
• 🍕 First time at Hamburg Python Pizza
• 📍 Istanbul, Turkey
• 💃 Latin music
Twitter: @bilgeycl
LinkedIn: Bilge Yucel
GitHub: @bilgeyucel
Bilge Yücel
Developer Advocate 🥑
deepset
Slide 3
Agenda
• Text Embeddings
• Vector Databases
• Retrieval
• LLMs
• Building a Generative QA App
Slide 4
Text Embeddings
01
Slide 5
Text Embeddings - Vectors
• Numerical representations of text that computers can work with
• Different techniques:
⚬ Sparse: TF-IDF, BM25...
⚬ Dense: Trained models (Sentence Transformers, Cohere, OpenAI...)
• Dense embeddings often have 768 dimensions
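
To make the sparse technique named above concrete, here is a minimal sketch of TF-IDF vectors using only the standard library. The function names (`tokenize`, `tfidf_vectors`) and the sample documents are illustrative assumptions, not part of the talk; dense embeddings would instead come from a trained model.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and split on whitespace; real tokenizers are more careful.
    return text.lower().split()

def tfidf_vectors(docs):
    """Return the sorted vocabulary and one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Term frequency times inverse document frequency, 0.0 for absent terms.
        vec = [
            (counts[t] / len(toks)) * math.log(n_docs / df[t]) if counts[t] else 0.0
            for t in vocab
        ]
        vectors.append(vec)
    return vocab, vectors

# Hypothetical sample documents, for illustration only.
docs = ["pizza in hamburg", "python in hamburg", "latin music"]
vocab, vectors = tfidf_vectors(docs)
```

Each vector has one slot per vocabulary word, so most entries are zero. That is what "sparse" refers to, and it is why terms that appear in fewer documents (such as "pizza" here) get a higher weight than shared ones.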
Slide 6
Vector Databases
02
Slide 7
Vector Databases
• Databases that store high-dimensional vectors
• Optimized for vectors:
⚬ Vector search
⚬ CRUD operations
⚬ Metadata filtering
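
The three capabilities listed above can be sketched with a tiny in-memory store. `TinyVectorStore` and its method names are hypothetical and written for illustration; a real vector database adds indexing, persistence, and scaling on top of the same ideas.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class TinyVectorStore:
    """Hypothetical in-memory store, not the API of any real database."""

    def __init__(self):
        self._items = {}  # id -> (vector, metadata)

    def upsert(self, item_id, vector, metadata=None):
        # Create or update an entry.
        self._items[item_id] = (list(vector), dict(metadata or {}))

    def get(self, item_id):
        # Read an entry, or None if it does not exist.
        return self._items.get(item_id)

    def delete(self, item_id):
        self._items.pop(item_id, None)

    def search(self, query, top_k=3, filters=None):
        # Keep entries whose metadata matches every filter, then rank by similarity.
        filters = filters or {}
        hits = []
        for item_id, (vec, meta) in self._items.items():
            if all(meta.get(k) == v for k, v in filters.items()):
                hits.append((cosine(query, vec), item_id))
        hits.sort(reverse=True)
        return [item_id for _, item_id in hits[:top_k]]

store = TinyVectorStore()
store.upsert("a", [1.0, 0.0], {"lang": "en"})
store.upsert("b", [0.9, 0.1], {"lang": "de"})
store.upsert("c", [0.0, 1.0], {"lang": "en"})
```

Calling `store.search([1.0, 0.0], top_k=2, filters={"lang": "en"})` only considers the English entries before ranking, which is the metadata filtering step in miniature.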
Slide 8
Retrieval
03
Slide 9
Retrieval
• Getting the information most relevant to the query
• Used for semantic search, question answering, and more
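
As one possible illustration of retrieval, the sketch below ranks documents against a query with BM25, one of the sparse techniques mentioned earlier. The function name `bm25_rank`, the parameter defaults `k1=1.5` and `b=0.75`, and the sample documents are assumptions chosen for the example; it also assumes a non-empty document list.

```python
import math
from collections import Counter

def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices sorted from most to least relevant to the query."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        counts = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in counts:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            tf = counts[term]
            # Term frequency saturates with k1 and is normalized by document length via b.
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(toks) / avg_len))
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)

# Hypothetical sample documents, for illustration only.
docs = [
    "haystack is an open source llm framework",
    "hamburg python pizza is a python meetup",
    "latin music is great for dancing",
]
ranking = bm25_rank("python framework", docs)
```

A semantic search system would swap the keyword scoring for similarity between dense embeddings, but the shape of the step stays the same: score every candidate against the query and return the best ones.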
Slide 10
LLMs
04
Slide 11
Large Language Models (LLMs)
• Language models with a very large number of parameters, trained on large amounts of text
• Human-like output
• Text generation: summarization, generative QA, writing code, chat…
Slide 12
Building a Generative QA App
05
Slide 13
Prompting
Slide 14
Retrieval Augmented Generation (RAG)
Slide 15
Retrieval Augmented Generation (RAG)
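
A rough sketch of the RAG idea: retrieve the documents most relevant to the question, place them in a prompt, and hand that prompt to an LLM. Everything here is an assumption made for illustration: the word-overlap retriever stands in for a real one, the prompt wording is invented, and `fake_llm` is a placeholder for an actual model call.

```python
def retrieve(query, docs, top_k=2):
    """Rank documents by shared query words (a stand-in for a real retriever)."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(d.lower().split())), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no words with the query.
    return [d for score, d in scored[:top_k] if score > 0]

def build_prompt(query, context_docs):
    # Prompting: instructions, retrieved context, and the question in one string.
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
        "Answer:"
    )

def rag_answer(query, docs, llm):
    # llm is any callable that maps a prompt string to generated text.
    return llm(build_prompt(query, retrieve(query, docs)))

def fake_llm(prompt):
    # Placeholder so the sketch runs without a model; replace with a real LLM call.
    return f"(model output for a prompt of {len(prompt)} characters)"

# Hypothetical sample documents, for illustration only.
docs = [
    "Haystack is an open source framework",
    "Istanbul is a city in Turkey",
    "Pizza comes in many varieties",
]
query = "what is haystack"
prompt = build_prompt(query, retrieve(query, docs))
answer = rag_answer(query, docs, fake_llm)
```

Because the model only sees what the retriever selected, the quality of the answer depends on both steps: a good prompt cannot make up for missing context.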
Slide 16
@bilgeycl
@Haystack_AI
Haystack
What is Haystack?
● Fully open-source framework built in Python for custom LLM applications
● Provides tools that developers need to build state-of-the-art NLP systems
● Building blocks: Pipelines & Components
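
To give a feel for the "Pipelines & Components" idea, here is a toy sketch in plain Python. It is deliberately not Haystack's API: `ToyPipeline`, its `add` and `run` methods, and the three component functions are hypothetical names chosen for this example. Refer to the Haystack documentation for the real classes.

```python
class ToyPipeline:
    """Toy pipeline: runs named components in the order they were added."""

    def __init__(self):
        self._steps = []

    def add(self, name, component):
        # A component is any callable that takes the previous step's output.
        self._steps.append((name, component))
        return self

    def run(self, data):
        for _name, component in self._steps:
            data = component(data)
        return data

def normalize(text):
    return text.strip()

def split_sentences(text):
    return [s.strip() for s in text.split(".")]

def drop_empty(sentences):
    return [s for s in sentences if s]

pipeline = (
    ToyPipeline()
    .add("normalize", normalize)
    .add("split", split_sentences)
    .add("drop_empty", drop_empty)
)
result = pipeline.run("  Haystack is open source. It is built in Python. ")
```

The point of the pattern is that each component does one job and can be swapped independently, for example replacing a keyword retriever with an embedding-based one without touching the rest of the pipeline.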