Slide 1

Slide 1 text

Building a RAG-powered AI chat app with Python and VS Code aka.ms/rag-postgres-slides

Slide 2

Slide 2 text

Today we’ll discuss… •LLMs: Large Language Models •RAG: Retrieval Augmented Generation •RAG on PostgreSQL

Slide 3

Slide 3 text

LLMs

Slide 4

Slide 4 text

LLM: Large Language Model An LLM is a model that is so large that it achieves general-purpose language understanding and generation. Review: This movie sucks. Sentiment: negative Review: I love this movie: Sentiment: Input LLM positive Output

Slide 5

Slide 5 text

Popular LLMs Model # of Parameters Creator Uses GPT 3.5 175 B OpenAI ChatGPT, Copilots, APIs GPT 4 Undisclosed OpenAI Gemini Undisclosed Google Bard Claude 2,3 130 B Anthropic APIs LlaMA 70 B Meta OSS Mistral-7B, Mixtral 7 B Mistral AI OSS Phi-3 3.8B Microsoft Research OSS

Slide 6

Slide 6 text

Using OpenAI LLMs: Python response = openai.chat.completions.create( messages = [ { "role": "system", "content": "You are a helpful assistant with very flowery language" }, { "role": "user", "content": "what python frameworks does pamela use?" } ]) for event in response: print(event.choices[0].message.content) https://github.com/pamelafox/python-openai-demos

Slide 7

Slide 7 text

The limitations of LLMs Outdated public knowledge No internal knowledge

Slide 8

Slide 8 text

Incorporating domain knowledge Prompt engineering Fine tuning Retrieval Augmented Generation In-context learning Learn new skills (permanently) Learn new facts (temporarily)

Slide 9

Slide 9 text

Retrieval Augmented Generation

Slide 10

Slide 10 text

RAG on documents Search deploying-django-apps-to-azure.html: Since I'm on the Python team, I've been working on templates for the top Python frameworks, Django, Flask, and FastAPI.\nMy first finished template is for a Django app with PostgreSQL database that gets deployed to Azure App Service with a PostgreSQL flexible server. … Large Language Model Pamela uses the following Python frameworks: Flask 1, FastAPI 2, Django3 User Question What python frameworks does Pamela use?

Slide 11

Slide 11 text

The importance of the search step Garbage in, garbage out: If the search results don’t contain a good answer, the LLM will be unable to answer or will answer wrongly. Noisy input: If the LLM receives too much information, it may not find the correct answer amidst the noise. Source: Lost in the Middle: How Language Models Use Long Contexts, Liu et al. arXiv:2307.03172 50 55 60 65 70 75 5 15 25 Accuracy Number of documents in input context

Slide 12

Slide 12 text

Optimal search strategy Vector Keywords Fusion (RRF) Reranking model Vector search is best for finding semantically related matches Keyword search is best for exact matches (proper nouns, numbers, etc) Hybrid search combines vector search and keyword search, optimally using Reciprocal-Rank-Fusion for merging results and a ML model to re-rank results after aka.ms/ragrelevance

Slide 13

Slide 13 text

What is the RAG searching? Documents (Unstructured data) PDFs, docx, pptx, md, html, images You need an ingestion process for extracting, splitting, vectorizing, and storing document chunks. Azure options: Azure AI Search with Document Intelligence, OpenAI embedding models, Integrated Vectorization Database rows (Structured data) PostgreSQL, MongoDB, Qdrant, etc. You need a way to vectorize & search target columns. Azure options: • PostgreSQL+pgvector • CosmosMongoDB+vector • Container Apps services (Milvus, Qdrant, Weaviate) + OpenAI embedding models aka.ms/ragchat aka.ms/rag-postgres

Slide 14

Slide 14 text

RAG on PostgreSQL aka.ms/rag-postgres

Slide 15

Slide 15 text

RAG on database rows Search [12]: Name: Summit Climbing Harness Description: Conquer the highest peaks with the Raptor Elite Summit Climbing Harness. This durable and lightweight harness, available in bold red, provides maximum comfort and safety while scaling tricky routes. Price:109.99 … Large Language Model Consider these products: Summit Climbing Harness1 Apex Climbing Harness2 User Question Do you sell climbing gear?

Slide 16

Slide 16 text

Example database schema class Item(Base): __tablename__ = "items" id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True) type: Mapped[str] = mapped_column() brand: Mapped[str] = mapped_column() name: Mapped[str] = mapped_column() description: Mapped[str] = mapped_column() price: Mapped[float] = mapped_column() embedding: Mapped[Vector] = mapped_column(Vector(1536)) Defined with SQLAlchemy ORM:

Slide 17

Slide 17 text

RAG with hybrid search Embedding Model Large Language Model Consider these products: Summit Climbing Harness1 Apex Climbing Harness2 User Question Do you sell climbing gear? [[0.0014615238, - 0.015594152, - 0.0072768144, - 0.012787478,…] “Do you sell…” “Do you sell…” [12]: Name: Summit Climbing Harness Description: Conquer the highest peaks with the Raptor Elite Summit Climbing Harness. This durable and lightweight harness, available in bold red, provides maximum comfort and safety while scaling tricky routes. Price:109.99 … “Do you sell…” Hybrid Search

Slide 18

Slide 18 text

Step 1: Vectorize user query embedding = await openai_client.embeddings.create( model=model_name, input=user_query )

Slide 19

Slide 19 text

Step 2: Hybrid search of PostgreSQL table WITH semantic_search AS ( SELECT id, RANK () OVER (ORDER BY embedding <=> %(embedding)s) AS rank FROM items ORDER BY embedding <=> %(embedding)s LIMIT 20 ), keyword_search AS ( SELECT id, RANK () OVER (ORDER BY ts_rank_cd(to_tsvector('english’, description), query) DESC) FROM items, plainto_tsquery('english', %(query)s) query WHERE to_tsvector('english', description) @@ query ORDER BY ts_rank_cd(to_tsvector('english', description), query) DESC LIMIT 20 ) SELECT COALESCE(semantic_search.id, keyword_search.id) AS id, COALESCE(1.0 / (%(k)s + semantic_search.rank), 0.0) + COALESCE(1.0 / (%(k)s + keyword_search.rank), 0.0) AS score FROM semantic_search FULL OUTER JOIN keyword_search ON semantic_search.id = keyword_search.id ORDER BY score DESC LIMIT 5 https://github.com/pgvector/pgvector-python/blob/master/examples/hybrid_search_rrf.py

Slide 20

Slide 20 text

RAG with filters

Slide 21

Slide 21 text

RAG with filters Embedding Model Large Language Model We offer 2 climbing bags for your budget: SummitStone Chalk Bag 1 Guardian Blue Chalk Bag 2 User Question Do you sell climbing gear cheaper than $30? [[0.0014615238, - 0.015594152, - 0.0072768144, - 0.012787478,…] “Do you sell…” “Do you sell…” [12]: Name: SummitStone Chalk Bag Price:29.99 Brand:Grolltex Type:Climbing Description: The SummitStone Chalk Bag in forest green is a must- have for climbers seeking adventure. … “Do you sell…” Hybrid Search price < 30

Slide 22

Slide 22 text

Query rewriting phase User Question Do you sell climbing gear cheaper than $30? Large Language Model with function calling search_database( "climbing_gear", {"operator" : "<", "value" : "30" } )

Slide 23

Slide 23 text

RAG with filters Embedding Model LLM We offer 2 climbing bags for your budget: SummitStone Chalk Bag 1 Guardian Blue Chalk Bag 2 User Question Do you sell climbing gear cheaper than $30? [[0.0014615238, - 0.015594152, - 0.0072768144, - 0.012787478,…] ”climbing gear” “Do you sell…” [12]: Name: SummitStone Chalk Bag Price:29.99 Brand:Grolltex Type:Climbing Description: The SummitStone Chalk Bag in forest green is a must- have for climbers seeking adventure. … Hybrid Search price < 30 LLM with function calling “Do you sell…” “climbing gear”

Slide 24

Slide 24 text

Get started with RAG

Slide 25

Slide 25 text

RAG orchestration libraries Project Languages Langchain https://www.langchain.com/ Python, TypeScript, Java Llamaindex https://docs.llamaindex.ai/ Python, TypeScript (Microsoft) Semantic Kernel https://github.com/microsoft/semantic-kernel Python, .NET (Microsoft) PromptFlow https://github.com/microsoft/promptflow Python

Slide 26

Slide 26 text

RAG chat app starter repositories GitHub repository Technologies pamelafox/rag-on-postgres Python backend, React frontend, PostgreSQL Azure-Samples/azure-search-openai-demo Python backend, React frontend, Azure AI search Azure-Samples/azure-search-openai-javascript NodeJS backend, Web components frontend, Azure AI search Azure-Samples/azure-search-openai-demo-csharp .NET backend, Blazor Web Assembly frontend, Azure AI Search Azure-Samples/azure-search-openai-demo-java Java backend, React frontend, Azure AI Search microsoft/chat-copilot .NET backend w/Semantic Kernel, React frontend, Azure AI Search

Slide 27

Slide 27 text

Thank you! Any questions? Find me online at: @pamelafox pamelafox.org