Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Building a RAG-powered AI chat app with Python and VS Code

Building a RAG-powered AI chat app with Python and VS Code

A presentation for VS Code Day 2024.

Pamela will walk through a Python web app that can chat on your own data using Retrieval-Augmented-Generation (RAG) and large language models like GPT-4. Plus, we'll see how you can use VS Code Dev Containers plus Ollama for a fully local development experience.

Pamela Fox

April 24, 2024
Tweet

More Decks by Pamela Fox

Other Decks in Technology

Transcript

  1. LLM: Large Language Model An LLM is a model that

    is so large that it achieves general-purpose language understanding and generation. Review: This movie sucks. Sentiment: negative Review: I love this movie: Sentiment: Input LLM positive Output
  2. Popular LLMs Model # of Parameters Creator Uses GPT 3.5

    175 B OpenAI ChatGPT, Copilots, APIs GPT 4 Undisclosed OpenAI Gemini Undisclosed Google Bard Claude 2,3 130 B Anthropic APIs LlaMA 70 B Meta OSS Mistral-7B, Mixtral 7 B Mistral AI OSS Phi-3 3.8B Microsoft Research OSS
  3. Using OpenAI LLMs: Python response = openai.chat.completions.create( messages = [

    { "role": "system", "content": "You are a helpful assistant with very flowery language" }, { "role": "user", "content": "what python frameworks does pamela use?" } ]) for event in response: print(event.choices[0].message.content) https://github.com/pamelafox/python-openai-demos
  4. Incorporating domain knowledge Prompt engineering Fine tuning Retrieval Augmented Generation

    In-context learning Learn new skills (permanently) Learn new facts (temporarily)
  5. RAG on documents Search deploying-django-apps-to-azure.html: Since I'm on the Python

    team, I've been working on templates for the top Python frameworks, Django, Flask, and FastAPI.\nMy first finished template is for a Django app with PostgreSQL database that gets deployed to Azure App Service with a PostgreSQL flexible server. … Large Language Model Pamela uses the following Python frameworks: Flask 1, FastAPI 2, Django3 User Question What python frameworks does Pamela use?
  6. The importance of the search step Garbage in, garbage out:

    If the search results don’t contain a good answer, the LLM will be unable to answer or will answer wrongly. Noisy input: If the LLM receives too much information, it may not find the correct answer amidst the noise. Source: Lost in the Middle: How Language Models Use Long Contexts, Liu et al. arXiv:2307.03172 50 55 60 65 70 75 5 15 25 Accuracy Number of documents in input context
  7. Optimal search strategy Vector Keywords Fusion (RRF) Reranking model Vector

    search is best for finding semantically related matches Keyword search is best for exact matches (proper nouns, numbers, etc) Hybrid search combines vector search and keyword search, optimally using Reciprocal-Rank-Fusion for merging results and a ML model to re-rank results after aka.ms/ragrelevance
  8. What is the RAG searching? Documents (Unstructured data) PDFs, docx,

    pptx, md, html, images You need an ingestion process for extracting, splitting, vectorizing, and storing document chunks. Azure options: Azure AI Search with Document Intelligence, OpenAI embedding models, Integrated Vectorization Database rows (Structured data) PostgreSQL, MongoDB, Qdrant, etc. You need a way to vectorize & search target columns. Azure options: • PostgreSQL+pgvector • CosmosMongoDB+vector • Container Apps services (Milvus, Qdrant, Weaviate) + OpenAI embedding models aka.ms/ragchat aka.ms/rag-postgres
  9. RAG on database rows Search [12]: Name: Summit Climbing Harness

    Description: Conquer the highest peaks with the Raptor Elite Summit Climbing Harness. This durable and lightweight harness, available in bold red, provides maximum comfort and safety while scaling tricky routes. Price:109.99 … Large Language Model Consider these products: Summit Climbing Harness1 Apex Climbing Harness2 User Question Do you sell climbing gear?
  10. Example database schema class Item(Base): __tablename__ = "items" id: Mapped[int]

    = mapped_column(primary_key=True, autoincrement=True) type: Mapped[str] = mapped_column() brand: Mapped[str] = mapped_column() name: Mapped[str] = mapped_column() description: Mapped[str] = mapped_column() price: Mapped[float] = mapped_column() embedding: Mapped[Vector] = mapped_column(Vector(1536)) Defined with SQLAlchemy ORM:
  11. RAG with hybrid search Embedding Model Large Language Model Consider

    these products: Summit Climbing Harness1 Apex Climbing Harness2 User Question Do you sell climbing gear? [[0.0014615238, - 0.015594152, - 0.0072768144, - 0.012787478,…] “Do you sell…” “Do you sell…” [12]: Name: Summit Climbing Harness Description: Conquer the highest peaks with the Raptor Elite Summit Climbing Harness. This durable and lightweight harness, available in bold red, provides maximum comfort and safety while scaling tricky routes. Price:109.99 … “Do you sell…” Hybrid Search
  12. Step 2: Hybrid search of PostgreSQL table WITH semantic_search AS

    ( SELECT id, RANK () OVER (ORDER BY embedding <=> %(embedding)s) AS rank FROM items ORDER BY embedding <=> %(embedding)s LIMIT 20 ), keyword_search AS ( SELECT id, RANK () OVER (ORDER BY ts_rank_cd(to_tsvector('english’, description), query) DESC) FROM items, plainto_tsquery('english', %(query)s) query WHERE to_tsvector('english', description) @@ query ORDER BY ts_rank_cd(to_tsvector('english', description), query) DESC LIMIT 20 ) SELECT COALESCE(semantic_search.id, keyword_search.id) AS id, COALESCE(1.0 / (%(k)s + semantic_search.rank), 0.0) + COALESCE(1.0 / (%(k)s + keyword_search.rank), 0.0) AS score FROM semantic_search FULL OUTER JOIN keyword_search ON semantic_search.id = keyword_search.id ORDER BY score DESC LIMIT 5 https://github.com/pgvector/pgvector-python/blob/master/examples/hybrid_search_rrf.py
  13. RAG with filters Embedding Model Large Language Model We offer

    2 climbing bags for your budget: SummitStone Chalk Bag 1 Guardian Blue Chalk Bag 2 User Question Do you sell climbing gear cheaper than $30? [[0.0014615238, - 0.015594152, - 0.0072768144, - 0.012787478,…] “Do you sell…” “Do you sell…” [12]: Name: SummitStone Chalk Bag Price:29.99 Brand:Grolltex Type:Climbing Description: The SummitStone Chalk Bag in forest green is a must- have for climbers seeking adventure. … “Do you sell…” Hybrid Search price < 30
  14. Query rewriting phase User Question Do you sell climbing gear

    cheaper than $30? Large Language Model with function calling search_database( "climbing_gear", {"operator" : "<", "value" : "30" } )
  15. RAG with filters Embedding Model LLM We offer 2 climbing

    bags for your budget: SummitStone Chalk Bag 1 Guardian Blue Chalk Bag 2 User Question Do you sell climbing gear cheaper than $30? [[0.0014615238, - 0.015594152, - 0.0072768144, - 0.012787478,…] ”climbing gear” “Do you sell…” [12]: Name: SummitStone Chalk Bag Price:29.99 Brand:Grolltex Type:Climbing Description: The SummitStone Chalk Bag in forest green is a must- have for climbers seeking adventure. … Hybrid Search price < 30 LLM with function calling “Do you sell…” “climbing gear”
  16. RAG orchestration libraries Project Languages Langchain https://www.langchain.com/ Python, TypeScript, Java

    Llamaindex https://docs.llamaindex.ai/ Python, TypeScript (Microsoft) Semantic Kernel https://github.com/microsoft/semantic-kernel Python, .NET (Microsoft) PromptFlow https://github.com/microsoft/promptflow Python
  17. RAG chat app starter repositories GitHub repository Technologies pamelafox/rag-on-postgres Python

    backend, React frontend, PostgreSQL Azure-Samples/azure-search-openai-demo Python backend, React frontend, Azure AI search Azure-Samples/azure-search-openai-javascript NodeJS backend, Web components frontend, Azure AI search Azure-Samples/azure-search-openai-demo-csharp .NET backend, Blazor Web Assembly frontend, Azure AI Search Azure-Samples/azure-search-openai-demo-java Java backend, React frontend, Azure AI Search microsoft/chat-copilot .NET backend w/Semantic Kernel, React frontend, Azure AI Search