Building RAG with Java and Semantic Kernel

Davide Antelmo
September 06, 2024

Transcript

  1. Agenda
     1. Chat with your private data
     2. What is RAG
     3. Java RAG solution deep dive
  2. Chat with your private data
     • App or Copilot agent
     • API & SDK
     • Azure OpenAI
     • Documents: Word, PDF, Markdown, websites, etc.
     • Business Data & API: databases, data lakes, REST APIs, etc.
  3. Azure AI Chat Java Reference Template
     • All Azure AI App templates: https://aka.ms/azai
     • Java implementation: https://github.com/Azure-Samples/azure-search-openai-demo-java
  4. Azure AI Chat Java Reference Template
     App Service | Azure Container Apps | Azure Kubernetes Service
  5. Getting Started
     1. Project setup options:
        • GitHub Codespaces
        • VS Code with Dev Containers extension
        • Local development: Java 17, Maven 3.8.x, Azure Developer CLI, Node.js 14+, Git, PowerShell 7+ (pwsh)
     2. cd into one deployment option folder:
        • deploy/aca
        • deploy/aks
        • deploy/app-service
     3. Run ‘azd auth login’ and ‘azd up’.
  6. Working with Open AI
     • Azure Open AI, hosted on Azure: GPT-35-turbo, GPT-4o, ADA
     • Chat Completion API: AI app devs send a prompt and receive a completion
     • Embedding API: text → embeddings [ 13 33 34 … ]
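Embedding vectors like the one on this slide are typically compared by similarity rather than read directly. The sketch below shows cosine similarity, a common measure for this. The class name and the toy vectors are illustrative assumptions and do not come from the deck or the sample repository.

```java
final class CosineSimilaritySketch {
    private CosineSimilaritySketch() {}

    /** Returns the cosine of the angle between two vectors of equal length. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score 0, which is the property a vector search relies on to rank text chunks against a query.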
  7. Retrieval Augmented Generation flow – for private documents
     [Diagram] Two flows over Documents, a Retrieval System (Full Text, Vector Store), Azure Storage, Azure AI Services Document Intelligence and Azure Open AI (gpt-35-turbo, gpt4, ada-2):
     • Chat Flow: Ask question, Search Info, Generate Answer (Info Retriever, AI Orchestration)
     • Indexing Flow: Move to cloud, Push/Pull data, Text extract, Index content, Generate Embeddings (Data Loading, Data Chunking, Embeddings Generation)
     • Actors: Users, Admins (Configure/Manage)
  8. Retrieval Augmented Generation – Building blocks
     • LLM Models: Gpt-3.5-turbo, Gpt-4-turbo, Gpt-4o, ADA-002
     • Vector Storage: Embeddings, Semantic search
     • Prompt Templating: Handlebars, Jinja2
     • Orchestration: Chat History, Function Calls, Planning, Agents, Built-in chains
     • Documents Processing: Documents Loader, Document Parser, Text Chunker
  9. Semantic Kernel
     https://github.com/microsoft/semantic-kernel-java
     • Microsoft open source. Languages: .NET, Python, Java.
     • AI Services abstract LLM model capabilities and interactions.
     • Plugins are SK units of work orchestrated by the kernel.
  10. Java RAG Implementation Options
      Plain Java Azure SDK
      • Low-level AI orchestration
      • Useful to understand RAG behind the scenes
      • Streaming support
      • A lot of boilerplate code
      Semantic Kernel
      • Simplified AI orchestration
      • Abstractions for common RAG building blocks
      • Doesn’t support streaming (yet)
  11. Project Structure
      Folder              Description
      app                 App code components
      deploy              Infrastructure code. Bicep and Azure Developer CLI configuration. Indexing scripts.
      app/backend         Chat flow - Java
      app/frontend        Chat UI – React application
      app/indexer         Indexing flow - Java
      deploy/app-service  App Service. 1 app: frontend and backend are colocated in 1 Spring Boot app. Indexer runs locally.
      deploy/aca          Azure Container Apps. Microservice architecture. 2 Spring Boot apps. 1 Nginx app.
      deploy/aks          Azure Kubernetes Service. Microservice architecture. 2 Spring Boot apps. 1 Nginx app.
      data                Preloaded PDF documents about job descriptions, company policies and benefits
  12. React API client
      • Single Page Application built with React and TypeScript
      • api.ts is the entry point
      https://github.com/Azure-Samples/azure-search-openai-demo-java/tree/main/app/frontend/src/api
  13. Chat Endpoint
      • REST controller providing the entry point for the RAG chat.
      • An APPLICATION_NDJSON_VALUE based API is used for the streaming response.
      • Streaming is supported only for PlainJavaChatApproach.
      https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/controller/ChatController.java
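NDJSON (newline-delimited JSON) puts one complete JSON object on each line, so a client can parse every chunk as soon as it arrives. The sketch below illustrates only that wire format using the Java standard library. It is not the code in ChatController.java: the class name and the "content" field name are assumptions made for illustration, and the real controller relies on Spring.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

final class NdjsonSketch {
    private NdjsonSketch() {}

    /** Escapes characters that must not appear raw inside a JSON string. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Writes each chunk as one JSON object per line and flushes after every line. */
    static void writeChunks(Iterable<String> chunks, Writer out) throws IOException {
        for (String chunk : chunks) {
            // "content" is a hypothetical field name, not taken from the sample app.
            out.write("{\"content\":\"" + escape(chunk) + "\"}\n");
            out.flush(); // push the line to the client without waiting for the rest
        }
    }

    /** Convenience wrapper that collects the NDJSON stream into a string. */
    static String toNdjson(Iterable<String> chunks) {
        StringWriter out = new StringWriter();
        try {
            writeChunks(chunks, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Because a raw newline inside a chunk is escaped, the only line breaks in the stream are the record separators, which is what keeps line-by-line parsing safe.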
  14. Semantic Kernel Plugins and Functions
      RAG plugin with semantic functions:
      • AnswerConversation
      • ExtractKeywords
      azure-search-openai-demo-java/app/backend/src/main/resources/semantickernel/Plugins/RAG at main · Azure-Samples/azure-search-openai-demo-java · GitHub
      InformationFinder plugin with native functions:
      • searchFromConversation
      https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/semantickernel/AzureAISearchPlugin.java
  15. Prompt Template and Chat Completion configuration with SK
      Handlebars templating engine:
      • Parameter replacement.
      • Loops, conditional logic.
      • Direct calls to plugin functions.
      Chat completion parameters:
      • Temperature
      • top_p
      • max_tokens
      https://github.com/Azure-Samples/azure-search-openai-demo-java/tree/main/app/backend/src/main/resources/semantickernel/Plugins/RAG/AnswerConversation
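To make "parameter replacement" concrete, the sketch below substitutes {{name}} placeholders in a prompt string. It is a deliberately simplified stand-in written with java.util.regex, not the Handlebars engine that SK uses, and it covers neither loops, conditionals nor plugin calls. The class name, placeholder names and sample prompt are hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateSketch {
    private TemplateSketch() {}

    // Matches {{name}} and {{ name }}; only plain identifiers are supported.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\{\\{\\s*(\\w+)\\s*\\}\\}");

    /** Replaces every {{name}} placeholder with the matching argument value. */
    static String render(String template, Map<String, String> args) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = args.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Missing argument: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value from being treated specially
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, rendering "Sources: {{sources}}" with sources set to the retrieved text yields the final prompt string that would be sent along with the chat completion parameters.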
  16. Azure AI Search Retriever
      • Uses the Azure AI Search API to retrieve documents from the search index.
      • Query keywords are extracted from the whole chat conversation with an additional call to Open AI.
      • Retrieval mode: text, vectors, hybrid.
      • The OpenAI embedding API converts the user's query text to an embeddings vector (vector or hybrid mode).
      • Hybrid search improves search results by mixing text search and vector search.
      • Can be further simplified with the SK VectorStore abstraction and its AzureAISearchVectorStore implementation: it provides features for performing similarity searches over databases, so there is no need to create an explicit database search plugin.
      https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/AzureAISearchRetriever.java
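Azure AI Search merges the text and vector result lists of a hybrid query with Reciprocal Rank Fusion (RRF). The service does this server-side, so the sample app does not implement it. The sketch below only illustrates the idea. The class name, the document ids and the choice of k = 60 (a commonly used constant) are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RrfSketch {
    private RrfSketch() {}

    /**
     * Fuses several ranked lists of document ids.
     * Each document scores the sum of 1 / (k + rank) over the lists it appears in,
     * where rank starts at 1 for the top result.
     */
    static List<String> fuse(List<List<String>> rankings, int k) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                scores.merge(ranking.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        // Highest score first; ties are broken by id to keep the order deterministic.
        fused.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return fused;
    }
}
```

A document that ranks well in both lists rises above one that appears in only a single list, which is why hybrid retrieval can outperform either mode alone.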
  17. Azure AI Search Retriever as SK native plugin
      • Plain Java class.
      • @DefineKernelFunction: class method annotation.
      • @DefineFunctionParameter: method parameter annotation.
      https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/semantickernel/AzureAISearchPlugin.java
  18. RAG Flow with SK
      1. Configure the kernel:
         • AI service backed by the AzureOpenAI client.
         • AnswerConversation semantic plugin function as an external file.
         • InformationFinder native plugin function as a decorated Java class.
      2. Implement the chat flow:
         • Retrieve relevant documents using the chat conversation: ask the kernel to trigger the SearchFromConversation native plugin function.
         • Build an SK function context with the retrieved sources and the chat conversation.
         • Ask the kernel to generate an answer using the AnswerConversation function from the RAG plugin, providing the function arguments.
      https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/semantickernel/JavaSemanticKernelChainsChatApproach.java
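The chat flow described on this slide reduces to "retrieve, then generate". The sketch below shows that shape in plain Java with no Semantic Kernel dependency. The two interfaces are hypothetical stand-ins for the native search function and the AnswerConversation prompt function, not SK types, and the real flow goes through the kernel.

```java
import java.util.List;

final class RagFlowSketch {
    private RagFlowSketch() {}

    /** Hypothetical stand-in for the native search function of the InformationFinder plugin. */
    interface Retriever {
        List<String> search(String conversation);
    }

    /** Hypothetical stand-in for the AnswerConversation prompt function. */
    interface AnswerFunction {
        String answer(String sources, String conversation);
    }

    /** Step 1: retrieve sources for the conversation. Step 2: generate an answer grounded in them. */
    static String chat(String conversation, Retriever retriever, AnswerFunction answerFunction) {
        List<String> sources = retriever.search(conversation);
        // The retrieved sources and the conversation become the function arguments.
        String joinedSources = String.join("\n", sources);
        return answerFunction.answer(joinedSources, conversation);
    }
}
```

Keeping the retriever and the answer function behind small interfaces mirrors the plugin split on the slide: either side can be swapped or tested with a stub.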
  19. CLI AddCommand
      https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/indexer/cli/src/main/java/com/microsoft/openai/samples/indexer/CLI.java
      • Picocli-based implementation.
      • Triggered by the deploy/app-service/scripts/prepdocs scripts.
      AddCommand - indexing running locally (App Service option):
      1. Create the Azure AI Search index fields.
      2. Scan the local directory.
      3. Use DocumentProcessor to orchestrate indexing.
      4. Upload each file to Azure Blob Storage for citation details.
  20. DocumentProcessor
      Orchestrates the document processing logic:
      1. Parse the PDF file into pages.
      2. Split the pages into text chunks.
      3. Load the text chunks into the Azure AI Search index.
      Embeddings for the text chunks are generated and stored in the Azure AI Search index.
      https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/indexer/core/src/main/java/com/microsoft/openai/samples/indexer/DocumentProcessor.java
  21. PDF Parser with Document Intelligence
      • Uses Azure Document Intelligence OCR capabilities.
      • Tabular data is converted to HTML table text.
      • For simple documents (no tabular data), use the local PDF parser ItextPdfParser.java.
      https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/indexer/core/src/main/java/com/microsoft/openai/samples/indexer/parser/Documen ntelligencePDFParser.java
  22. Text Splitter
      https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/indexer/core/src/main/java/com/microsoft/openai/samples/indexer/parser/TextSplitter.java
      • Splits page text into smaller sections.
      • Handles section overlaps.
      • Handles text containing HTML tables.
      Section length and overlap can be configured. Defaults:
      • Section length: 1000 chars
      • Overlap length: 100 chars
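A minimal sketch of fixed-length splitting with overlap, using the slide's defaults (1000 and 100 chars) in the usage note. It is not the TextSplitter.java implementation: the class name is hypothetical, and this version ignores sentence boundaries and the HTML table handling mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

final class OverlapSplitterSketch {
    private OverlapSplitterSketch() {}

    /**
     * Splits text into sections of at most sectionLength chars,
     * where each section repeats the last `overlap` chars of the previous one.
     */
    static List<String> split(String text, int sectionLength, int overlap) {
        if (sectionLength <= 0 || overlap < 0 || overlap >= sectionLength) {
            throw new IllegalArgumentException("Require 0 <= overlap < sectionLength");
        }
        List<String> sections = new ArrayList<>();
        int step = sectionLength - overlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + sectionLength, text.length());
            sections.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last section reached the end of the text
            }
        }
        return sections;
    }
}
```

With the defaults, split(text, 1000, 100) on a 2500-char page produces three sections covering chars 0-1000, 900-1900 and 1800-2500. The overlap lowers the chance that a sentence relevant to a query is cut in half at a section boundary.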
  23. Index Manager
      Creates index resources in Azure AI Search and uploads text sections with their related embeddings vectors to the index:
      • The text chunk is stored in the ‘content’ index field.
      • The embeddings vector is stored in the ‘embedding’ index field.
      • Original file name, page numbers and category are stored as additional metadata.
      EmbeddingService abstracts the use of the Azure Open AI embedding model:
      • Embeddings requests are arranged in batches to improve performance.
      • Retries with an exponential backoff policy in case of HTTP throttling.
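The two EmbeddingService techniques named above, batching and retry with exponential backoff, can be sketched with the standard library alone. This is not the sample's EmbeddingService: the class and method names are hypothetical, and the sketch retries on any RuntimeException, whereas a real client would retry only on throttling responses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class EmbeddingBatchSketch {
    private EmbeddingBatchSketch() {}

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            result.add(new ArrayList<>(items.subList(i, Math.min(i + batchSize, items.size()))));
        }
        return result;
    }

    /** Delay before retry number `attempt` (0-based): baseMillis * 2^attempt, capped at maxMillis. */
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis << Math.min(attempt, 20); // cap the shift to avoid overflow
        return Math.min(delay, maxMillis);
    }

    /** Runs the call, sleeping with exponential backoff between failed attempts. */
    static <T> T withRetry(Supplier<T> call, int maxAttempts, long baseMillis, long maxMillis) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be at least 1");
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e; // assumption: every failure is treated as retryable
                if (attempt == maxAttempts - 1) break;
                try {
                    Thread.sleep(backoffMillis(attempt, baseMillis, maxMillis));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted during backoff", ie);
                }
            }
        }
        throw last; // all attempts failed
    }
}
```

Batching cuts the number of round trips to the embedding endpoint, and the growing delay gives a throttled service time to recover instead of being hit again immediately.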
  24. Custom Data Ingestion/Indexing
      • Custom PDF in data folder
      • It’s not a delta process
      • Java process running locally when the App Service deployment is selected.
      • Indexer microservice running on the container orchestrator when ACA or AKS is selected.
  25. Learn More
      • Intro to intelligent apps, grounding, vector databases
      • Prompt engineering
      • Grounding LLMs
      • Chunking large documents
      • Semantic Kernel
      • Hybrid Retrieval in Azure AI Search