Agenda
1. Chat with your private data
2. What is RAG
3. Java RAG solution deep dive
Slide 3
Chat with your private data
App or Copilot agent
API & SDK
Azure OpenAI
Documents: Word, PDF, Markdown, websites, etc.
Business Data & API: databases, data lakes, REST APIs, etc.
Slide 4
Azure AI Chat Java Reference Template
All Azure AI App templates: https://aka.ms/azai
Java implementation: https://github.com/Azure-Samples/azure-search-openai-demo-java
Slide 5
Azure AI Chat Java Reference Template
App Service | Azure Container Apps | Azure Kubernetes Service
Slide 6
Getting Started
1. Project setup options:
• GitHub Codespaces
• VS Code with Dev Containers extension
• Local development, which requires:
  • Java 17
  • Maven 3.8.x
  • Azure Developer CLI
  • Node.js 14+
  • Git
  • PowerShell 7+ (pwsh)
2. cd into one of the deployment option folders:
• deploy/aca
• deploy/aks
• deploy/app-service
3. Run ‘azd auth login’ and then ‘azd up’
Slide 7
What is RAG
Slide 8
Working with Open AI
Devs → AI App
Azure Open AI (hosted on Azure)
• Chat Completion API (GPT-35-turbo, GPT-4o): prompt → completion
• Embedding API (ADA): text → embeddings [ 13 33 34 … ]
Slide 9
Retrieval Augmented Generation flow – for private documents
Indexing Flow (Admins: Configure/Manage): Move to cloud → Push/Pull data → Text extract → Generate Embeddings → Index content
Chat Flow (Users): Ask question → Search Info → Generate Answer
Components: Documents, Azure Storage, Azure AI Services Document Intelligence, Azure Open AI (gpt-35-turbo, gpt4, ada-2), Retrieval System (Full Text, Vector Store), Info Retriever
AI Orchestration: Data Loading, Data Chunking, Embeddings, Generation
Semantic Kernel
https://github.com/microsoft/semantic-kernel-java
Microsoft Open Source
Languages: .NET, Python, Java
AI Services abstract LLM model capabilities and interactions.
Plugins are the SK units of work orchestrated by the kernel.
Slide 13
Java RAG Implementation Options
Plain Java Azure SDK
• Low-level AI orchestration
• Useful to understand how RAG works behind the scenes
• Streaming support
• A lot of boilerplate code
Semantic Kernel
• Simplified AI orchestration
• Abstractions for common RAG building blocks
• Doesn’t support streaming (yet)
Slide 14
Project Structure
Folder              Description
app                 App code components
deploy              Infrastructure code: Bicep and Azure Developer CLI configuration, indexing scripts.
app/backend         Chat flow (Java)
app/frontend        Chat UI (React application)
app/indexer         Indexing flow (Java)
deploy/app-service  App Service. 1 app: frontend and backend are colocated in 1 Spring Boot app. The indexer runs locally.
deploy/aca          Azure Container Apps. Microservice architecture: 2 Spring Boot apps, 1 Nginx app.
deploy/aks          Azure Kubernetes Service. Microservice architecture: 2 Spring Boot apps, 1 Nginx app.
data                Preloaded PDF documents about job descriptions, company policies and benefits
Slide 15
Implementing RAG Chat flow
Slide 16
React API client
Single Page Application
React and TypeScript
api.ts entry point
https://github.com/Azure-Samples/azure-search-openai-demo-java/tree/main/app/frontend/src/api
Slide 17
Chat Endpoint
REST controller providing the entry point for the RAG chat.
An APPLICATION_NDJSON_VALUE-based API is used for streaming responses.
Streaming is supported only for PlainJavaChatApproach.
https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/controller/ChatController.java
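To make the NDJSON framing used for streaming concrete, here is a minimal sketch in plain Java. It is not the sample's ChatController: the class name NdjsonSketch, the helper names, and the "content" field are hypothetical and chosen only for illustration. The point is that each streamed chunk is a complete JSON object on its own line, so the client can parse chunks as they arrive.

```java
import java.util.List;

/** Hypothetical sketch of NDJSON framing; not the sample's ChatController. */
final class NdjsonSketch {

    /** Renders s as a JSON string literal, escaping quotes, backslashes and control chars. */
    static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** One streamed chunk = one complete JSON object terminated by a newline. */
    static String toNdjsonLine(String content) {
        // "content" is an assumed field name, not the sample's response schema.
        return "{\"content\":" + jsonString(content) + "}\n";
    }

    public static void main(String[] args) {
        // Each partial answer is emitted as its own line as soon as it is available.
        for (String chunk : List.of("Your plan ", "covers \"dental\"", " care.\n")) {
            System.out.print(toNdjsonLine(chunk));
        }
    }
}
```

Because newlines inside the text are escaped, the only raw newline in each emitted line is the terminator, which is what lets a client split the stream line by line.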
Slide 18
Semantic Kernel Plugins and Functions
RAG plugin with semantic functions
• AnswerConversation
• ExtractKeywords
azure-search-openai-demo-java/app/backend/src/main/resources/semantickernel/Plugins/RAG at main · Azure-Samples/azure-search-openai-demo-java · GitHub
InformationFinder plugin with native functions
• searchFromConversation
https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/semantickernel/AzureAISearchPlugin.java
Slide 19
Prompt Template and Chat Completion configuration with SK
Handlebars templating engine
• Parameter replacement.
• Loops, conditional logic.
• Direct calls to plugin functions.
Chat completion parameters:
• Temperature
• top_p
• max_tokens
https://github.com/Azure-Samples/azure-search-openai-demo-java/tree/main/app/backend/src/main/resources/semantickernel/Plugins/RAG/AnswerConversation
Slide 20
Azure AI Search Retriever
• Uses the Azure AI Search API to retrieve documents from the search index.
• Query keywords are extracted from the whole chat conversation with an additional call to Open AI.
• Retrieval modes: text, vectors, hybrid.
• Uses the OpenAI embedding API to convert the user's query text to an embeddings vector (vectors or hybrid mode).
• Hybrid search improves search results by mixing text search and vector search.
• Can be further simplified with the SK VectorStore abstraction and its AzureAISearchVectorStore implementation: there is no need to create an explicit database search plugin, and it provides features for performing similarity searches over databases.
https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/AzureAISearchRetriever.java
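For intuition on how a hybrid mode can mix a text ranking with a vector ranking, here is a minimal sketch of Reciprocal Rank Fusion (RRF) in plain Java. This is an illustration only: in the sample, the merge happens inside Azure AI Search, not in application code. The class name HybridFusionSketch, the method name fuse, the document ids, and the constant k = 60 are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of Reciprocal Rank Fusion over two ranked result lists. */
final class HybridFusionSketch {

    /**
     * Each document scores 1 / (k + rank) in every list where it appears
     * (rank starts at 1); scores are summed and sorted in descending order.
     */
    static List<String> fuse(List<String> textRanking, List<String> vectorRanking, int k) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> ranking : List.of(textRanking, vectorRanking)) {
            for (int rank = 0; rank < ranking.size(); rank++) {
                scores.merge(ranking.get(rank), 1.0 / (k + rank + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        // List.sort is stable, so ties keep their first-seen order.
        fused.sort(Comparator.comparingDouble((String id) -> scores.get(id)).reversed());
        return fused;
    }

    public static void main(String[] args) {
        List<String> text = List.of("benefits.pdf", "handbook.pdf", "roles.pdf");
        List<String> vector = List.of("roles.pdf", "benefits.pdf", "perks.pdf");
        System.out.println(fuse(text, vector, 60));
    }
}
```

Documents that rank well in both lists rise to the top, while a document found by only one retrieval mode is still kept in the fused result.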
Slide 21
Azure AI Search Retriever as SK native plugin
Plain Java class.
@DefineKernelFunction annotation on class methods.
@KernelFunctionParameter annotation on method parameters.
https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/semantickernel/AzureAISearchPlugin.java
Slide 22
RAG Flow with SK
1. Configure the kernel:
• AI service backed by the Azure OpenAI client.
• AnswerConversation semantic plugin function as an external file.
• InformationFinder native plugin function as a decorated Java class.
2. Implement the chat flow:
• Retrieve relevant documents using the chat conversation: ask the kernel to trigger the SearchFromConversation native plugin function.
• Build an SK function context with the retrieved sources and the chat conversation.
• Ask the kernel to generate an answer using the AnswerConversation function from the RAG plugin, providing the function arguments.
https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/semantickernel/JavaSemanticKernelChainsChatApproach.java
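The two-step chat flow above (retrieve, then generate) can be sketched without any Semantic Kernel dependency. The interfaces Retriever and AnswerGenerator below are hypothetical stand-ins for the native search function and the semantic answer function. They are not SK types, and the stubs in main return canned values instead of calling Azure services.

```java
import java.util.List;

/** Hypothetical sketch of the retrieve-then-generate flow; no SK or Azure calls. */
final class RagFlowSketch {

    /** Stand-in for the native search function: conversation -> relevant sources. */
    interface Retriever {
        List<String> searchFromConversation(List<String> conversation);
    }

    /** Stand-in for the semantic answer function: (sources, conversation) -> answer. */
    interface AnswerGenerator {
        String answer(List<String> sources, List<String> conversation);
    }

    static String chat(List<String> conversation, Retriever retriever, AnswerGenerator generator) {
        // Step 1: retrieve documents relevant to the whole conversation.
        List<String> sources = retriever.searchFromConversation(conversation);
        // Step 2: generate an answer grounded in the retrieved sources.
        return generator.answer(sources, conversation);
    }

    public static void main(String[] args) {
        Retriever stubRetriever =
                conversation -> List.of("benefits.pdf#page=2: Dental care is covered.");
        AnswerGenerator stubGenerator = (sources, conversation) ->
                "Answer to '" + conversation.get(conversation.size() - 1)
                        + "' grounded in " + sources.size() + " source(s)";
        System.out.println(chat(List.of("Is dental care covered?"), stubRetriever, stubGenerator));
    }
}
```

Keeping retrieval and generation behind separate interfaces means either step can be swapped (for example text search versus hybrid search) without touching the flow itself.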
Slide 23
Implementing RAG Indexing flow
Slide 24
CLI AddCommand
https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/indexer/cli/src/main/java/com/microsoft/openai/samples/indexer/CLI.java
Picocli-based implementation
Triggered by the deploy/app-service/scripts/prepdocs scripts
AddCommand: indexing runs locally (App Service option):
1. Create Azure AI Search index fields
2. Scan local directory
3. Use DocumentProcessor to orchestrate indexing.
4. Upload file to Azure Blob Storage for citation details
Slide 25
DocumentProcessor
Orchestrates the document processing logic:
1. Parse the PDF file into pages
2. Split pages into text chunks
3. Load text chunks into the Azure AI Search index
Text chunk embeddings are generated and stored in the Azure AI Search index.
https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/indexer/core/src/main/java/com/microsoft/openai/samples/indexer/DocumentProcessor.java
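The parse, split, embed and load steps can be sketched as a small pipeline. This is not the sample's DocumentProcessor: the Section record, the process method and the functional parameters are hypothetical, and the splitter and embedder used in main are toy stand-ins for the real text splitter and the Azure Open AI embedding call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of the parse -> split -> embed -> index pipeline. */
final class DocumentProcessorSketch {

    /** One search-index entry: a text chunk, its origin, and its embedding vector. */
    record Section(String sourceFile, int page, String content, List<Float> embedding) {}

    static List<Section> process(
            String sourceFile,
            List<String> pages,                       // step 1 output: text of each parsed page
            Function<String, List<String>> splitter,  // step 2: page text -> text chunks
            Function<String, List<Float>> embedder) { // stand-in for the embedding model
        List<Section> sections = new ArrayList<>();
        for (int page = 0; page < pages.size(); page++) {
            for (String chunk : splitter.apply(pages.get(page))) {
                sections.add(new Section(sourceFile, page + 1, chunk, embedder.apply(chunk)));
            }
        }
        return sections; // step 3 would upload these sections to the search index
    }

    public static void main(String[] args) {
        List<Section> sections = process(
                "benefits.pdf",
                List.of("Dental is covered. Vision is covered.", "Gym is not covered."),
                text -> List.of(text.split("(?<=\\.) ")),         // toy splitter: one sentence per chunk
                chunk -> List.of((float) chunk.length()));        // toy embedder: a 1-dimensional vector
        sections.forEach(System.out::println);
    }
}
```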
Slide 26
PDF Parser with Document Intelligence
Uses Azure Document Intelligence OCR capabilities
Tabular data is converted to HTML table text
For simple documents (no tabular data), use the local PDF parser ItextPdfParser.java
https://github.com/Azure-Samples/azure-search-openai-demo-
java/blob/main/app/indexer/core/src/main/java/com/microsoft/openai/samples/indexer/parser/Documen
ntelligencePDFParser.java
Slide 27
Text Splitter
https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/indexer/core/src/main/java/com/microsoft/openai/samples/indexer/parser/TextSplitter.java
Splits page text into smaller sections
Handles section overlaps
Handles text containing HTML tables
Section length and overlap can be configured. Defaults:
• Section length: 1000 chars
• Overlap length: 100 chars
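As a rough illustration of section length and overlap, here is a minimal fixed-window splitter in plain Java. It is a simplification and not the sample's TextSplitter: it cuts purely by character count and does not handle sentence boundaries or HTML tables. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: fixed-size character windows with overlap. */
final class OverlappingSplitterSketch {

    /**
     * Splits text into sections of at most sectionLength chars. Each section
     * starts (sectionLength - overlap) chars after the previous one, so
     * consecutive sections share `overlap` chars.
     */
    static List<String> split(String text, int sectionLength, int overlap) {
        if (sectionLength <= 0 || overlap < 0 || overlap >= sectionLength) {
            throw new IllegalArgumentException("require 0 <= overlap < sectionLength");
        }
        List<String> sections = new ArrayList<>();
        int step = sectionLength - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + sectionLength, text.length());
            sections.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last section reached the end of the text
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        // The defaults quoted above: 1000-char sections with a 100-char overlap.
        List<String> sections = split("x".repeat(2500), 1000, 100);
        for (String section : sections) {
            System.out.println(section.length());
        }
    }
}
```

The overlap means a sentence that straddles a section boundary still appears whole in at least one section, at the cost of indexing some text twice.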
Slide 28
Index Manager
Creates index resources in Azure AI Search
Uploads text sections and their embedding vectors to the index
• The text chunk is stored in the ‘content’ index field
• The embedding vector is stored in the ‘embedding’ index field
• Original file name, page numbers and category are stored as additional metadata
EmbeddingService abstracts the use of the Azure Open AI embedding model
• Embedding requests are batched to improve performance
• Retries with an exponential backoff policy in case of HTTP throttling
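Batching and exponential backoff can be sketched in a few lines of plain Java. This is not the sample's EmbeddingService: all names below are hypothetical, the batch size and delays are arbitrary, and the sketch retries on any RuntimeException, whereas a real client would likely retry only on throttling responses such as HTTP 429.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch of request batching plus retry with exponential backoff. */
final class EmbeddingBatchSketch {

    /** Groups items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(new ArrayList<>(items.subList(i, Math.min(i + batchSize, items.size()))));
        }
        return batches;
    }

    /** Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMillis. */
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis << Math.min(attempt, 20); // cap the shift to keep the value bounded
        return Math.min(delay, maxMillis);
    }

    /** Runs call, retrying up to maxRetries times and sleeping longer after each failure. */
    static <T> T withRetry(Supplier<T> call, int maxRetries, long baseMillis, long maxMillis) {
        for (int attempt = 0; ; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                if (attempt >= maxRetries) {
                    throw e; // retries exhausted: surface the last failure
                }
                sleep(backoffMillis(attempt, baseMillis, maxMillis));
            }
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while backing off", ie);
        }
    }

    public static void main(String[] args) {
        List<String> chunks = List.of("chunk-1", "chunk-2", "chunk-3", "chunk-4", "chunk-5");
        for (List<String> batch : toBatches(chunks, 2)) {
            // A real implementation would send one embedding request per batch here.
            int size = withRetry(batch::size, 3, 100, 5_000);
            System.out.println("embedded batch of " + size);
        }
    }
}
```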
Slide 29
Custom Data Ingestion/Indexing
• Custom PDFs in the data folder
• Indexing is not a delta (incremental) process
• A Java process runs locally when the App Service deployment is selected.
• An indexer microservice runs on the container orchestrator when ACA or AKS is selected.
Slide 30
Learn More
• Intro to intelligent apps, grounding, vector databases
• Prompt engineering
• Grounding LLMs
• Chunking large documents
• Semantic Kernel
• Hybrid Retrieval in Azure AI Search