Modular RAG Architectures with Java and Spring AI

Thomas Vitale (@thomasvitale.com)
Spring I/O, 23rd May 2025

Thomas Vitale
Systematic

Architecture
Spring Boot Application (Vaadin, Spring AI, Arconia)
Application -> Inference Service: consumes LLMs
Application -> Database: reads/writes data
Application -> Observability Platform: exports telemetry

Arconia (arconia.io)
Dev Services and OpenTelemetry
Run with: arconia dev, gradle bootRun, or mvn spring-boot:run

Chatbot
“Legacy software companies adding an AI chatbot to their product” (@andykreed)

Sequential RAG

Prompt Stuffing: Augmenting Prompts with Context
Application -> Inference Service: Request (Question + Context)
Inference Service -> Application: Response (Answer)
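
The request on this slide can be sketched in plain Java. This is a minimal illustration, not code from the talk: the class name `PromptStuffing`, the method `augment`, and the prompt template wording are all assumptions, and no Spring AI types are used.

```java
import java.util.List;

/** Prompt stuffing: the context is placed directly into the prompt sent to the model. */
class PromptStuffing {

    /** Builds one prompt from a question and context passages (hypothetical template). */
    static String augment(String question, List<String> context) {
        StringBuilder prompt = new StringBuilder();
        prompt.append("Answer the question using only the context below.\n\n");
        prompt.append("Context:\n");
        for (String passage : context) {
            prompt.append("- ").append(passage).append('\n');
        }
        prompt.append("\nQuestion: ").append(question).append('\n');
        return prompt.toString();
    }

    public static void main(String[] args) {
        // The resulting string is what the application would send to the inference service.
        String prompt = augment(
                "Where was this talk given?",
                List.of("The talk was given at Spring I/O on 23rd May 2025."));
        System.out.println(prompt);
    }
}
```

The whole context travels inside every request, so this only works while the context fits in the model's context window.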

Retrieval Augmented Generation: Augment with Context from Web Search Engines
Application -> Search Engine (HTTP API): Context
Application -> Inference Service: Request (Question + Context)
Inference Service -> Application: Response (Answer)

Sequential RAG (Naive)
Query -> Retrieval -> Augmented Query -> Generation -> Response
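
A minimal sketch of the naive sequential flow, assuming the retriever and the inference service can each be modelled as a plain function. `NaiveRag`, its fields, and the prompt layout are hypothetical names chosen for illustration and are not Spring AI APIs.

```java
import java.util.List;
import java.util.function.Function;

/** Naive sequential RAG: retrieve, augment the query, generate. */
class NaiveRag {

    private final Function<String, List<String>> retriever; // query -> documents
    private final Function<String, String> generator;       // stand-in for the inference service

    NaiveRag(Function<String, List<String>> retriever, Function<String, String> generator) {
        this.retriever = retriever;
        this.generator = generator;
    }

    String answer(String query) {
        // Retrieval
        List<String> documents = retriever.apply(query);
        // Augmented query
        String augmentedQuery = "Context:\n" + String.join("\n", documents)
                + "\n\nQuestion: " + query;
        // Generation
        return generator.apply(augmentedQuery);
    }

    public static void main(String[] args) {
        NaiveRag rag = new NaiveRag(
                query -> List.of("Spring AI is a Java framework for building AI applications."),
                prompt -> "[model output for a prompt of " + prompt.length() + " characters]");
        System.out.println(rag.answer("What is Spring AI?"));
    }
}
```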

Ingestion Pipeline (Data Preparation)
Document Reader (reads from a source) -> Document Transformer -> Document Writer (writes to a store)
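
The three stages of the ingestion pipeline can be sketched with standard `java.util.function` types: a `Supplier` as the reader, a `Function` as the transformer, and a `Consumer` as the writer. The names `IngestionPipeline`, `chunker`, and `run` are assumptions, documents are simplified to strings, and the fixed-size chunking is only one possible transformation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

/** Ingestion pipeline: reader -> transformer -> writer. */
class IngestionPipeline {

    /** Transformer that splits each document into chunks of at most maxLength characters. */
    static Function<List<String>, List<String>> chunker(int maxLength) {
        return documents -> {
            List<String> chunks = new ArrayList<>();
            for (String document : documents) {
                for (int start = 0; start < document.length(); start += maxLength) {
                    int end = Math.min(document.length(), start + maxLength);
                    chunks.add(document.substring(start, end));
                }
            }
            return chunks;
        };
    }

    /** Reads documents, transforms them, and hands the result to the writer. */
    static void run(Supplier<List<String>> reader,
                    Function<List<String>, List<String>> transformer,
                    Consumer<List<String>> writer) {
        writer.accept(transformer.apply(reader.get()));
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>(); // in-memory stand-in for a vector store
        run(() -> List.of("abcdefghij", "xyz"), chunker(4), store::addAll);
        System.out.println(store); // prints [abcd, efgh, ij, xyz]
    }
}
```

A real writer would typically compute embeddings and persist them, which is outside this sketch.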

Retrieval Augmented Generation: Augment with Context from Vector Stores
Vector Store: Semantic Search

Sequential RAG (Advanced)
Query -> Pre-Retrieval -> Retrieval -> Generation -> Response

Sequential RAG (Advanced)
Query -> Pre-Retrieval -> Retrieval -> Post-Retrieval -> Generation -> Response
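
One way to picture the advanced sequential flow is as a chain of pluggable stages. The sketch below is an assumption-laden illustration: `AdvancedRag`, `answer`, and `distinctTopK` are hypothetical names, the pre-retrieval step is any query rewrite, and the post-retrieval step shown (deduplicate, keep the first K) is just one example of document refinement.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

/** Advanced sequential RAG: pre-retrieval -> retrieval -> post-retrieval -> generation. */
class AdvancedRag {

    static String answer(String query,
                         UnaryOperator<String> preRetrieval,
                         Function<String, List<String>> retrieval,
                         UnaryOperator<List<String>> postRetrieval,
                         Function<String, String> generation) {
        // Pre-retrieval: transform the query before searching
        String transformedQuery = preRetrieval.apply(query);
        // Retrieval, followed by post-retrieval refinement of the documents
        List<String> documents = postRetrieval.apply(retrieval.apply(transformedQuery));
        // Generation: the original query is answered with the refined context
        String prompt = "Context:\n" + String.join("\n", documents)
                + "\n\nQuestion: " + query;
        return generation.apply(prompt);
    }

    /** Example post-retrieval step: drop duplicates and keep at most topK documents. */
    static UnaryOperator<List<String>> distinctTopK(int topK) {
        return documents -> documents.stream()
                .distinct()
                .limit(topK)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String response = answer(
                "  what is modular rag?  ",
                String::trim,
                q -> List.of("Doc A", "Doc A", "Doc B", "Doc C"),
                distinctTopK(2),
                prompt -> "[model output for: " + prompt.replace('\n', ' ') + "]");
        System.out.println(response);
    }
}
```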

Branching RAG

Branching RAG
Query -> Pre-Retrieval (Expand) -> three Retrieval branches -> Join -> Post-Retrieval -> Generation -> Response
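
The expand and join steps can be sketched as follows. This assumes the branches run concurrently and that joining means merging results in branch order while dropping duplicates; both are illustrative choices, and `BranchingRag` and `retrieve` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Branching RAG: expand one query into several, retrieve per branch, join the results. */
class BranchingRag {

    static List<String> retrieve(String query,
                                 Function<String, List<String>> expander,
                                 Function<String, List<String>> retriever) {
        // Expand: one query becomes several; each expanded query starts its own retrieval
        List<CompletableFuture<List<String>>> branches = expander.apply(query).stream()
                .map(expanded -> CompletableFuture.supplyAsync(() -> retriever.apply(expanded)))
                .collect(Collectors.toList());

        // Join: wait for every branch, merge in branch order, drop duplicate documents
        Set<String> joined = new LinkedHashSet<>();
        for (CompletableFuture<List<String>> branch : branches) {
            joined.addAll(branch.join());
        }
        return new ArrayList<>(joined);
    }

    public static void main(String[] args) {
        List<String> documents = retrieve(
                "modular rag",
                q -> List.of(q, q + " java", q + " spring ai"),
                q -> List.of("overview", "details for '" + q + "'"));
        System.out.println(documents);
    }
}
```

The joined list would then pass through post-retrieval and generation exactly as in the sequential flow.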

Conditional RAG

Conditional RAG (Query Routing)
Query -> Router
Router routes to: Retrieval -> Generation -> Response
Router routes to: Retrieval -> Generation -> Response
Router routes to: Retrieval -> Generation -> Response
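
Query routing can be sketched as a lookup from a route name to a complete retrieval and generation flow. The keyword-based router below is a deliberately simple stand-in (a router could just as well ask a model to classify the query); `ConditionalRag`, the route names, and the fallback are assumptions made for illustration.

```java
import java.util.Map;
import java.util.function.Function;

/** Conditional RAG: a router picks which retrieval + generation flow handles the query. */
class ConditionalRag {

    static String answer(String query,
                         Function<String, String> router,
                         Map<String, Function<String, String>> routes,
                         Function<String, String> fallback) {
        // The router maps the query to a route name
        String routeName = router.apply(query);
        // The selected route runs its own retrieval and generation; unknown names use the fallback
        return routes.getOrDefault(routeName, fallback).apply(query);
    }

    public static void main(String[] args) {
        Function<String, String> router =
                q -> q.toLowerCase().contains("invoice") ? "billing" : "docs";
        Map<String, Function<String, String>> routes = Map.of(
                "billing", q -> "[answer grounded in billing data] " + q,
                "docs", q -> "[answer grounded in product docs] " + q);
        System.out.println(answer("Where is my invoice?", router, routes, q -> "[no route] " + q));
    }
}
```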

Tools

Augment with Context
Prompt Augmentation with Retrieved Context (Source, Query)

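Tools: Tool Calling
Application -> Inference Service: Request (Question)
Inference Service -> Application: Tool Call Request
Application -> API: Tool Call (Tool Execution)
Application -> Inference Service: Tool Call Response
Inference Service -> Application: Response (Answer)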
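
The tool calling exchange can be sketched as a loop in which the application, not the model, executes tools. Everything here is a simplified assumption: the model is a plain function over the message history, messages are strings, and `ToolCalling`, `ModelReply`, and `maxToolCalls` are hypothetical names rather than Spring AI types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Tool calling loop: the model requests tools, the application executes them. */
class ToolCalling {

    /** Model output: either a final answer or a request to call a named tool. */
    static final class ModelReply {
        final String answer;
        final String toolName;
        final String toolInput;

        private ModelReply(String answer, String toolName, String toolInput) {
            this.answer = answer;
            this.toolName = toolName;
            this.toolInput = toolInput;
        }

        static ModelReply answer(String text) { return new ModelReply(text, null, null); }
        static ModelReply toolCall(String name, String input) { return new ModelReply(null, name, input); }
        boolean isToolCall() { return toolName != null; }
    }

    static String chat(String question,
                       Function<List<String>, ModelReply> model,
                       Map<String, Function<String, String>> tools,
                       int maxToolCalls) {
        List<String> messages = new ArrayList<>();
        messages.add("user: " + question);
        int toolCalls = 0;
        while (true) {
            ModelReply reply = model.apply(messages);           // request to the inference service
            if (!reply.isToolCall()) {
                return reply.answer;                            // final response
            }
            if (toolCalls++ >= maxToolCalls) {
                throw new IllegalStateException("Too many tool calls");
            }
            Function<String, String> tool = tools.get(reply.toolName);
            String result = (tool == null)
                    ? "unknown tool: " + reply.toolName
                    : tool.apply(reply.toolInput);              // tool execution in the application
            messages.add("tool " + reply.toolName + ": " + result); // tool call response
        }
    }

    public static void main(String[] args) {
        // Scripted stand-in for a model: first asks for a tool, then answers from its result.
        Function<List<String>, ModelReply> model = messages ->
                messages.size() == 1
                        ? ModelReply.toolCall("clock", "")
                        : ModelReply.answer("Based on: " + messages.get(messages.size() - 1));
        Map<String, Function<String, String>> tools = Map.of("clock", input -> "12:00");
        System.out.println(chat("What time is it?", model, tools, 3));
    }
}
```

Capping the number of tool calls keeps a misbehaving model from looping indefinitely.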

Agentic RAG

Agentic RAG
Query -> Orchestration (Agent) -> Response
Agent uses LLM
Agent uses Tool 1: Retrieval -> Generation
Agent uses Tool 2: Retrieval -> Generation

1979 IBM

Modular RAG Architectures with Java and Spring AI
https://github.com/ThomasVitale/modular-rag
https://github.com/ThomasVitale/llm-apps-java-spring-ai
Thomas Vitale (@thomasvitale.com)
thomasvitale.com
