…in Spring applications
• High level: hides the complexity of accessing the various LLMs
• Providers: Anthropic, OpenAI, Microsoft, Google, Mistral, Ollama, …
• Main features:
  • Prompting, observability, structured output, images, voice, evaluation, moderation, function calling, chatbots, RAG, vector databases, MCP, agents

Spring AI
@fbeaufume
…with Spring AI
• Prompting
• Document processing
• Measuring token consumption
• Structured output
• Call logging
• Image description
• Moderation
• Chatbot
• Function calling
• MCP client and server

Table of contents
…many other parameters

Spring configuration

# Configuration for Ollama
#spring.ai.ollama.chat.options.model=llama3.2:3b
spring.ai.ollama.chat.options.model=llama3.1:8b
#spring.ai.ollama.chat.options.model=mistral:7b
#spring.ai.ollama.chat.options.model=gemma3:4b
#spring.ai.ollama.chat.options.model=granite3.3:2b
#spring.ai.ollama.chat.options.model=deepseek-r1:8b

# Configuration for OpenRouter
spring.ai.openai.api-key=YOUR_API_KEY
spring.ai.openai.base-url=https://openrouter.ai/api
spring.ai.openai.chat.options.model=meta-llama/llama-3.3-70b-instruct:free
#spring.ai.openai.chat.options.model=openai/gpt-oss-20b:free
#spring.ai.openai.chat.options.model=mistralai/mistral-small-3.2-24b-instruct:free
Created, for example, in a @Configuration class:
• ChatClient.Builder is injected by Spring and wraps the LLM
• Temperature ranges from 0 (most deterministic) to 1 (most creative)
• Advisors are interceptors

ChatClient

@Bean
ChatClient chatClient(ChatClient.Builder builder) {
    return builder
        .defaultSystem("You are a polite and helpful assistant.")
        .defaultOptions(ChatOptions.builder().temperature(0.6).build())
        .defaultAdvisors(new SimpleLoggerAdvisor())
        .build();
}
…templates such as Tell me a joke about {subject}

Prompting

@Service
public class BusinessService {

    @Autowired
    private ChatClient chatClient;

    public String tellJoke() {
        return chatClient.prompt("Tell me a programming joke").call().content();
    }
}

Why do programmers prefer dark mode? Because light attracts bugs!
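Spring AI resolves {subject}-style placeholders through its prompt-template support. As a toy illustration of the substitution idea itself, independent of Spring AI, here is a minimal plain-Java sketch (the `render` helper and the `PromptTemplateSketch` class are hypothetical names, not part of any library):

```java
import java.util.Map;

public class PromptTemplateSketch {

    // Minimal sketch of rendering a {placeholder} template.
    // Spring AI's own template engine also handles escaping and
    // validation; this toy version only does plain substitution.
    static String render(String template, Map<String, String> params) {
        String result = template;
        for (Map.Entry<String, String> e : params.entrySet()) {
            result = result.replace("{" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String prompt = render("Tell me a joke about {subject}",
                Map.of("subject", "databases"));
        System.out.println(prompt); // Tell me a joke about databases
    }
}
```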
…the prompt and its response:
• To generate a joke:
• 15 + 27 = 42 tokens, i.e. about $0.000007 at $0.1 (in) and $0.2 (out) per MTok

Token consumption

public String askQuestion(String question) {
    ChatResponse chatResponse = chatClient.prompt(question).call().chatResponse();
    Usage usage = chatResponse.getMetadata().getUsage();
    LOGGER.info("Used {} tokens ({} for prompt + {} for response generation)",
        usage.getTotalTokens(), usage.getPromptTokens(), usage.getCompletionTokens());
    return chatResponse.getResult().getOutput().getText();
}
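The cost arithmetic above can be checked with a few lines of plain Java (the `TokenCost` class is a hypothetical helper, not part of Spring AI; the rates are the slide's example prices per million tokens):

```java
public class TokenCost {

    // Cost of a call given per-million-token prices for input and output.
    static double cost(long promptTokens, long completionTokens,
                       double inPerMTok, double outPerMTok) {
        return promptTokens * inPerMTok / 1_000_000.0
             + completionTokens * outPerMTok / 1_000_000.0;
    }

    public static void main(String[] args) {
        // The joke above: 15 prompt tokens + 27 completion tokens
        // at $0.1 (in) and $0.2 (out) per MTok.
        System.out.printf("$%.7f%n", cost(15, 27, 0.1, 0.2)); // prints $0.0000069
    }
}
```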
Structured output

public record Countries(List<Country> countries) { }
public record Country(String name, int population) { }

public Countries getMostPopulatedCountries() {
    String request = """
        List the 3 most populated countries.
        For each country, provide the name and the population in millions.
        """;
    return chatClient.prompt(request).call().entity(Countries.class);
}

Countries[countries=[
    Country[name=China, population=1439],
    Country[name=India, population=1380],
    Country[name=United States, population=332]]]
…For each country, provide the name and the population in millions.
Your response should be in JSON format.
Do not include any explanations, only provide a RFC8259 compliant JSON response following this format without deviation.
Do not include markdown code blocks in your response.
Remove the ```json markdown from the output.
Here is the JSON Schema instance your output must adhere to:
```{
  "$schema" : "https://json-schema.org/draft/2020-12/schema",
  "type" : "object",
  "properties" : {
    "countries" : {
      "type" : "array",
      "items" : {
        "type" : "object",
        "properties" : {
          "name" : { "type" : "string" },
          "population" : { "type" : "integer" }
        },
        "additionalProperties" : false
      }
    }
  },
  "additionalProperties" : false
}```
    Media media = new Media(
        MimeTypeUtils.IMAGE_JPEG,
        new ClassPathResource(path)); // E.g. "photo.jpg"
    return chatClient.prompt()
        .user(u -> u.text("Where is the mouse?").media(media))
        .call().content();
}

If you look closely at the image, you'll notice a cat riding on the back of a dog. On the cat's back, there is a small mouse (or possibly a rat), near the cat's shoulder blades. So, the mouse is riding on the cat, which is riding on the dog!
…hate, harassment, crime, health, self-harm, sexuality, etc.
• Dedicated moderation LLMs exist
• Supported by Spring AI for OpenAI and Mistral

Moderation

[Diagram: prompt → input moderation → LLM → output moderation → response]

ModerationOptions options = ...;
ModerationPrompt prompt = new ModerationPrompt("Some text", options);
ModerationModel model = ...;
ModerationResult result = model.call(prompt).getResult().getOutput().getResults().get(0);
boolean isHarassment = result.getCategories().isHarassment(); // isHate, isSelfHarm, etc.
double harassmentScore = result.getCategoryScores().getHarassment(); // 0.0 to 1.0
…the corresponding services:

Function calling - Declarations

@Service
public class WeatherService {

    @Tool(description = "Return the current weather report for a given city including the condition and temperature in Celsius.")
    public WeatherReport getCurrentWeather(
            @ToolParam(description = "The name of the city") String city) { ... }
}

@Bean
ChatClient chatClient(..., WeatherService weatherService) {
    return builder
        ...
        .defaultTools(weatherService)
        .build();
}
"messages":[
  {"role":"system","content":"You are a polite, friendly and helpful assistant."},
  {"role":"user","content":"What's the weather in Paris ?"}
],
"tools":[
  {"function":{
    "name":"getCurrentWeather",
    "description":"Return the current weather report for (...)",
    "parameters":{
      "properties":{"city":{"type":"string","description":"The name of the city"}},
      "required":["city"]
    }
  }}
],
}

200 OK
{
  "message":{"role":"assistant","content":"","tool_calls":[
    {"function":{"index":0,"name":"getCurrentWeather","arguments":{"city":"Paris"}}}]},
}
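Spring AI runs this round-trip for us: it reads the tool_calls entry in the model's response, invokes the matching @Tool method with the JSON arguments, and sends the result back to the model in a follow-up message. A toy plain-Java sketch of that name-to-handler dispatch (not Spring AI's actual implementation; the weather value is stubbed):

```java
import java.util.Map;
import java.util.function.Function;

public class ToolDispatchSketch {

    // Registry mapping a tool name, as declared to the model, to a handler.
    // The handler here is a stub standing in for WeatherService.getCurrentWeather.
    static final Map<String, Function<Map<String, String>, String>> TOOLS = Map.of(
        "getCurrentWeather", args -> "Sunny, 25 C in " + args.get("city")
    );

    // Look up the function named in the model's tool_calls entry
    // and invoke it with the parsed arguments.
    static String dispatch(String name, Map<String, String> arguments) {
        Function<Map<String, String>, String> tool = TOOLS.get(name);
        if (tool == null) {
            throw new IllegalArgumentException("Unknown tool: " + name);
        }
        return tool.apply(arguments);
    }

    public static void main(String[] args) {
        // Simulates the tool_calls entry from the response above.
        System.out.println(dispatch("getCurrentWeather", Map.of("city", "Paris")));
        // prints Sunny, 25 C in Paris
    }
}
```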
…Server

pom.xml:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-mcp-server-webmvc</artifactId>
</dependency>

application.properties:

spring.ai.mcp.server.name=weather-server
spring.ai.mcp.server.protocol=STREAMABLE

@Service
public class WeatherService {

    @McpTool(description = "Return the current weather report for a given city including the condition and temperature in Celsius.")
    public WeatherReport getCurrentWeather(
            @McpToolParam(description = "The name of the city") String city) { ... }
}