
1/16/25 - Central Iowa Java Users Group - Java Meets AI: Build LLM-Powered Apps with LangChain4j

Join CIJUG virtually for our first meetup of 2025! We are excited to announce Eric Deandrea from Red Hat as our guest speaker for this event, presenting "Java meets AI: Build LLM-Powered Apps with LangChain4j."

Join us for a guided tour through the possibilities of the LangChain4j framework! Chat with virtually any LLM provider? Generate AI images straight from your Java application with Dall-E and Gemini? Have LLMs return POJOs? Interact with local models on your machine? LangChain4j makes it a piece of cake! We will explain the fundamental building blocks of LLM-powered applications and show you how to chain them together into AI Services.

Speaker bio: Eric Deandrea is a Java Champion & Senior Principal Developer Advocate at Red Hat, focusing on application development technologies. Eric has over 25 years of experience designing and building Java-based solutions and developer training programs. He is a contributor to various OSS projects, including Quarkus, Spring, LangChain4j, WireMock, and Microcks, as well as a speaker at many public events and user groups around the world.

Thursday, January 16th
5:30pm-7pm CST
https://www.meetup.com/central-iowa-java-users-group/events/305348139/

Eric Deandrea

January 16, 2025

Transcript

  1. @edeandrea 2 • Java Champion • 25+ years software development

    experience • ~11 years DevOps Architect • Contributor to Open Source projects Quarkus Spring Boot, Spring Framework, Spring Security LangChain4j (& Quarkus LangChain4j) WireMock Microcks • Boston Java Users ACM Chapter Board Member • Published Author About Me
  2. @edeandrea • Showcase & explain Quarkus, how it enables modern

    Java development & the Kubernetes-native experience • Introduce familiar Spring concepts, constructs, & conventions and how they map to Quarkus • Equivalent code examples between Quarkus and Spring as well as emphasis on testing patterns & practices 4 https://red.ht/quarkus-spring-devs
  3. @edeandrea What are we going to see? How to build

    AI-Infused applications in Java - Some examples - Main concepts - Chat Models - AI Services - Memory management - RAG - Function calling - Guardrails - Image models - The almost-all-in-one demo - Plain LangChain4j & Quarkus - Remote model (OpenAI) & Local models (Ollama, Podman AI Studio) Example Code Slides https://github.com/cescoffier/langchain4j-deep-dive https://speakerdeck.com/edeandrea/25-central-iowa-java-users-group-java-meets-ai-build-llm-powered-apps-with-langchain4j
  4. @edeandrea What are Large Language Models (LLMs)? Neural Networks •

    Transformer based • Recognize, Predict, and Generate text • Trained on VERY large corpora of text • Deduce the statistical relationships between tokens • Can be fine-tuned An LLM predicts the next token based on its training data and statistical deduction
  5. @edeandrea The L of LLM means Large Llama 3.3: -

    70B parameters - Trained on >15T tokens - 128K token window - 43 GB on disk Granite: - 34B parameters - Trained on 3500B tokens - 3.8 GB of RAM, 4.8 GB on disk More on: An idea of the size
  6. @edeandrea Model and Model Serving Model Model Serving - Run

    the model - CPU / GPU - Expose an API - REST - gRPC - May support multiple models
  7. @edeandrea Prompt and Prompt Engineering Model Input (Prompt) Output Input:

    - Prompt (text) - Instructions to give to the model - Taming a model is hard Output: - Depends on the modality of the model
  8. @edeandrea Application Model AI-infused application |ˌeɪˌaɪ ˈɪnˌfjuːzd ˌæplɪˈkeɪʃən| noun (Plural

    AI-Infused applications) A software program enhanced with artificial intelligence capabilities, utilizing AI models to implement intelligent features and functionalities.
  9. @edeandrea Using models to build apps on top Dev Ops

    Release Deploy Operate Monitor Plan Code Build Test Train Evaluate Deploy Collect Evaluate Curate Analyze Data ML APIs
  10. @edeandrea Using models to build apps on top Dev Ops

    Release Deploy Operate Monitor Plan Code Build Test Train Evaluate Deploy Collect Evaluate Curate Analyze Data ML Need some clients and toolkits
  11. @edeandrea LangChain4j https://github.com/langchain4j/langchain4j • Toolkit to build AI-Infused Java applications

    ◦ Provides integration with many LLM/SML providers ◦ Provides building blocks for the most common patterns (RAG, Function calling…) ◦ Abstractions to manipulate prompts, messages, memory, tokens… ◦ Integrate a large variety of vector stores and document loaders
  12. @edeandrea LangChain4j https://github.com/langchain4j/langchain4j AI Service Loaders Splitters Vector Store Embedding

    Models Language Models Image Models Prompt Function calling Memory Output Parsers Building blocks RAG
  13. @edeandrea Quarkus LangChain4j https://docs.quarkiverse.io/quarkus-langchain4j LangChain4j Quarkus LangChain4j Application LLMs Vector

    stores Embedding Models - Declarative clients - CDI integration - Observability (Otel, Prometheus) - Auditing - Resilience - RAG building blocks - Tool support - Mockable
  14. @edeandrea Bootstrapping LangChain4j <dependency> <groupId>dev.langchain4j</ groupId> <artifactId>langchain4j</ artifactId> </dependency> <dependency>

    <groupId>dev.langchain4j</ groupId> <artifactId>langchain4j-open-ai</ artifactId> </dependency> <dependency> <groupId>io.quarkiverse.langchain4j</ groupId> <artifactId>quarkus-langchain4j-openai</ artifactId> </dependency> Quarkus LangChain4j
  15. @edeandrea Chat Models • Text to Text ◦ Text in

    -> Text out ◦ NLP • Prompt ◦ Set of instructions explaining what the model must generate ◦ Use plain English (or other language) ◦ There are advanced prompting techniques ▪ Prompt depends on the model ▪ Prompt engineering is an art ChatLanguageModel modelA = OpenAiChatModel.builder() .apiKey(System.getenv("...")).build(); String answerA = modelA.generate("Say Hello World"); @Inject ChatLanguageModel model; String answer = model.generate("Say Hello"); LangChain4j Quarkus LangChain4j - Chat Model Quarkus LangChain4j - AI Service @RegisterAiService interface PromptA { String ask(String prompt); } @Inject PromptA prompt; String answer = prompt.ask("Say Hello");
  16. @edeandrea var system = new SystemMessage( "You are Georgios, all

    your answers should be using the Java language using greek letters "); var user = new UserMessage("Say Hello World" ); var response = model.generate(system, user); // Pass a list of messages System.out.println( "Answer: " + response.content().text()); Messages Context or Memory
  17. @edeandrea Manual Memory List<ChatMessage> memory = new ArrayList<>(); memory.addAll(List.of( new

    SystemMessage( "You are a useful AI assistant." ), new UserMessage("Hello, my name is Clement." ), new UserMessage("What is my name?" ) )); var response = model.generate( memory); System.out.println( "Answer 1: " + response.content().text()); memory.add(response.content()); memory.add(new UserMessage("What's my name again?" )); response = model.generate( memory); System.out.println( "Answer 2: " + response.content().text()); var m = new UserMessage("What's my name again?" ); response = model.generate(m); // No memory System.out.println( "Answer 3: " + response.content().text());
  18. @edeandrea Messages and Memory Model Context Output Message Models are

    stateless - Pass a set of messages named context - These messages are stored in a memory - Context size is limited (eviction strategy) Context = (Stored input messages + Output messages) + New input
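
    The memory mechanics above can be sketched in plain Java. This is a simplified stand-in for LangChain4j's MessageWindowChatMemory (which additionally pins the system message): the application stores messages, resends them all on each call, and evicts the oldest when the window is full.

    ```java
    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.List;

    // Minimal sketch of a message-window memory. The model is stateless, so the
    // application stores prior messages and resends them as context; eviction
    // here simply drops the oldest message once the window is full.
    class WindowMemory {
        private final int maxMessages;
        private final Deque<String> messages = new ArrayDeque<>();

        WindowMemory(int maxMessages) { this.maxMessages = maxMessages; }

        void add(String message) {
            if (messages.size() == maxMessages) {
                messages.removeFirst(); // evict oldest
            }
            messages.addLast(message);
        }

        // The context passed to the model: all stored messages, oldest first
        List<String> context() {
            return List.copyOf(messages);
        }
    }
    ```

    Note that with maxMessages(3), as on the slide, the system message would be evicted after a few turns — one reason the real implementation treats the system message specially.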
  19. @edeandrea Chat Memory var memory = MessageWindowChatMemory .builder() .id("user-id") .maxMessages(

    3) // Only 3 messages will be stored .build(); memory.add(new SystemMessage( "You are a useful AI assistant." )); memory.add(new UserMessage("Hello, my name is Clement and I live in Valence, France" )); memory.add(new UserMessage("What is my name?" )); var response = model.generate(memory.messages()); System.out.println("Answer: " + response.content().text());
  20. @edeandrea Context Limit & Pricing Number of tokens - Depends

    on the model and model serving (provider) - Tokens are not words Context size is not in terms of messages, but in number of tokens This_talk_is_really_ boring._Hopefully,_it_will _be_over_soon. [2500, 838, 2082, 15224, 3067, 2146, 1535, 7443, 2697, 127345, 46431, 278, 3567, 492, 40729, 34788, 62, 84908, 13] https://platform.openai.com/tokenizer
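
    Since context limits and pricing are counted in tokens, it helps to budget them in code. The sketch below uses the common rule of thumb of roughly four characters per token for English text — an estimation heuristic only, not a real tokenizer; use the provider's tokenizer (such as the OpenAI tokenizer linked above) for exact counts.

    ```java
    // Rough token estimate using the ~4-characters-per-token rule of thumb for
    // English text. Only useful for budgeting context windows; real token counts
    // depend on the model's tokenizer.
    class TokenEstimator {
        static int estimateTokens(String text) {
            if (text == null || text.isEmpty()) return 0;
            return Math.max(1, (int) Math.ceil(text.length() / 4.0));
        }
    }
    ```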
  21. @edeandrea Token Usage var memory = MessageWindowChatMemory .builder() .id("user-id") .maxMessages(

    3) // Only 3 messages will be stored .build(); memory.add(new SystemMessage( "You are a useful AI assistant." )); memory.add(new UserMessage("Hello, my name is Clement and I live in Valence, France" )); memory.add(new UserMessage("What is my name?" )); var response = model.generate(memory.messages()); System.out.println("Answer 1: " + response.content().text()); System.out.println("Input token: " + response.tokenUsage().inputTokenCount()); System.out.println("Output token: " + response.tokenUsage().outputTokenCount()); System.out.println("Total token: " + response.tokenUsage().totalTokenCount());
  22. @edeandrea LangChain4j AI Services Map LLM interaction to Java interfaces

    - Declarative model - You define the API the rest of the code uses - Mapping of the output - Parameterized prompt - Abstract/Integrate some of the concepts we have seen public void run() { Assistant assistant = AiServices.create(Assistant.class, model); System.out.println( assistant.answer("Say Hello World") ); } // Represent the interaction with the LLM interface Assistant { String answer(String question); }
  23. @edeandrea LangChain4j AI Services - System Message - @SystemMessage annotation

    - Or System message provider public void run() { var assistant = AiServices .create(Assistant.class, model); System.out.println( assistant.answer("Say Hello World") ); } interface Assistant { @SystemMessage("You are a Shakespeare, all your response must be in iambic pentameter.") String answer(String question); } var rapper = AiServices.builder(Friend.class) .chatLanguageModel( model) .systemMessageProvider( chatMemoryId -> "You’re a west coast rapper, all your response must be in rhymes." ) .build();
  24. @edeandrea LangChain4j AI Services - User Message and Parameters public

    void run() { Poet poet = AiServices.create(Poet.class, model); System.out.println(poet.answer("Devoxx")); } interface Poet { @SystemMessage ("You are Shakespeare, all your response must be in iambic pentameter." ) @UserMessage("Write a poem about {{topic}}. It should not be more than 5 lines long." ) String answer(@V("topic") String topic); }
  25. @edeandrea LangChain4j AI Services - Structured Output AI Service methods

    are not limited to returning String - Primitive types - Enum - JSON Mapping TriageService triageService = … System.out.println(triageService.triage( "It was a great experience!" )); System.out.println(triageService.triage( "It was a terrible experience!" )); // … enum Sentiment { POSITIVE, NEGATIVE } record Feedback(Sentiment sentiment, String summary) {} interface TriageService { @SystemMessage("You are an AI that needs to triage user feedback." ) @UserMessage(""" Analyze the given feedback, and determine if it is positive, or negative. Then, provide a summary of the feedback: {{fb}} """) Feedback triage(@V("fb") String fb); }
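
    Under the hood, the AI Service has to turn the model's raw text into the declared return type. A minimal sketch of the enum case, assuming the model answers with the constant name possibly wrapped in quotes or whitespace (the class and method names here are illustrative, not LangChain4j API):

    ```java
    // Sketch of mapping raw model output onto a Java enum. LangChain4j
    // essentially matches the model's text against the constant names; this
    // simplified version strips non-letter noise and ignores case.
    class OutputParser {
        enum Sentiment { POSITIVE, NEGATIVE }

        static Sentiment parseSentiment(String modelOutput) {
            String cleaned = modelOutput.trim()
                    .replaceAll("[^A-Za-z]", "") // drop quotes, punctuation, whitespace
                    .toUpperCase();
            return Sentiment.valueOf(cleaned);
        }
    }
    ```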
  26. @edeandrea LangChain4j AI Services - Chat Memory - You can

    plug a ChatMemory to an AI service to automatically add and evict messages var memory = MessageWindowChatMemory .builder() .id( "user-id") .maxMessages( 3) .build(); var assistant = AiServices.builder(Assistant.class) .chatLanguageModel( model) .chatMemory( memory) .build();
  27. @edeandrea What’s the difference between these? Application Database Application Service

    CRUD application Microservice Application Model AI-Infused application
  28. @edeandrea What’s the difference between these? Application Database Application Service

    CRUD application Microservice Application Model AI-Infused application Integration Points
  29. @edeandrea What’s the difference between these? Application Database Application Service

    CRUD application Microservice Application Model AI-Infused application Integration Points Observability (metrics, tracing, auditing) Fault-Tolerance (timeout, circuit-breaker, non-blocking, fallbacks…)
  30. @edeandrea Quarkus AI Services Application Component AI Service - Define

    the API (Interface) - Configure the prompt for each method - Configure the tools, memory… Chat Model Tools Memory Retriever Audit Moderation Model (RAG) (Observability) (Agent) Inject and invoke (Manage the context using CDI scopes)
  31. @edeandrea Quarkus AI Services Map LLM interaction to Java interfaces

    - Based on LangChain4j AI Service - Made CDI aware - Injectable - Scope - Dev UI, Templating… - Metrics, Audit, Tracing… @Inject Assistant assistant; @ActivateRequestContext public int run() { println(assistant.answer("My name is Clement, can you say \"Hello World\" in Greek?")); println(assistant.answer( "What's my name?")); return 0; } @RegisterAiService interface Assistant { String answer(String question); } Injectable bean, Request scope by default
  32. @edeandrea Quarkus AI Services - Scopes and memory Request scope

    by default - Overridable - Keep messages for the duration of the scope - Request - the request only - Application - the lifetime of the application - Because it’s risky, you need a memory id - Session - the lifetime of the websocket session @RegisterAiService @RequestScoped interface ShortMemoryAssistant { String answer(String question); } @RegisterAiService @ApplicationScoped interface LongMemoryAssistant { String answer(@MemoryId int id, @UserMessage String question); } @RegisterAiService @SessionScoped interface ConversationalMemoryAssistant { String answer(String question); }
  33. @edeandrea Quarkus AI Services - Custom Memory Memory Provider -

    You can implement a custom memory provider - Can implement persistence - Conversation represented by MemoryId - For session - it’s the WS session ID. @ApplicationScoped public class MyMemoryStore implements ChatMemoryStore { public List<ChatMessage> getMessages( Object memoryId) { // … } public void updateMessages(Object memoryId, List<ChatMessage> messages) // … } public void deleteMessages( Object memoryId){ // … } }
  34. @edeandrea Quarkus AI Services - Parameter and Structured Output Prompt

    can be parameterized - Use Qute template engine - Can contain logic Structured output - Based on Jackson @UserMessage(""" What are the {number}th last teams in which {player} played? Only return the team names. """) List<String> ask(int number, String player); @UserMessage(""" What are the last team in which {question.player} played? Return the team and the last season. """) Entry ask(Question question); record Question(String player) {} record Entry(String team, String years) {} Single {}
  35. @edeandrea Quarkus AI Services - Complex templating @SystemMessage(""" Given the

    following conversation and a follow-up question, rephrase the follow-up question to be a standalone question. Context: {#for m in chatMessages} {#if m.type.name() == "USER"} User: {m.text()} {/if} {#if m.type.name() == "AI"} Assistant: {m.text()} {/if} {/for} """) String rephrase(List<ChatMessage> chatMessages, @UserMessage String question);
  36. @edeandrea Quarkus AI Services Application Component AI Service Quarkus Extended

    with Quarkus capabilities (REST client, Metrics, Tracing…)
  37. @edeandrea Quarkus AI Services - Observability Collect metrics - Exposed

    as Prometheus OpenTelemetry Tracing - Trace interactions with the LLM <dependency> <groupId>io.quarkus</groupId> <artifactId> quarkus-opentelemetry </artifactId> </dependency> <dependency> <groupId> io.quarkiverse.micrometer.registry </groupId> <artifactId> quarkus-micrometer-registry-otlp </artifactId> </dependency>
  38. @edeandrea Quarkus AI Services - Auditing Audit Service - Allow

    keeping track of interactions with the LLM - Can be persisted - Implemented by the application code @Override public void initialMessages( Optional<SystemMessage> systemMessage, UserMessage userMessage ) { } @Override public void addLLMToApplicationMessage ( Response<AiMessage> response) {} @Override public void onFailure(Exception e) {} @Override public void onCompletion(Object result) {} Deprecated - to be re-written!!
  39. @edeandrea Quarkus AI Services - Fault Tolerance Retry / Timeout

    / Fallback / Circuit Breaker / Rate Limiting… - Protect against errors - Graceful recovery There are other resilience patterns (guardrails) @UserMessage("…") @Retry(maxRetries = 2) @Timeout(value = 1, unit = MINUTES) @Fallback(fallbackMethod = "fallback") Entry ask(Question question); default Entry fallback(Question question) { return new Entry("Unknown", "Unknown"); } <dependency> <groupId>io.quarkus</groupId> <artifactId> quarkus-smallrye-fault-tolerance </artifactId> </dependency>
  40. @edeandrea Retrieval Augmented Generation (RAG) Enhance LLM knowledge by providing

    relevant information in real-time from other sources – Dynamic data that changes frequently Fine-tuning is expensive! 2 stages Indexing / Ingestion Retrieval / Augmentation
  41. @edeandrea Indexing / Ingestion What do I need to think

    about? What is the representation of the data? How do I want to split? Per document? Chapter? Sentence? How many tokens do I want to end up with?
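
    The splitting question above can be illustrated with a simplified splitter. Real splitters such as DocumentSplitters.recursive count tokens and try to respect paragraph and sentence boundaries; this sketch splits on raw characters just to show how the overlap window slides.

    ```java
    import java.util.ArrayList;
    import java.util.List;

    // Sketch of segment splitting with overlap: each segment starts
    // (segmentSize - overlap) characters after the previous one, so consecutive
    // segments share `overlap` characters of context.
    class OverlapSplitter {
        static List<String> split(String text, int segmentSize, int overlap) {
            if (overlap >= segmentSize) {
                throw new IllegalArgumentException("overlap must be smaller than segmentSize");
            }
            List<String> segments = new ArrayList<>();
            int step = segmentSize - overlap; // how far the window advances each time
            for (int start = 0; start < text.length(); start += step) {
                int end = Math.min(start + segmentSize, text.length());
                segments.add(text.substring(start, end));
                if (end == text.length()) break;
            }
            return segments;
        }
    }
    ```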
  42. @edeandrea Indexing / Ingestion Compute an embedding (numerical vector) representing

    semantic meaning of each segment. Requires an embedding model In-process/Onnx, Amazon Bedrock, Azure OpenAI, Cohere, DashScope, Google Vertex AI, Hugging Face, Jina, Jlama, LocalAI, Mistral, Nomic, Ollama, OpenAI, OVHcloud, Voyage AI, Cloudflare Workers AI, Zhipu AI
  43. @edeandrea Store embedding alone or together with segment. Requires a

    vector store In-memory, Chroma, Elasticsearch, Milvus, Neo4j, OpenSearch, Pinecone, PGVector, Redis, Vespa, Weaviate, Qdrant Indexing / Ingestion
  44. @edeandrea Indexing / Ingestion var ingestor = EmbeddingStoreIngestor.builder() .embeddingModel(embeddingModel) .embeddingStore(embeddingStore)

    // Add userId metadata entry to each Document to be able to filter by it later .documentTransformer(document -> { document.metadata().put("userId", "12345"); return document; }) // Split each Document into TextSegments of 1000 tokens each with a 200-token overlap .documentSplitter(DocumentSplitters.recursive(1000, 200)) // Add the name of the Document to each TextSegment to improve the quality of search .textSegmentTransformer(textSegment -> TextSegment.from( textSegment.metadata().getString("file_name") + "\n" + textSegment.text(), textSegment.metadata() ) ) .build(); // Get the path of where the documents are and load them recursively Path path = Path.of(...); List<Document> documents = FileSystemDocumentLoader.loadDocumentsRecursively(path); // Ingest the documents into the embedding store ingestor.ingest(documents);
  45. @edeandrea Retrieval / Augmentation Compute an embedding (numerical vector) representing

    semantic meaning of the query. Requires an embedding model.
  46. @edeandrea Retrieval / Augmentation Retrieve & rank relevant content based

    on cosine similarity or other similarity/distance measures.
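
    Cosine similarity itself is a short computation over the two embedding vectors — 1.0 means the query and segment point in the same semantic direction, 0.0 means they are unrelated:

    ```java
    // Cosine similarity between two embedding vectors, as used in the retrieval
    // step to rank stored segments against the query embedding.
    class Similarity {
        static double cosine(double[] a, double[] b) {
            double dot = 0, normA = 0, normB = 0;
            for (int i = 0; i < a.length; i++) {
                dot += a[i] * b[i];
                normA += a[i] * a[i];
                normB += b[i] * b[i];
            }
            return dot / (Math.sqrt(normA) * Math.sqrt(normB));
        }
    }
    ```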
  47. @edeandrea Retrieval / Augmentation Augment input to the LLM with

    related content. What do I need to think about? Will I exceed the max number of tokens? How much chat memory is available?
  48. @edeandrea Retrieval / Augmentation public class RagRetriever { @Produces @ApplicationScoped

    public RetrievalAugmentor create(EmbeddingStore store, EmbeddingModel model) { var contentRetriever = EmbeddingStoreContentRetriever. builder() .embeddingModel(model) .embeddingStore(store) .maxResults( 3) .minScore( 0.75) .filter( metadataKey("userId").isEqualTo("12345")) .build(); return DefaultRetrievalAugmentor. builder() .contentRetriever(contentRetriever) .build(); } }
  49. @edeandrea public class RagRetriever { @Produces @ApplicationScoped public RetrievalAugmentor create(EmbeddingStore

    store, EmbeddingModel model) { var embeddingStoreRetriever = EmbeddingStoreContentRetriever.builder() .embeddingModel(model) .embeddingStore(store) .maxResults(3) .minScore(0.75) .filter(metadataKey("userId").isEqualTo("12345")) .build(); var googleSearchEngine = GoogleCustomWebSearchEngine.builder() .apiKey(System.getenv("GOOGLE_API_KEY")) .csi(System.getenv("GOOGLE_SEARCH_ENGINE_ID")) .build(); var webSearchRetriever = WebSearchContentRetriever.builder() .webSearchEngine(googleSearchEngine) .maxResults(3) .build(); return DefaultRetrievalAugmentor.builder() .queryRouter(new DefaultQueryRouter(embeddingStoreRetriever, webSearchRetriever)) .build(); } } Advanced RAG https://github.com/cescoffier/langchain4j-deep-dive/blob/main/4-rag/src/main/java/dev/langchain4j/quarkus/deepdive/RagRetriever.java
  50. @edeandrea public class RagRetriever { @Produces @ApplicationScoped public RetrievalAugmentor create(EmbeddingStore

    store, EmbeddingModel model, ChatLanguageModel chatModel) { var embeddingStoreRetriever = ... var webSearchRetriever = ... var queryRouter = LanguageModelQueryRouter.builder() .chatLanguageModel(chatModel) .fallbackStrategy(FallbackStrategy.ROUTE_TO_ALL) .retrieverToDescription( Map.of( embeddingStoreRetriever, "Local Documents", webSearchRetriever, "Web Search" ) ) .build(); return DefaultRetrievalAugmentor.builder() .queryRouter(queryRouter) .build(); } } Advanced RAG https://github.com/cescoffier/langchain4j-deep-dive/blob/main/4-rag/src/main/java/dev/langchain4j/quarkus/deepdive/RagRetriever.java
  51. @edeandrea application.properties quarkus.langchain4j.easy-rag.path=path/to/files quarkus.langchain4j.easy-rag.max-segment-size=1000 quarkus.langchain4j.easy-rag.max-overlap-size=200 quarkus.langchain4j.easy-rag.max-results=3 quarkus.langchain4j.easy-rag.ingestion-strategy=on|off quarkus.langchain4j.easy-rag.reuse-embeddings=true|false pom.xml <dependency>

    <groupId>io.quarkiverse.langchain4j</groupId> <artifactId>quarkus-langchain4j-easy-rag</artifactId> <version>${quarkus-langchain4j.version}</version> </dependency> <!-- Need an extension providing an embedding model --> <dependency> <groupId>io.quarkiverse.langchain4j</groupId> <artifactId>quarkus-langchain4j-openai</artifactId> <version>${quarkus-langchain4j.version}</version> </dependency> <!-- Also need an extension providing a vector store --> <!-- Otherwise an in-memory store is provided automatically --> <dependency> <groupId>io.quarkiverse.langchain4j</groupId> <artifactId>quarkus-langchain4j-pgvector</artifactId> <version>${quarkus-langchain4j.version}</version> </dependency> Easy RAG!
  52. @edeandrea Agent and Tools A tool is a function that

    the model can call: - Tools are parts of CDI beans - Tools are defined and described using @Tool Prompt (Context) Extend the context with tool descriptions Invoke the model The model asks for a tool invocation (name + parameters) The tool is invoked (on the caller) and the result sent to the model The model computes the response using the tool result Response
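
    The round trip above can be sketched as a dispatch table: the model returns a tool name and arguments, the application (not the model) executes the matching function, and the result is sent back for the final answer. The ToolDispatcher name and string-based signature are illustrative, not LangChain4j API:

    ```java
    import java.util.Map;
    import java.util.function.Function;

    // Sketch of the tool-invocation round trip: look up the requested tool by
    // name, invoke it on the caller's side, and return the result, which the
    // application would then feed back to the model.
    class ToolDispatcher {
        private final Map<String, Function<String, String>> tools;

        ToolDispatcher(Map<String, Function<String, String>> tools) {
            this.tools = tools;
        }

        // Invoked when the model's response is a tool-invocation request
        String invoke(String toolName, String argument) {
            Function<String, String> tool = tools.get(toolName);
            if (tool == null) {
                return "Error: unknown tool " + toolName; // reported back to the model
            }
            return tool.apply(argument);
        }
    }
    ```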
  53. @edeandrea <~~ My prompt <~~ Tool invocation request <~~ Tool

    invocation response <~~ Model Response
  54. @edeandrea Tools - A tool is just a method -

    It can access databases, or invoke a remote service - It can also use another LLM Tools require memory Application
  55. @edeandrea Using tools with LangChain4j Assistant assistant = AiServices.builder(Assistant.class) .chatLanguageModel(

    model) .tools(new Calculator()) .chatMemory( MessageWindowChatMemory .withMaxMessages(10)) .build(); static class Calculator { @Tool("Calculates the length of a string") int stringLength(String s) { return s.length(); } @Tool("Calculates the square root of a number" ) double sqrt(int x) { System.out.println("Called sqrt() with x=" + x); return Math.sqrt(x); } } Objects to use as tools Declare a tool method (description optional)
  56. @edeandrea Using tools with Quarkus LangChain4j @RegisterAiService interface Assistant {

    @ToolBox(Calculator.class) String chat(String userMessage ); } @ApplicationScoped static class Calculator { @Tool("Calculates the length of a string" ) int stringLength(String s) { return s.length(); } } Class of the bean declaring tools Declare a tool method (description optional) Must be a bean (singleton and dependent supported) Tools can be listed in the `tools` attribute
  57. @edeandrea Giving access to database (Quarkus Panache) @ApplicationScoped public class

    BookingRepository implements PanacheRepository<Booking> { @Tool("Cancel a booking" ) @Transactional public void cancelBooking(long bookingId, String customerFirstName , String customerLastName ) { var booking = getBookingDetails( bookingId, customerFirstName, customerLastName); delete(booking); } @Tool("List booking for a customer" ) public List<Booking> listBookingsForCustomer (String customerName , String customerSurname ) { var found = Customer.find("firstName = ?1 and lastName = ?2", customerName, customerSurname).singleResultOptional(); return list("customer", found.get()); } }
  58. @edeandrea Web Search Tools (Tavily) @UserMessage(""" Search for information about

    the user query: {query}, and answer the question. """) @ToolBox(WebSearchTool.class) String chat(String query); Provided by quarkus-langchain4j-tavily Can also be used with RAG
  59. @edeandrea Risks • Things can go wrong quickly • Risk

    of prompt injection ◦ Access can be protected in Quarkus • Audit is very important to check the parameters • Distinction between read and write beans Application
  60. @edeandrea Guardrails - Functions used to validate the input and

    output of the model - Detect invalid input - Detect prompt injection - Detect hallucination - Chain of guardrails - Sequential - Stop at first failure Quarkus LangChain4j only (for now)
  61. @edeandrea Retry and Reprompt Output guardrails can have 4 different

    outcomes: - Success - the response is passed to the caller or next guardrail - Fatal - we stop and throw an exception - Retry - we call the model again with the same context (we never know ;-) - Reprompt - we call the model again with another message in the memory indicating how to fix the response
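
    A minimal sketch of the sequential chain described above — guardrails run in order and the chain stops at the first non-success outcome. The enum here only models the four outcomes; the real Quarkus LangChain4j result types (OutputGuardrailResult and friends) carry more detail:

    ```java
    import java.util.List;
    import java.util.function.Function;

    // Sketch of a sequential guardrail chain: validators run in declaration
    // order, and the first non-success outcome short-circuits the chain.
    class GuardrailChain {
        enum Outcome { SUCCESS, FATAL, RETRY, REPROMPT }

        static Outcome validate(String response,
                                List<Function<String, Outcome>> guardrails) {
            for (Function<String, Outcome> guardrail : guardrails) {
                Outcome outcome = guardrail.apply(response);
                if (outcome != Outcome.SUCCESS) {
                    return outcome; // stop at first failure
                }
            }
            return Outcome.SUCCESS;
        }
    }
    ```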
  62. @edeandrea Implement an input guardrail @ApplicationScoped public class UppercaseInputGuardrail implements

    InputGuardrail { @Override public InputGuardrailResult validate(UserMessage userMessage ) { var message = userMessage.singleText(); var isAllUppercase = message.chars().filter(Character::isLetter) .allMatch( Character::isUpperCase); return isAllUppercase ? success() : failure( "The input must be in uppercase." ); } } CDI beans Interface to implement Can also access the chat memory and the augmentation results OK Failure
  63. @edeandrea Implement an output guardrail @ApplicationScoped public class UppercaseOutputGuardrail implements

    OutputGuardrail { @Override public OutputGuardrailResult validate(OutputGuardrailParams params ) { System.out.println("response is: " + params.responseFromLLM().text() + " / " + params.responseFromLLM().text().toUpperCase()); var message = params.responseFromLLM().text(); var isAllUppercase = message.chars().filter(Character::isLetter).allMatch(Character::isUpperCase); return isAllUppercase ? success() : reprompt( "The output must be in uppercase." , "Please provide the output in uppercase." ); } } CDI beans Interface to implement Can also access the chat memory and the augmentation results OK Reprompt
  64. @edeandrea Declaring guardrails @RegisterAiService public interface Assistant { @InputGuardrails(UppercaseInputGuardrail .class)

    @OutputGuardrails(UppercaseOutputGuardrail .class) String chat(String userMessage ); } Both can receive multiple values
  65. @edeandrea Process or Generate images Image Model - Image Models

    are specialized for … Images - Can generate images from text - Can process images from input (like the OCR demo) - Chat Model: GPT-4o | Image Model: DALL·E - Important: Not every model serving provider provides image support (as it needs specialized models)
  66. @edeandrea Processing picture from AI Services @RegisterAiService @ApplicationScoped public interface

    ImageDescriber { @UserMessage(""" Describe the given image. """) String describe(@ImageUrl Image image); } Indicate to the model to use the image Can be String, URL, URI, or Image
  67. @edeandrea Using Image Model to generate pictures @Inject ImageModel model;

    @Override public void run(String... args) throws IOException { var prompt = "Generate a picture of a rabbit software developers coming to Devoxx" ; var response = model.generate(prompt); System.out.println(response.content().url()); } Image Model (can also be created with a builder) Response<Image> quarkus.langchain4j.openai.timeout =1m quarkus.langchain4j.openai.image-model.size =1024x1024 quarkus.langchain4j.openai.image-model.quality =standard quarkus.langchain4j.openai.image-model.style =vivid quarkus.langchain4j.openai.image-model.persist =true Print the persisted image
  68. @edeandrea Generating images from AI Services @RegisterAiService @ApplicationScoped public interface

    ImageGenerator { Image generate(String userMessage ); } Indicate to use the image model to generate the picture var prompt = "Generate a picture of a rabbit going to Devoxx. The rabbit should be wearing a Quarkus tee-shirt."; var response = generator.generate(prompt); var file = Paths.get("rabbit-at-devoxx.jpg"); Files.copy(response.url().toURL().openStream(), file, StandardCopyOption.REPLACE_EXISTING);
  69. @edeandrea The almost-all-in-one demo - React - Quarkus WebSockets.NEXT -

    Quarkus Quinoa - Ollama - Guardrails - RAG - Ingest data from filesystem - Tools - Update database - Send email - Observability - OpenTelemetry
  70. @edeandrea What did we see? How to Build AI-Infused applications

    in Java https://docs.quarkiverse.io/ quarkus-langchain4j/dev https://docs.langchain4j.dev Code Slides Langchain4J Quarkus Chat Models RAG PROMPT MESSAGES AI SERVICE MEMORY CONTEXT TOOLS FUNCTION CALLING GUARDRAILS IMAGE MODELS OBSERVABILITY audit TRACING agent https://github.com/cescoffier/langchain4j-deep-dive https://speakerdeck.com/edeandrea/25-central-iowa-java-users-group-java-meets-ai-build-llm-powered-apps-with-langchain4j