
Supercharge Your Applications with Java, Graphs, and a Touch of AI

We all want to use GenAI to give ourselves superpowers, right? It’s fun to imagine and build innovative applications, but what does it take to move beyond simple examples? In this session, the presenters will build a RAG application using Quarkus and Neo4j, highlighting the differences from a traditional, non-AI approach. We’ll explore how Quarkus leverages familiar Java development patterns while supporting the performance and efficiency needed for AI workloads. Along the way, we’ll discuss design considerations, common challenges, and developer-friendly features that simplify building smarter apps. From concepts to coding, join us to learn how to take your Java applications to the next level.
Code:
- https://github.com/ebullient/quarkus-soloplay
- https://github.com/ebullient/quarkus-ironsworn


Jennifer Reif

March 05, 2026

Transcript

  1. Agenda
     - Beyond the chatbot: when AI meets the real world
     - Use case: a solo RPG with an AI narrator
     - Challenges: structured, consistent outputs; narrative continuity (gameplay); game/chat state (memory); guardrails
     - Solution(s): two apps with different approaches
  2. The dream
     Role-playing games are typically designed for a group: they require a story leader as DM (research, rules, story development, etc.) plus time, co-location, and scheduling to play IRL. What if an LLM could drive the adventure?
  3. Why Neo4j?
     - Retrieve semantic text chunks along with their previous + next neighbors
     - Filter on entity (labels) and content type (metadata)
     - Store the adventure journey path and gameplay interactions
  4. Spelljammer vs. Ironsworn
     - Spelljammer* (soloplay): D&D meets Star Trek. GM-driven narrative: the player reacts. Rules-heavy: skill checks, combat mechanics, plus setting-specific rules, creatures, etc.
     - Ironsworn** (ironsworn): player-driven narrative with the player as co-GM. Rules-light: fiction first. The LLM doesn't override player assertions, and the model has pre-trained knowledge of the rules.
     * © Wizards of the Coast; ** © Tomkin Press
  5. Plain chat

     @RegisterAiService()
     public interface ChatAssistant {
         @SystemMessage("""
                 You are a helpful AI assistant.
                 Be conversational and friendly.
                 Provide clear, concise answers.
                 When uncertain, say so rather than guessing.
                 Format your response using GitHub-flavored Markdown.
                 """)
         String chat(@UserMessage String userMessage);
     }
  6. Plain chat: what the model knows
     - What is a Damselfly? (Spelljammer)
     - How do you play Ironsworn?
     - How do you make a promise? (Ironsworn)
  7. Ironsworn: prompt influence
     The LLM already knows how to play Ironsworn. Low effort, decent results. Use the system prompt to scope the answers.

     @SystemMessage("""
             You are an Ironsworn assistant. Help players understand the rules,
             mechanics, and setting of the Ironsworn tabletop RPG. Answer
             questions about moves, oracles, character creation, and the
             Ironlands setting.
             Format your response using GitHub-flavored Markdown.
             """)
     String rules(@UserMessage String userMessage);
  8. RAG Flavor 1 with Upfront Ingestion
     Spelljammer lore is not in the training data. Solution: ingest the corpus once, retrieve on demand.
  9. IngestService

     if (content.contains(TOOLS_DOC_SEPARATOR)) {
         String[] parts = content.split(TOOLS_DOC_SEPARATOR);
         int processedCount = 0;
         for (String part : parts) {
             String trimmed = part.trim();
             if (!trimmed.isBlank() && !trimmed.equals(TOOLS_DOC_SEPARATOR)) {
                 List<String> chunkIds = processStructuredMarkdown(...);
                 allChunkIds.addAll(chunkIds);
                 processedCount++;
             }
         }
     } else {
         List<String> chunkIds = processStructuredMarkdown(...);
         allChunkIds.addAll(chunkIds);
     }
     ...
     if (isAdventureFile && !allChunkIds.isEmpty()) {
         addLabelToNodes(allChunkIds, "Adventure");
         if (allChunkIds.size() > 1) {
             createChunkRelationships(allChunkIds);
         }
     }
  10. Markdown: semantic chunking
      * Code trimmed a lot for brevity. See repo for full.

      private List<String> processStructuredMarkdown(...) {
          Map<String, Object> yamlMetadata = parseYamlFrontmatter(content);
          List<TextSegment> segments = new ArrayList<>();
          if (prefix.length() + cleanContent.length() > chunkSize) {
              String[] sections = SECTION_HEADER_PATTERN.split(cleanContent);
              for (String section : sections) {
                  // If section is still too large, chunk it again
                  if (section.length() > chunkSize * 2) {
                      var doc = Document.from(enrichedSection);
                      var splitter = DocumentSplitters.recursive(...);
                      List<TextSegment> subSegments = splitter.split(doc);
                      segments.addAll(subSegments);
                  } else {
                      TextSegment segment = TextSegment.from(section, common);
                      segments.add(segment);
                  }
              }
          }
          List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
          List<String> chunkIds = embeddingStore.addAll(embeddings, segments);
          return chunkIds;
      }
  11. createChunkRelationships()

      String getOrderedChunks = """
          MATCH (d:Document) WHERE d.id IN $chunkIds
          RETURN d.id as id
          ORDER BY d.sequenceNumber
          """;
      Iterable<Map<String, Object>> results = session.query(
          getOrderedChunks, Map.of("chunkIds", chunkIds));
      List<String> orderedIds = new ArrayList<>();
      for (Map<String, Object> row : results) {
          orderedIds.add((String) row.get("id"));
      }
      for (int i = 0; i < orderedIds.size() - 1; i++) {
          String createNext = """
              MATCH (d1:Document) WHERE d1.id = $fromId
              MATCH (d2:Document) WHERE d2.id = $toId
              MERGE (d1)-[:NEXT]->(d2)
              """;
          session.query(createNext, Map.of(
              "fromId", orderedIds.get(i),
              "toId", orderedIds.get(i + 1)));
      }
  12. LoreRetriever
      - CypherRetriever (filter) + ContentInjector (metadata)
      - Filtered Cypher with a 5-result candidate pool
      - Neighbor traversal (prev + next)
      - Try to filter on content; otherwise skip the filter
  13. Vector search
      * Code trimmed for brevity. See repo for full.

      private List<Content> executeVectorSearch(<params>) {
          try {
              if (contentType != null && !contentType.isBlank()) {
                  cypher = """
                      CALL db.index.vector.queryNodes(
                          $indexName, $maxResults * 5, $embedding)
                      YIELD node, score
                      WHERE score >= $minScore
                        AND node.contentType = $contentType
                      OPTIONAL MATCH (prev)-[:NEXT]->(node)
                      OPTIONAL MATCH (node)-[:NEXT]->(next)
                      RETURN node.text AS text, node.name AS name,
                             node.filename AS filename,
                             node.contentType AS contentType,
                             node.sourceFile AS sourceFile, score,
                             prev.text AS prevText, next.text AS nextText
                      ORDER BY score DESC
                      LIMIT $maxResults
                      """;
                  params = Map.of(<params>);
              } else {
                  // skip contentType filter
              }
  14. Soloplay: prompt augmentation + RAG
      LoreAssistant wiring it all together with one annotation*.
      * We have some extra goodies here; this is a non-trivial application.

      @RequestScoped
      @RegisterAiService(retrievalAugmentor = LoreRetriever.class,
          chatMemoryProviderSupplier = InMemoryChatMemoryProviderSupplier.class)
      public interface LoreAssistant {
          @SystemMessage(fromResource = "prompts/lore-assistant.txt")
          @ToolBox(LoreTools.class)
          @OutputGuardrails(JsonChatResponseGuardrail.class)
          JsonChatResponse lore(@UserMessage String question);
      }
  15. Prompt: tool calling, too

      You are a lorekeeper for tabletop roleplaying games with access to
      setting and rules documents.

      === TOOL USE (IMPORTANT) ===
      You MUST use getLoreDocument when:
      - Retrieved context or prior responses mention document paths
      - User asks about something referenced in a previous answer
      - Context contains links like [Name](path/to/file.md)
      Call the tool BEFORE saying you don't have information.

      === BOUNDARIES ===
      - This is out-of-character reference discussion, not gameplay
      - Present options rather than making GM decisions
  16. Soloplay: lore chat for what RAG knows
      Specialized improvements.
      - How do you play Spelljammer?
      - What is a damselfly?
  17. RAG -> Gameplay
      Not a 1:1 mapping. RAG solved knowledge retrieval, but gameplay needs more. Challenges: continuity, context, creativity.
  18. What D&D taught us
      Too much to ask of a local LLM (reasoning limitations):
      - Track NPCs, locations, and story position simultaneously
      - Decide when to call which tool
      - Manage state across a long session
      - Drive the narrative and run the game
      Some local models are better suited than others!
  19. The real problem: context
      The prompt was fine. The context was a mess. Current scene, party state, adventure module content, and session history all compete for a limited context window. Wrong context → wrong output. Every time.
  20. What helped
      Separation of concerns: the LLM proposes, the engine commits. The engine owns the flow; the LLM handles one scoped task at a time.
  21. ActorCreationEngine
      The engine decides what happens next. The LLM doesn't.

      public GameResponse processRequest(GameState game, String playerInput,
              GameEventEmitter emitter) {
          var trimmed = playerInput.trim();

          // all handled by the engine
          if (isHelpCommand(trimmed)) return help(game);
          if ("/cancel".equalsIgnoreCase(trimmed)) return cancelCreation(game);
          if ("/status".equalsIgnoreCase(trimmed)) return showStatus(game);
          if ("/done".equalsIgnoreCase(trimmed)) return finishCharacter(game);

          // engine-managed state
          CharacterCreationStage stage = game.getCharacterCreationStage();
          if (trimmed.isBlank() || trimmed.equals("/start")
                  || trimmed.equals("/newcharacter"))
              return promptForCurrentStage(game, emitter);

          // engine -> LLM
          return processStageInput(game, stage, trimmed, emitter);
      }
  22. ActorCreationAssistant

      @RegisterAiService
      @SessionScoped
      @SystemMessage(...)
      public interface ActorCreationAssistant {

          @UserMessage(...)
          @OutputGuardrails(ActorCreationResponseGuardrail.class)
          ActorCreationResponse promptForStage(
              @MemoryId String chatMemoryId, String gameId,
              String adventureName, String stage, PlayerActor currentActor);

          @UserMessage(...)
          @OutputGuardrails(ActorCreationResponseGuardrail.class)
          ActorCreationResponse processStageInput(
              @MemoryId String chatMemoryId, String gameId,
              String adventureName, String stage, PlayerActor currentActor,
              String playerInput);
      }
  23. User prompt with context

      @UserMessage("""
          Current stage: {stage}
          {#if currentActor}
          Character so far:
          {#if currentActor.name}- Name: {currentActor.name}
          {/if}{#if currentActor.actorClass}- Class: {currentActor.actorClass}
          {/if}{#if currentActor.level}- Level: {currentActor.level}
          {/if}{#if currentActor.summary}- Summary: {currentActor.summary}
          {/if}{#if currentActor.description}- Description: {currentActor.description}
          {/if}{#if currentActor.tags}- Tags: {#each currentActor.tags}{it}{#if it_hasNext}, {/if}{/each}
          {/if}{/if}
          Ask the player for the {stage} information.
          Be friendly and give examples if helpful.
          """)
      @OutputGuardrails(ActorCreationResponseGuardrail.class)
      ActorCreationResponse promptForStage(
          @MemoryId String chatMemoryId, String gameId,
          String adventureName, String stage, PlayerActor currentActor);
  24. Structured output
      The LLM returns a patch. The engine applies it. A guardrail verifies the returned type.

      public record ActorCreationResponse(
          String messageMarkdown,
          CharacterPatch patch) {

          public record CharacterPatch(
              String name, String actorClass, Integer level,
              String summary, String description,
              List<String> tags, List<String> aliases) {
          }
      }
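The "engine applies the patch" step can be sketched as below. This is a hypothetical simplification, not the repo's code: the `PlayerActor` fields and the `applyPatch` helper are assumptions, and only a few fields are shown. The key idea is that null fields in the patch leave existing state untouched, so the LLM never has to restate the whole character.

```java
// Hypothetical sketch: merge an LLM-proposed patch into engine-owned state.
// Only non-null patch fields overwrite the current values.
public class PatchDemo {
    public record PlayerActor(String name, String actorClass, Integer level) {}
    public record CharacterPatch(String name, String actorClass, Integer level) {}

    public static PlayerActor applyPatch(PlayerActor actor, CharacterPatch patch) {
        if (patch == null) {
            return actor; // the LLM proposed no changes this turn
        }
        return new PlayerActor(
            patch.name() != null ? patch.name() : actor.name(),
            patch.actorClass() != null ? patch.actorClass() : actor.actorClass(),
            patch.level() != null ? patch.level() : actor.level());
    }
}
```

Because the engine performs the merge, a malformed or partial patch can never corrupt state: the worst case is that nothing changes.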
  25. Guardrails
      Reduce bad output to the engine.

      @Override
      public OutputGuardrailResult validate(AiMessage responseFromLLM) {
          try {
              ActorCreationResponse response = objectMapper.readValue(
                  responseFromLLM.text(), ActorCreationResponse.class);

              // Validate required field is present
              if (response.messageMarkdown() == null ...) {
                  return reprompt("Missing messageMarkdown",
                      "Your response must include a 'messageMarkdown' field "
                      + "with your message to the player. Return JSON like: "
                      + "{\"messageMarkdown\": \"your message here\", "
                      + "\"patch\": {...}}");
              }
              return OutputGuardrailResult.successWith(
                  responseFromLLM.text(), response);
          } catch (JsonProcessingException e) {
              return reprompt(REPROMPT_MESSAGE, e, REPROMPT_PROMPT);
          }
      }
  26. The pattern scales
      ActorCreationEngine → character creation. GamePlayEngine → active gameplay. Same pattern, more context. But: more complexity.
  27. Ironsworn built differently
      Player-driven narrative. The story is stored in a markdown file (the journal). Balance continuity against the context window limit.
  28. RAG Flavor 2: Incremental Indexing
      The model knows Ironsworn rules and lore, so there is no corpus to ingest upfront. The "documents" are written as you play.
  29. StoryMemoryIndexer
      Creates embeddings for journal contents. Debounced, and runs on a virtual thread, so gameplay is not blocked by indexing.

      public void requestIndex(String campaignId) {
          scheduleIndex(campaignId, debounceMillis);
      }
  30. Only embed what changed
      * Code trimmed for brevity. See repo for full.

      List<String> newHashes = new ArrayList<>(exchanges.size());
      for (JournalExchange ex : exchanges) {
          newHashes.add(sha256(ex.content()));
      }
      ...
      int min = Math.min(oldHashes.size(), newHashes.size());
      while (firstDiff < min && Objects.equals(...)) {
          firstDiff++;
      }
      ...
      for (int i = firstDiff; i < exchanges.size(); i++) {
          JournalExchange exchange = exchanges.get(i);
          String narrative = JournalParser.stripNonNarrative(...);
          if (narrative.isBlank()) {
              continue;
          }
          ids.add(embeddingId(campaignId, i));
          segments.add(TextSegment.from(narrative, metadata(campaignId, i)));
      }
      ...
      List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
      ...
      embeddingStore.addAll(ids.subList(0, n), embeddings.subList(0, n),
          segments.subList(0, n));
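The change-detection core of the trimmed code above can be shown self-contained. The class and helper names here are assumptions for illustration: hash each journal exchange with SHA-256, then walk the old and new hash lists to find the first index that differs. Everything before that index keeps its existing embeddings.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;
import java.util.List;
import java.util.Objects;

// Sketch of the "only embed what changed" check (helper names assumed).
public class ChangeDetection {
    // Hex-encoded SHA-256 of a journal exchange's content.
    public static String sha256(String content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(
                digest.digest(content.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Index of the first exchange whose hash changed; re-embed from here on.
    public static int firstDiff(List<String> oldHashes, List<String> newHashes) {
        int firstDiff = 0;
        int min = Math.min(oldHashes.size(), newHashes.size());
        while (firstDiff < min
                && Objects.equals(oldHashes.get(firstDiff), newHashes.get(firstDiff))) {
            firstDiff++;
        }
        return firstDiff;
    }
}
```

Note the asymmetry this buys: edits near the end of a long journal (the common case during play) re-embed only a handful of exchanges, while an edit to turn 1 correctly invalidates everything after it.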
  31. StoryMemoryService
      Not "find turn 7." Find what matters now.

      public String relevantMemory(String campaignId, String query) {
          Embedding queryEmbedding = embeddingModel.embed(query).content();
          Filter filter = metadataKey("campaignId").isEqualTo(campaignId);
          EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
              .queryEmbedding(queryEmbedding)
              .maxResults(maxResults)
              .minScore(minScore)
              .filter(filter)
              .build();
          return format(embeddingStore.search(request));
      }
  32. Context engineering
      Ironsworn doesn't dump everything into the prompt. The engine assembles context deliberately:
      - Recent journal → local continuity
      - StoryMemoryService → semantically relevant exchanges
      - Move + outcome → what just happened
      Right context. Right amount. Right order.
  33. Two flavors. One database.

      | Ingestion        | soloplay       | ironsworn         |
      |------------------|----------------|-------------------|
      | When             | upfront        | every turn        |
      | Source           | uploaded files | journal exchanges |
      | Scope            | global lore    | per campaign      |
      | Change detection | re-upload      | SHA-256 hash      |

      Neo4j handles both.
  34. ...But that's not all!
      New dream: migrate soloplay to an agentic approach. Offload the LLM even further by moving work onto agents. Determinism++.
  35. Soloplay: Application Architecture
      - CLIENT: Browser (Renarde Qute templates + vanilla JS): chat-interface.js, play-interface.js (WebSocket), game-index.js, inspect.js
      - REST API / TRANSPORT: ChatResource /api/chat; PlayWebSocket /ws/play/{gameId}; LoreResource /api/lore; GameResource /api/game
      - ENGINE / ORCHESTRATION: GameEngine (phase routing: CREATE → INIT → PLAY); GamePlayEngine (turn processing + world-state updates); ActorCreationEngine (multi-stage character creation flow)
      - AgentOrchestrator (Level 4: agentic gameplay) coordinates 5 specialized @RegisterAiService agents per turn:
        1. DiceAgent: sync, runs first; decides if a roll is needed; stateless
        2. NarrationAgent: sync, informed by dice; scene narration + storytelling; stateful (memory + tools); tool access: LoreTools (RAG search) + GameTools (actors, locations, recap)
        3. Concurrent (CompletableFuture): SuggestionAgent (2-3 action choices; stateless), CheckpointAgent (milestone detection; stateless), RecapAgent (1-2 sentence summary; stateless)
        GamePlayResponse (assembled): narration + dice + suggestions + checkpoint + recap
      - AI SERVICES (LangChain4j @RegisterAiService): ChatAssistant (plain LLM chat); LoreAssistant (RAG + LoreRetriever); ActorCreationAssistant (session memory + JSON output); 5 gameplay agents (Narration, Dice, Suggestion, Checkpoint, Recap)
      - INFRASTRUCTURE: Ollama (local LLM: llama3.2 / qwen3 14b + nomic-embed-text); Neo4j (graph DB: game state via OGM + vector embeddings, 768d); IngestService (parse → chunk → embed → store)
  36. General LLM lessons learned
      - An LLM-friendly domain is a head start: pick a problem scope the model is good at, for simpler integration and immediate results
      - Know what the model knows: system prompt = knowledge it has; RAG = knowledge it doesn't
  37. App architecture lessons learned
      - Prompt engineering is alchemy (art and science)
      - Treat prompts like contracts: validate outputs, retry on failure
      - Tool calling is hit or miss (model capability)
      - Keep the LLM out of your state machine: manipulate text or images with it; (reliably) commit state without it
  38. Data lessons learned
      - Manage the data in your RAG: data quality (ingest) impacts results
      - Chunking quality matters (overlap, size, etc.): chain appropriate chunks, separate where sensible
      - Context engineering matters: you can "perfect" the prompt, but wrong context still = wrong output
      - Decide: what goes in, how much, and in what order
  39. What's next?
      The "engine drives the flow" pattern isn't custom. Use the LLM for bounded tasks (even with tool calling). You don't have to build this interaction from scratch: frameworks like Embabel work.
  40. Resources
      Code:
      - https://github.com/ebullient/quarkus-soloplay
      - https://github.com/ebullient/quarkus-ironsworn
      Adventures:
      - Ironsworn: free at ironswornrpg.com
      - Spelljammer: D&D Beyond
      Research: Michael Hannecke, "Beyond JSON: Picking the Right Format for LLM Pipelines"