Shaaf Syed
January 28, 2025

Java + LLMs: A hands-on guide to building LLM Apps in Java with Jakarta


Transcript

  1. Java + LLMs: A hands-on guide to building LLM Apps in Java with Jakarta

    Syed M Shaaf, Developer Advocate, Red Hat
    Bazlur Rahman, Java Champion 🏆, Staff Software Developer at DNAstack
  2. Systems, Data, Networks and a Solution?

    • Systems do not speak natural language, can't translate, and lack context outside of system boundaries (e.g. sentiment).
    • Generating content is costly and sometimes hard.
    • Rapid data growth.
    • Rising expectations: customers demand instant, personalized solutions.
    • Inefficiency: manual processes increase costs and slow operations.
    • Skill gaps: limited expertise in AI adoption.
  3. Furthermore

    • Rigid systems: rule-based approaches can't adapt to complex needs.
    • Real-time challenges: businesses struggle with instant, context-aware decisions.
    • Lost opportunities: AI-driven companies grow 2x faster.
  4. Understanding the journey that brought us here...

    Expert systems → Machine learning → Deep learning → Foundation models

    • Expert systems: no use of data, manually authored rules, brittle, labour intensive.
    • Machine learning and deep learning: data prep, feature engineering; supervised learning, unsupervised learning, classification.
    • Foundation models: learning without labels; adapt, tune; massive data appetite.
  5. Large Language Models

    Foundation models: learning without labels; adapt, tune; massive data appetite.

    • Tasks
      ◦ Translation, Summarization, Writing, Q&A
    • “Attention Is All You Need”, Transformer architecture
    • Recognize, predict, and generate text
    • Trained on billions of words
    • Can also be tuned further

    An LLM predicts the next token based on its training data and statistical deduction.
  6. Tokens

    Tokenization: breaking down text into tokens, e.g. with Byte Pair Encoding (BPE) or WordPiece. These methods handle diverse languages and manage vocabulary size efficiently.

    Example token IDs (https://platform.openai.com/tokenizer):
    [12488, 6391, 4014, 316, 1001, 6602, 11, 889, 1236, 4128, 25, 3862, 181386, 364, 61064, 9862, 1299, 166700, 1340, 413, 12648, 1511, 1991, 20290, 15683, 290, 27899, 11643, 25, 93643, 248, 52622, 122, 279, 168191, 328, 9862, 22378, 2491, 2613, 316, 2454, 1273, 1340, 413, 73263, 4717, 25, 220, 7633, 19354, 29338, 15]

    Word-based tokenization: "Running", "unpredictability".
    Subword-based tokenization (used by many LLMs): "run" + "ning"; "un" + "predict" + "ability".

    Reference: Sebastian Raschka, “Build a Large Language Model (From Scratch)”
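
The subword split shown on the Tokens slide can be imitated with a few lines of plain Java. This is only a toy sketch: it uses a greedy longest-match lookup (closer in spirit to WordPiece than to real BPE merge rules), and the class name and five-entry vocabulary are made up here purely to reproduce the slide's two examples. A production tokenizer learns its vocabulary from data and maps each piece to an integer ID.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class ToySubwordTokenizer {
    // Hypothetical vocabulary, chosen only to reproduce the slide's examples.
    static final Set<String> VOCAB = Set.of("run", "ning", "un", "predict", "ability");

    // Greedy longest-match: at each position take the longest vocabulary entry,
    // falling back to a single character when nothing matches.
    static List<String> tokenize(String word) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        while (start < word.length()) {
            int end = word.length();
            while (end > start + 1 && !VOCAB.contains(word.substring(start, end))) {
                end--;
            }
            tokens.add(word.substring(start, end));
            start = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("running"));          // [run, ning]
        System.out.println(tokenize("unpredictability")); // [un, predict, ability]
    }
}
```

Words outside the toy vocabulary degrade to single characters, which hints at why real vocabularies contain tens of thousands of pieces.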
  7. LLMs offer a scalable, adaptive way forward. Let's see how Java and Jakarta EE simplify this journey.
  8. The Importance of Jakarta EE

    • Jakarta EE is an important part of the Java ecosystem
    • 25-35% of Java applications run on Jakarta EE runtimes
      ◦ WildFly, Payara, GlassFish, JBoss EAP, WebSphere/Liberty, WebLogic
    • 70-80% of Java applications depend on at least one Jakarta EE API
      ◦ Tomcat, Hibernate, ActiveMQ, Jetty, Jersey, RESTEasy, Quarkus, MicroProfile, Spring Boot

    2024 Jakarta EE Developer Survey: https://outreach.eclipse.foundation/jakarta-ee-developer-survey-2024
  9. A simple chat bot

    - Basic htmx
    - Chat window
    - Backend sends the question to the LLM.
    - Streaming is also an option.
  10. Prompts

    System prompt
    - Define the task
    - Set the expectations
    - Provide examples

    User prompt
    - Specific to the input

    When to use system vs. user? What is a good prompt?
    - E.g. structure your input and output (different LLMs behave differently)
    - Try not to migrate prompts across models
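
The system/user split on the Prompts slide might look like the following in plain Java. This is a sketch under assumptions: the `Message` record, the role strings, and the prompt wording are invented for illustration and do not come from the demo repository or any particular library. The system prompt carries the task, the expectations, and one example; the user prompt carries only the specific input.

```java
import java.util.List;

class PromptSketch {
    // Hypothetical message type; chat APIs usually pair a role with text content.
    record Message(String role, String content) {}

    // Task, expectations, and one example: fixed for every request.
    static final String SYSTEM_PROMPT = """
            You answer questions about Jakarta EE.
            Answer in at most two sentences and say "I don't know" when unsure.
            Example: Q: "What is CDI?" A: "CDI is the Jakarta EE dependency injection API."
            """;

    // Only the user message changes from request to request.
    static List<Message> buildMessages(String userInput) {
        return List.of(
                new Message("system", SYSTEM_PROMPT.strip()),
                new Message("user", userInput.strip()));
    }

    public static void main(String[] args) {
        buildMessages("  What is JAX-RS?  ").forEach(System.out::println);
    }
}
```

Keeping the system prompt in one constant also makes it easier to revisit when switching models, since the slide warns that prompts do not migrate well.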
  11. Reasoning

    - Chain of Thought
    - TOT reasoning
    - Tree of Thought (Thinking, Organizing, Translating)
  12. Few-Shot, Zero-Shot

    Zero-Shot
    - No data collection needed
    - Fast implementation
    - Lower accuracy on complex tasks

    Few-Shot
    - Better accuracy with minimal examples
    - Adaptable to niche tasks
    - Sensitive to example quality/order
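
The difference between the two styles comes down to whether worked examples are placed in the prompt. A minimal sketch, assuming a simple "Input:/Label:" layout that is invented here for illustration: passing an empty example list yields a zero-shot prompt, and passing a few examples yields a few-shot prompt.

```java
import java.util.List;

class FewShotPrompt {
    // One worked example: an input and the answer we expect for it.
    record Example(String input, String label) {}

    // Builds the prompt text: task, then each example, then the new input to label.
    static String build(String task, List<Example> examples, String input) {
        StringBuilder sb = new StringBuilder(task).append("\n\n");
        for (Example e : examples) {
            sb.append("Input: ").append(e.input()).append('\n')
              .append("Label: ").append(e.label()).append("\n\n");
        }
        return sb.append("Input: ").append(input).append('\n').append("Label:").toString();
    }

    public static void main(String[] args) {
        String task = "Classify the sentiment as positive or negative.";
        // Zero-shot: no examples at all.
        System.out.println(build(task, List.of(), "The deploy went smoothly."));
        System.out.println("---");
        // Few-shot: the same task with two examples; their order can affect results.
        System.out.println(build(task,
                List.of(new Example("Great talk!", "positive"),
                        new Example("The build is broken again.", "negative")),
                "The deploy went smoothly."));
    }
}
```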
  13. What's an AI Service?

    - AI Services, tailored for Java
    - Similar to Spring Data JPA or Retrofit
    - Handle the most common operations
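
The comparison with Spring Data JPA and Retrofit suggests the pattern: you declare an interface, and a library supplies the implementation. The sketch below imitates that idea with a JDK dynamic proxy. It is not how LangChain4j implements AI Services; the `@Instruction` annotation, the `create` method, and the use of a `UnaryOperator<String>` as a stand-in for a chat model are all assumptions made for this illustration.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Proxy;
import java.util.function.UnaryOperator;

class AiServiceSketch {
    // Hypothetical annotation, defined here only for this sketch.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Instruction {
        String value();
    }

    // The application only declares what it wants to ask.
    interface Assistant {
        @Instruction("Answer briefly.")
        String chat(String userMessage);
    }

    // Generates an implementation that prepends the instruction and calls the model.
    static <T> T create(Class<T> api, UnaryOperator<String> model) {
        Object generated = Proxy.newProxyInstance(
                api.getClassLoader(),
                new Class<?>[] {api},
                (self, method, args) -> {
                    if (method.getDeclaringClass() == Object.class) {
                        return method.invoke(model, args); // toString, equals, hashCode
                    }
                    Instruction instruction = method.getAnnotation(Instruction.class);
                    String prefix = instruction == null ? "" : instruction.value() + "\n";
                    return model.apply(prefix + args[0]);
                });
        return api.cast(generated);
    }

    public static void main(String[] args) {
        // A fake model that echoes its prompt, so the sketch runs without any LLM.
        Assistant assistant = create(Assistant.class, prompt -> "[model saw] " + prompt);
        System.out.println(assistant.chat("What is Jakarta EE?"));
    }
}
```

The calling code depends only on `Assistant`, so the model behind it can be swapped without touching application logic.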
  14. Function calling / Tools

    @Tool double squareRoot(double x) { return Math.sqrt(x); }

    - Call other services or functions to enhance the response.
    - E.g. Web APIs, internal system requests
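
Behind an annotation like `@Tool`, something has to map the model's request ("call squareRoot with 16") onto a Java method. A rough sketch of that dispatch step using reflection follows. The `@Tool` annotation here is a local stand-in defined for this example, not the library's own, and the `call` helper with its single `double` argument is a simplifying assumption; real frameworks also describe each tool to the model and parse structured arguments.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

class ToolDispatchSketch {
    // Local stand-in for a tool annotation, defined only for this sketch.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Tool {}

    static class Calculator {
        @Tool
        public double squareRoot(double x) {
            return Math.sqrt(x);
        }
    }

    // Finds the @Tool method with the requested name and invokes it with one double.
    static Object call(Object tools, String toolName, double argument) {
        for (Method m : tools.getClass().getDeclaredMethods()) {
            if (m.isAnnotationPresent(Tool.class) && m.getName().equals(toolName)) {
                try {
                    m.setAccessible(true);
                    return m.invoke(tools, argument);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Tool failed: " + toolName, e);
                }
            }
        }
        throw new IllegalArgumentException("Unknown tool: " + toolName);
    }

    public static void main(String[] args) {
        // Pretend the model asked for: squareRoot(16)
        System.out.println(call(new Calculator(), "squareRoot", 16.0)); // 4.0
    }
}
```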
  15. Retrieval Augmented Generation

    Enhanced Information Retrieval: RAG combines the strengths of retrieval-based and generative models, enabling it to pull in relevant, up-to-date information from external sources and databases. This ensures that responses are not only contextually accurate but also rich in current and specific details.

    Improved Answer Accuracy: By integrating a retrieval component, RAG can provide more accurate answers, especially for factual questions. It retrieves relevant documents or snippets from a large corpus and uses these to guide the generative model, which helps in generating responses that are factually correct and informative.

    Versatile Applications: RAG is highly versatile and can be applied across various domains such as customer support, knowledge management, and research assistance. Its ability to combine retrieved data with generative responses makes it suitable for complex tasks that require both extensive knowledge and contextual understanding.
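
The retrieve-then-generate flow described on this slide can be sketched without any model at all. In the toy example below, relevance is scored by counting shared words, which is only a stand-in for the embedding similarity a real RAG pipeline would typically use; the class name, the prompt layout, and the two-sentence corpus are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class RagSketch {
    // Lower-cased set of words in a text, ignoring punctuation.
    static Set<String> words(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                .filter(w -> !w.isEmpty())
                .collect(Collectors.toSet());
    }

    // Number of words shared by question and snippet; a stand-in for embedding similarity.
    static long overlap(String question, String snippet) {
        Set<String> questionWords = words(question);
        return words(snippet).stream().filter(questionWords::contains).count();
    }

    // Retrieval step: pick the snippet that best matches the question.
    static String retrieve(String question, List<String> corpus) {
        return corpus.stream()
                .max(Comparator.comparingLong((String s) -> overlap(question, s)))
                .orElse("");
    }

    // Augmentation step: put the retrieved snippet in front of the question.
    static String augment(String question, List<String> corpus) {
        return "Context: " + retrieve(question, corpus) + "\nQuestion: " + question;
    }

    public static void main(String[] args) {
        List<String> corpus = List.of(
                "Tomcat is a servlet container.",
                "Jakarta EE is a set of specifications.");
        System.out.println(augment("What is Jakarta EE?", corpus));
    }
}
```

The string returned by `augment` is what would be sent to the generative model, so the answer is guided by the retrieved text rather than by training data alone.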
  16. Retrieval Augmented Generation

    - What is the representation of the data?
    - How do I want to split? Per document, chapter, or sentence?
    - How many tokens do I want to end up with?
    - How much overlap is there between segments?
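
The last two questions, segment size and overlap, translate directly into two parameters of a splitter. A minimal sketch, assuming whitespace-separated words as a rough stand-in for tokens (a real pipeline would count tokens with the model's own tokenizer); the class and parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OverlapSplitter {
    // Splits text into segments of at most maxTokens words,
    // where consecutive segments share `overlap` words.
    static List<String> split(String text, int maxTokens, int overlap) {
        if (maxTokens <= 0 || overlap < 0 || overlap >= maxTokens) {
            throw new IllegalArgumentException("Require 0 <= overlap < maxTokens");
        }
        if (text.isBlank()) {
            return List.of();
        }
        String[] words = text.trim().split("\\s+");
        List<String> segments = new ArrayList<>();
        int step = maxTokens - overlap; // how far the window advances each time
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + maxTokens, words.length);
            segments.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the last window already reached the end of the text
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        System.out.println(split("a b c d e", 3, 1)); // [a b c, c d e]
    }
}
```

A larger overlap keeps more context across segment boundaries at the cost of storing and embedding more duplicated text.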
  17. How to build software with LLMs

    [Diagram: Balancing Act. Labels: BYO, Prompting, Tuning, Real-time Augmented Gen., Complexity and work]

    “We chose RAG for Kai because it provides targeted, context-specific solutions without the complexity of model fine-tuning required by other approaches.” – John Matthews, Sr. Principal Software Engineer, Red Hat
  18. Thank you!

    Syed M Shaaf, Developer Advocate, Red Hat
    fosstodon.org/@shaaf, sshaaf, https://www.linkedin.com/in/shaaf/, shaaf.dev, https://bsky.app/profile/shaaf.dev

    Bazlur Rahman, Java Champion 🏆. Empowering Developers through Speaking 🗣, Writing ✍, Mentoring 🤝 & Community Building 🌍. Published Author 📖. Contributing Editor at InfoQ and Foojay.IO
    https://x.com/bazlur_rahman, rokon12, https://www.linkedin.com/in/bazlur/, https://bazlur.ca/, https://bsky.app/profile/bazlur.ca

    Source for the demo: https://github.com/rokon12/llm-jakarta
    LangChain4J: https://docs.langchain4j.dev/