Spring AI vs LangChain4J (meetup in Paris on 05/12/2024)

LangChain4j vs Spring AI Julien Dubois / Michael Isvy

Who we are
• Julien Dubois
  ◦ DevRel manager at Microsoft, JHipster creator
  ◦ Second contributor to the LangChain4j project
• Michael Isvy
  ◦ VP of Engineering at cynapse.ai (Computer Vision) in Singapore
  ◦ 17 years of Spring, 2 years of AI

Agenda
• The AI Space
• AI in Java: myth or reality?
• Prompting with LangChain4J and Spring AI
• RAG and Vector Databases

The AI space M

Artificial Intelligence: which way?
• Conventional AI
  ◦ Based on custom-trained models
  ◦ Programming: requires AI engineers
• Generative AI
  ◦ LLM, ChatGPT, Claude, Gemini, Mistral, Ollama, Llama…
  ◦ Programming: mostly API calls

Conventional AI example: License Plate Recognition
• Find a base model online
  ◦ Typically on GitHub or Hugging Face
• Evaluate the model
  ◦ Identify gaps (example: doesn’t work with Singapore truck license plates)
• Prepare a fine-tuning dataset
• Spend 3-4 days training the model

Generative AI: usually an API call!
• Example: API call to ChatGPT (Linux/OSX)

    #!/bin/bash
    API_KEY="sk-proj-7devtvnBIsYXVHJuHBQAT3BlbkFJNBB4uz8Iog5F2y"
    curl https://api.openai.com/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $API_KEY" \
      -d '{ "model": "gpt-4o", "messages": [ {"role": "user", "content": "Tell me a joke." } ] }'

In GenAI, models are much more complex. But most of the time you don’t need to build them.

Quiz
• In the example below, there is something you should never do. What is it? (Linux/OSX)

    #!/bin/bash
    API_KEY="sk-proj-7devtvnBIsYXVHJuHBQAT3BlbkFJNBB4uz8Iog5F2y"
    curl https://api.openai.com/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $API_KEY" \
      -d '{ "model": "gpt-4o", "messages": [ {"role": "user", "content": "Tell me a joke." } ] }'
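The answer the quiz is most likely after is the API key hard-coded in the script, given the later advice to keep the key in an environment variable. Here is a minimal sketch of the same request in plain Java (JDK 17+, `java.net.http` only), reading the key from `OPENAI_API_KEY`. The class name `ChatRequestSketch` is hypothetical, and the request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ChatRequestSketch {

    // Builds (but does not send) the chat-completions request shown on the slide.
    static HttpRequest build(String apiKey) {
        if (apiKey == null || apiKey.isBlank()) {
            throw new IllegalStateException("OPENAI_API_KEY is not set");
        }
        String body = """
                {"model": "gpt-4o",
                 "messages": [{"role": "user", "content": "Tell me a joke."}]}
                """;
        return HttpRequest.newBuilder(URI.create("https://api.openai.com/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + apiKey)
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // The key comes from the environment, never from the source file.
        HttpRequest request = build(System.getenv("OPENAI_API_KEY"));
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Keeping the key out of the source means the script or class can be committed and shared without leaking credentials.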

AI in Java: Myth or reality?

Conventional AI with Java?
• Java has a tiny ecosystem in Conventional AI
• NVIDIA rules the game
  ◦ Native NVIDIA libraries: CUDA, TensorRT, DeepStream…
  ◦ Work natively with Python and/or C++ only

Generative AI with Java
• Java is getting a lot of momentum for GenAI
  ◦ GenAI calls: mostly about calling APIs and using (Vector) databases
  ◦ Java is good at calling APIs and databases!
• Similar shift to database access
  ◦ Java developers aren’t great at developing a database
  ◦ We’re good at accessing it (JDBC)
  ◦ We’re exceptional at working with advanced tools to use it effectively (Hibernate) J

TL;DR (Too Long; Didn’t Read)

                            LangChain4J                 Spring AI
Sponsor                     Community + Red Hat @IBM    Spring team @Broadcom
Well-designed API?          Good                        Great
Documentation               Good                        Great
Community Support           Great                       Great
Support for models          All models supported        No Google Gemini yet
RAG and Vector db support   Great (lots of options)     Good

The Generative AI landscape
• In the Cloud
  ◦ OpenAI: ChatGPT
  ◦ Mistral AI: Mistral
  ◦ Anthropic: Claude (spin-off from OpenAI)
  ◦ Google: Gemini
• On premise / on your laptop
  ◦ Ollama: Llama3.1, Mixtral, LLava, tinyllama, …
  ◦ Choose your local model J

Spring AI

The Spring AI ecosystem
• Current version: Spring AI 1.0.0-M4
  ◦ Not in final release version yet!
• Based on Spring and Spring Boot

Generating a Spring AI project • Use start.spring.io

• Best practice: store the API key in an env variable

application.properties

    spring.application.name=demo-spring-ai
    spring.ai.openai.api-key=${OPENAI_API_KEY}
    spring.ai.openai.chat.options.model=gpt-4o

IntelliJ Community Edition

Making a call
• Use ChatClient’s fluent API
  ◦ Generic (does not depend on the LLM implementation)

    @Service
    public class MusicService {
        private final ChatClient chatClient;

        public MusicService(ChatClient.Builder builder) {
            this.chatClient = builder.build();
        }

        public String findBestSongs() {
            return this.chatClient.prompt()
                    .user("which were the best songs in 1993?")
                    .call().content();
        }
    }

Demo

Calling from a JUnit test
• Run configuration should include the environment variable

    @SpringBootTest
    class MusicServiceTest {
        @Autowired
        private MusicService musicService;

        // Logger named after the test class that writes the log lines
        private static final Logger logger = LoggerFactory.getLogger(MusicServiceTest.class);

        @Test
        void shouldFindBestSongs() {
            String response = this.musicService.findBestSongs();
            logger.info(response);
        }
    }

Demo

Mapping a prompt to an entity
• Works with Java records

    @Service
    public class ChatService {
        private final ChatClient chatClient;

        public ChatService(ChatClient.Builder builder) {
            this.chatClient = builder.build();
        }

        public ActorFilms generateResponse() {
            return this.chatClient.prompt()
                    .user("Generate the 10 most popular movies starring Bruce Willis")
                    .call()
                    .entity(ActorFilms.class);
        }
    }

    public record ActorFilms(String actor, List<String> movies) {}

Demo

JSON schema under the hood

    public ActorFilms generateResponse() {
        return this.chatClient.prompt()
                .user("Generate the 10 most popular movies starring Bruce Willis")
                .call()
                .entity(ActorFilms.class);
    }

    public record ActorFilms(String actor, List<String> movies) {}

    Do not include any explanations, only provide an RFC8259 compliant JSON response …
    { \"$schema\" : \"https://json-schema.org/draft/2020-12/schema\",
      \"type\" : \"object\",
      \"properties\" : {
        \"actor\" : { \"type\" : \"string\" },
        \"movies\" : { \"type\" : \"array\", \"items\" : { \"type\" : \"string\" } } }
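To make the idea of deriving a JSON schema from a record concrete, here is a minimal sketch in plain Java (JDK 17+) that uses reflection on record components. It is not how Spring AI builds its schema internally. `RecordSchemaSketch` and its methods are hypothetical names, the output omits `$schema`, and only `String` and `List<T>` components are handled.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.RecordComponent;
import java.lang.reflect.Type;
import java.util.List;
import java.util.StringJoiner;

final class RecordSchemaSketch {

    record ActorFilms(String actor, List<String> movies) {}

    // Maps a Java type to a JSON Schema fragment; only String and List<T> are supported here.
    static String typeSchema(Type type) {
        if (type == String.class) {
            return "{ \"type\" : \"string\" }";
        }
        if (type instanceof ParameterizedType p && p.getRawType() == List.class) {
            return "{ \"type\" : \"array\", \"items\" : "
                    + typeSchema(p.getActualTypeArguments()[0]) + " }";
        }
        throw new IllegalArgumentException("Unsupported type in this sketch: " + type);
    }

    // Builds an object schema with one property per record component, in declaration order.
    static String recordSchema(Class<? extends Record> recordType) {
        StringJoiner properties = new StringJoiner(", ");
        for (RecordComponent component : recordType.getRecordComponents()) {
            properties.add("\"" + component.getName() + "\" : "
                    + typeSchema(component.getGenericType()));
        }
        return "{ \"type\" : \"object\", \"properties\" : { " + properties + " } }";
    }

    public static void main(String[] args) {
        System.out.println(recordSchema(ActorFilms.class));
    }
}
```

A schema like this, sent along with the prompt, tells the model which fields to return so the reply can be parsed back into the record.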

Using a prompt with Parameters

    public String recommendMovie(String topic) {
        return this.chatClient.prompt()
                .user(userSpec -> userSpec.text("Can you recommend a movie about {topic}")
                        .param("topic", topic))
                .call()
                .content();
    }

    var response = this.movieService.recommendMovie("computers");
    this.logger.info(response);

Certainly! One highly regarded film that delves into the world of computers is "The Imitation Game" (2014). This biographical drama stars Benedict Cumberbatch as Alan Turing, a pioneering computer scientist and mathematician.
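Conceptually, the `{topic}` placeholder is replaced by the parameter value before the prompt is sent. Here is a minimal plain-Java sketch of that substitution. `PromptTemplateSketch.render` is a hypothetical helper for illustration, not Spring AI's template engine.

```java
import java.util.Map;

final class PromptTemplateSketch {

    // Replaces each {name} placeholder with the matching value from params.
    static String render(String template, Map<String, String> params) {
        String result = template;
        for (Map.Entry<String, String> entry : params.entrySet()) {
            result = result.replace("{" + entry.getKey() + "}", entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(render("Can you recommend a movie about {topic}",
                Map.of("topic", "computers")));
    }
}
```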

Working with images
• Import image file as a Resource
• Optical Character Recognition

    @Service
    class ImageService {
        @Value("classpath:images/scientist.jpg")
        private Resource imageResourceScientist;

        //…

        public String describeScientist() {
            return this.chatClient.prompt()
                    .user(userSpec -> userSpec
                            .text("can you describe this person? And what is written on top of his head?")
                            // MIME type matches the .jpg resource above
                            .media(MimeTypeUtils.IMAGE_JPEG, this.imageResourceScientist))
                    .call().content();
        }
    }

Demo

Spring AI - Fluent API calls
• Simple call

    return this.chatClient
            .prompt()
            .user("which were the best songs in 1993?")
            .call().content();

Fluent API call
General trend in the Spring team: Client over Template
See: RestClient over RestTemplate, JdbcClient over JdbcTemplate etc.

Spring AI - unified API
• Simple call

    return this.chatClient
            .prompt()
            .user("which were the best songs in 1993?")
            .call()
            .content();

• Image

    return this.chatClient
            .prompt()
            .user(userSpec -> userSpec.text("what’s the weather like?")
                    .media(MimeTypeUtils.IMAGE_PNG, this.image))
            .call().content();

• Using a Vector DB and RAG

    return this.chatClient
            .prompt()
            .user("which were the best songs in 1993?")
            .call()
            .content();

  Vector DB: requires specific configuration

LangChain4j

Introduction
• Current version: 0.36.2
  ◦ Not in final release version yet!
• No need for a framework to use it
• Great integration with Quarkus, strong relationship with Red Hat
• Spring Boot starter integration

Full demo available
All examples (and more!) are available on GitHub at: https://github.com/jdubois/jdubois-langchain4j-demo

Making a call

Configuration using a Spring Bean
This is a manual configuration, not using the Spring Boot starter

Similar code for creating an image

Using local models with Ollama M

What is Ollama
• Desktop software (Linux, Windows, OSX)
• Lets you run LLMs locally
• Exposes an API so models can be queried
• Ollama is different from Llama
  ◦ Ollama is a platform that runs local models
  ◦ Llama is a series of open source models from Meta

Ollama is inspired by Docker
• Ollama was founded by two former Docker employees
• Aims to run models the way Docker runs containers
• Uses a Modelfile (similar to a Dockerfile)
• Also written in Go

Ollama installation
• Download from https://ollama.com/download
• Once installed, Ollama runs on port 11434
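Because Ollama exposes an HTTP API on that port, a local model can be queried without any framework. Here is a minimal sketch in plain Java (JDK 17+), assuming Ollama is running locally with the `tinyllama` model pulled. It targets Ollama's `/api/generate` endpoint with streaming disabled. `OllamaRequestSketch` is a hypothetical class name, and the JSON escaping is deliberately simplistic.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class OllamaRequestSketch {

    // Escapes backslashes and double quotes only; newlines and control characters are not handled.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    static String jsonBody(String model, String prompt) {
        return "{\"model\": \"" + escape(model) + "\", \"prompt\": \""
                + escape(prompt) + "\", \"stream\": false}";
    }

    // Builds the request against the default local Ollama port.
    static HttpRequest build(String model, String prompt) {
        return HttpRequest.newBuilder(URI.create("http://localhost:11434/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody(model, prompt)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Requires a running Ollama instance; prints the raw JSON reply.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(build("tinyllama", "tell me a joke"), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```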

Ollama basic commands
• List models

    michael@macbook-air % ollama list
    NAME              ID            SIZE    MODIFIED
    tinyllama:latest  2644915ede    0.6 GB  5 days ago
    llama3.1:latest   f66fc8dc39ea  4.7 GB  5 days ago
    mistral:latest    f974a74358d6  4.1 GB  2 weeks ago
    llama3.1:8b       91ab477bec9d  4.7 GB  3 weeks ago
    michaelisvy2@macbook-air documentation %

• Install a model

    michael@macbook-air % ollama pull tinyllama
    pulling manifest
    pulling 2af3b81862c6... 100% ▕█████████████████████████████▏ 637 MB
    verifying sha256 digest
    writing manifest
    success

Ollama - running models
• Language model

    michael@macbook-air % ollama run tinyllama
    >>> tell me a joke
    Sure, here's a joke for you:
    Q: What does an AI robot have in common with a smartphone?
    A: It's always looking at its screen.

• Vision model

    michael@macbook-air % ollama run llava
    >>> what is the weather like in Singapore? /Users/michael//spring-ai/lab-files/lab-02/singapore-weather.png
    Added image '/Users/michael/singapore-weather.png'
    The image displays a weather forecast for Singapore on a particular day. According to the forecast, it will be 34 degrees Celsius (93.2 degrees Fahrenheit) with humidity at 76%.

Retrieval-Augmented Generation M

Quiz
• Using Google Gemini’s prompt context window, how many Harry Potter books can I fit at most? (7 books, 600 pages on average)
  ◦ 10% of a book
  ◦ 2 books
  ◦ All of them
Context window
Cost for providing such a large context: between $0.20 and $1 depending on LLM provider

RAG / Vector Databases
• Retrieval Augmented Generation
  ◦ Queries to the LLM should have context
• Vector Database
  ◦ Subset of RAG
  ◦ Context is broken into multiple chunks and stored in a Vector database

Vector databases

Solution: Vector databases
• Split your data into chunks, and encode each chunk into numbers that the ML model can understand

    “My house is black”  → AI Model → {0.345, 0.465, 0.856, …, 0.1543}
    “My garden is big”   → AI Model → {0.545, 0.665, 0.056, …, 0.3543}
    “My dog is playful”  → AI Model → {0.645, 0.765, 0.156, …, 0.4543}

Each Vector is an array of 1,536 numbers
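The "split your data into chunks" step can be sketched in plain Java as a fixed-size word splitter with optional overlap. This is only one of many splitting strategies, and `ChunkerSketch` is a hypothetical helper, not a Spring AI or LangChain4j class. Encoding each chunk into a vector is left to an embedding model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkerSketch {

    // Splits text into chunks of at most maxWords words; consecutive chunks share `overlap` words.
    static List<String> split(String text, int maxWords, int overlap) {
        if (maxWords <= 0 || overlap < 0 || overlap >= maxWords) {
            throw new IllegalArgumentException("need maxWords > overlap >= 0");
        }
        List<String> chunks = new ArrayList<>();
        if (text.isBlank()) {
            return chunks;
        }
        String[] words = text.trim().split("\\s+");
        int step = maxWords - overlap;
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + maxWords, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the last words are already covered
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "My house is black. My garden is big. My dog is playful.";
        split(text, 4, 0).forEach(System.out::println);
    }
}
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some words twice.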

Definition of a Vector
• A Vector is just a type of data
  ◦ Typically an array of 1,536 decimal numbers
  ◦ Value between -1 and 1

Example with pgvector:

    CREATE TABLE paragraph (
        id SERIAL PRIMARY KEY,
        paragraph_text TEXT,
        vector VECTOR(1536)
    );

    {-0.345, 0.465, 0.856, …, 0.1543}

Each Vector is an array of 1,536 numbers
“Vector” and “Embedding” are similar concepts. For simplicity, we use the word “Vector” whenever possible in this course

How are Vectors created?
• Vectors typically represent the output generated by Machine Learning models

    “My house is black”  → AI Model → {0.345, 0.465, 0.856, 0.1543}
    “My garden is big”   → AI Model → {0.545, 0.665, 0.056, 0.3543}
    “My dog is playful”  → AI Model → {0.645, 0.765, 0.156, 0.4543}

The above example is simplified and uses random numbers. It is based on text AI models. Vectors may also be used with Computer Vision AI models or audio-based AI models.

How to search vectors: Similarity Search
  ◦ Selects the closest Vector(s)
  ◦ OpenAI recommends using cosine similarity

    {0.345, -0.465, 0.856, 0.1543}
    {-0.436, 0.578, 0.935, 0.2193}
    {-0.445, 0.565, 0.956, 0.2543}
    {0.545, 0.665, 0.056, 0.3543}
    {0.645, 0.765, 0.156, -0.4543}
    {0.745, 0.865, 0.256, 0.5543}
    {0.845, 0.965, -0.356, 0.6543}

Sample SQL query with pgvector:

    SELECT id, name, vector <=> '[0.436, 0.578, 0.935, 0.2193]' AS distance
    FROM items
    ORDER BY distance
    LIMIT 10;
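For readers who want to see what the similarity search computes, here is a minimal plain-Java sketch of cosine similarity and the matching cosine distance (1 minus similarity), which is what pgvector's `<=>` operator returns. `CosineSketch` is a hypothetical class name. A real vector database uses indexes rather than comparing the query against every row.

```java
final class CosineSketch {

    // cos(a, b) = (a · b) / (|a| * |b|); 1 means same direction, -1 means opposite.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Smaller distance means closer vectors, so results are ordered ascending.
    static double cosineDistance(double[] a, double[] b) {
        return 1.0 - cosineSimilarity(a, b);
    }

    public static void main(String[] args) {
        double[] query = {0.436, 0.578, 0.935, 0.2193};
        double[] candidate = {-0.445, 0.565, 0.956, 0.2543};
        System.out.println("distance = " + cosineDistance(query, candidate));
    }
}
```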

RAG Demo J

RAG pattern: ingestion
Testing Document splitters: https://langchain-text-splitter.streamlit.app

RAG pattern: retrieval

Putting it all together
• LangChain4j’s EasyRAG makes it easy to use the RAG pattern
  ◦ Sensible defaults
  ◦ Tooling
  ◦ Easy to extend
• Best practices
  ◦ Ingestion: clean up the text, test with different models and splitters
  ◦ Retrieval: limit the prompt size ($$$), improve the prompt, test different models
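The advice to limit the prompt size can be sketched in plain Java as a budget on how much retrieved context goes into the prompt. This is an illustrative helper, not EasyRAG's implementation. `PromptBudgetSketch` is a hypothetical name, chunks are assumed to arrive already sorted by relevance, and a character count stands in for a real token count.

```java
import java.util.List;

final class PromptBudgetSketch {

    // Adds chunks in relevance order until the next one would exceed the character budget.
    static String buildPrompt(String question, List<String> rankedChunks, int maxContextChars) {
        StringBuilder context = new StringBuilder();
        for (String chunk : rankedChunks) {
            if (context.length() + chunk.length() + 1 > maxContextChars) {
                break; // stop at the first chunk that does not fit
            }
            context.append(chunk).append('\n');
        }
        return "Answer using only the context below.\n\nContext:\n" + context
                + "\nQuestion: " + question;
    }

    public static void main(String[] args) {
        List<String> chunks = List.of("My house is black", "My garden is big", "My dog is playful");
        System.out.println(buildPrompt("What color is my house?", chunks, 40));
    }
}
```

Since providers bill per token, capping the context this way bounds the cost of each query.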

Conclusion

Cost of 3 months learning Spring AI / LangChain4J
• OpenAI
  ◦ $1.36
• Ollama
  ◦ $0.00
https://platform.openai.com/usage M

Our favorite Resources online
• https://www.youtube.com/@DanVega (Dan Vega)
• https://www.youtube.com/@springinaction (Craig Walls)
• RAG from dumb implementation to serious results (Guillaume Laforge)
• https://github.com/ThomasVitale/llm-apps-java-spring-ai/ (best Spring AI samples online!)
• https://course.fast.ai/ Practical deep learning
• Our demos:
  ◦ https://github.com/michaelisvy/demo-spring-ai
  ◦ https://github.com/michaelisvy/demo-langchain4j
  ◦ https://github.com/jdubois/jdubois-langchain4j-demo

Thank you!
