Slide 1

Slide 1 text

Concerto for Java and AI
Building Production-Ready LLM Applications
Thomas Vitale (@vitalethomas)
Spring I/O, May 31st, 2024

Slide 2

Slide 2 text

Thomas Vitale
Systematic
• Software Engineer
• CNCF Ambassador, Oracle ACE Pro, Testcontainers Community Champion
• Author of “Cloud Native Spring in Action” (Manning)
• OSS contributor (Java, Spring, Cloud Native Technologies)
thomasvitale.com
@vitalethomas

Slide 3

Slide 3 text

Generative AI
• LLM
• RAG
• Prompting
• Embeddings
• Vector Stores
• Hallucinations

Slide 4

Slide 4 text

One Buzzword To Rule Them All

Slide 5

Slide 5 text

The WHY Factor

Slide 6

Slide 6 text

The WHY Factor
• What problem does it solve?
• How ready is it for production?
• Yeah, but how about the DevEx?

Slide 7

Slide 7 text

Large Language Models

Slide 8

Slide 8 text

AI != ML

Slide 9

Slide 9 text

ML != LLM

Slide 10

Slide 10 text

Machine Learning
Subset of Artificial Intelligence
• Model Training, Model Fine-Tuning, Model Inference: ML Engineers
• Platform/Infrastructure: Platform Engineers
• HTTP API: Application Developer

Slide 11

Slide 11 text

If you like it, you should put an API on it

Slide 12

Slide 12 text

Model Inference via HTTP APIs
• Spring AI: Application → Model Inference Service over HTTP ("Do you wanna build a snowman?")
• Spring Data: Application → Database Service over JDBC ("DELETE * FROM HYPE;")
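The analogy above says that calling a model is, from the application's point of view, just an HTTP call. A minimal sketch of what sits underneath a client library such as Spring AI, using only the JDK's `java.net.http` types. The `/v1/generate` path and the `model` and `prompt` JSON fields are assumptions made for illustration, not any real provider's API.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Minimal sketch: builds (but does not send) an HTTP request to a model inference service.
// The endpoint path and JSON field names are hypothetical.
final class InferenceRequests {

    private InferenceRequests() {}

    // Escapes the characters that would otherwise break a JSON string literal.
    static String jsonString(String raw) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '"') sb.append("\\\"");
            else if (c == '\\') sb.append("\\\\");
            else if (c == '\n') sb.append("\\n");
            else if (c == '\r') sb.append("\\r");
            else if (c == '\t') sb.append("\\t");
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.append('"').toString();
    }

    // Serializes the (hypothetical) request payload.
    static String body(String model, String prompt) {
        return "{\"model\":" + jsonString(model) + ",\"prompt\":" + jsonString(prompt) + "}";
    }

    // Builds a POST request against the assumed inference endpoint.
    static HttpRequest chatRequest(URI baseUri, String model, String prompt) {
        return HttpRequest.newBuilder(baseUri.resolve("/v1/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(model, prompt)))
                .build();
    }
}
```

Sending the request would be one more call, `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, which is left out here so the sketch runs without a model service.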

Slide 13

Slide 13 text

Classification

Slide 14

Slide 14 text

Classification
Application → Model Service
Classify: HARMONY

Slide 15

Slide 15 text

LLM Security Risks (1)
OWASP Top 10 for LLM
• Prompt Injection
• Model Denial of Service
• Sensitive Information Disclosure
OWASP Top 10 LLM Applications and Generative AI: https://genai.owasp.org/

Slide 16

Slide 16 text

Resilience
LLM Applications
Resilience4J
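Calls to a model service can fail or time out like any other remote call, so they are usually wrapped in a retry policy. The slide names Resilience4J for this; the sketch below is a plain-Java stand-in that only illustrates retry with exponential backoff and is not Resilience4J's API. The class and method names are hypothetical.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Minimal sketch of retry with exponential backoff (a stand-in, not Resilience4J).
final class Retry {

    private Retry() {}

    // Calls the supplier up to maxAttempts times, doubling the wait after each failure.
    // Rethrows the last failure if every attempt fails.
    static <T> T withBackoff(Supplier<T> call, int maxAttempts, Duration initialDelay) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMillis = initialDelay.toMillis();
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleep(delayMillis);
                    delayMillis *= 2;
                }
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and stop retrying.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }
}
```

A production setup would typically add a per-call timeout and a circuit breaker as well, since retrying against an overloaded model service can make the overload worse.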

Slide 17

Slide 17 text

Observability
• Distributed Tracing
• Request Rate
• Errors
• Duration
• Prompt Content
• Token Usage
• Context Window

Slide 18

Slide 18 text

Semantic Search

Slide 19

Slide 19 text

Semantic Search
From Keywords to Meaning
• Keyword search: Application → Database, LIKE '%melancholic%'
• Semantic search: Application → Embedding Model ("Melancholic" → [42…]) → Vector Store ([42…])
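The step from keywords to meaning comes down to comparing vectors instead of strings. A minimal in-memory sketch of that comparison, assuming cosine similarity as the metric (a common choice, not confirmed by the slide). The vectors in the usage note are hand-written toy values standing in for real embeddings, and the class name is hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Minimal sketch of vector similarity search over an in-memory "vector store".
final class SemanticSearch {

    private SemanticSearch() {}

    // Cosine similarity: 1.0 for the same direction, 0.0 for orthogonal vectors.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // A zero vector has no direction to compare.
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the ids of the k stored vectors most similar to the query, best first.
    static List<String> topK(double[] query, Map<String, double[]> store, int k) {
        return store.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> cosine(query, e.getValue())).reversed())
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

For example, with a store of `Map.of("cello", new double[] {0.9, 0.1}, "piccolo", new double[] {0.1, 0.9})` and a query vector of `{1.0, 0.2}`, `topK(query, store, 1)` returns the `cello` entry. A real vector store replaces the linear scan with an index, but the ranking idea is the same.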

Slide 20

Slide 20 text

Data Ingestion
LLM Applications
Document Reader → Document Transformer → Document Writer
JobRunr
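The reader, transformer, writer chain can be sketched with plain functional interfaces. This is a simplified stand-in, not Spring AI's own types: documents are plain strings, the transformer is a hypothetical word-count splitter, and the writer is any consumer (in a real pipeline it would embed the chunks and store them in a vector store).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Minimal sketch of a read -> transform -> write ingestion pipeline.
final class IngestionPipeline {

    private IngestionPipeline() {}

    // Transformer step: splits a document into chunks of at most maxWords words.
    static List<String> splitIntoChunks(String text, int maxWords) {
        if (maxWords < 1) {
            throw new IllegalArgumentException("maxWords must be >= 1");
        }
        List<String> chunks = new ArrayList<>();
        if (text == null || text.isBlank()) {
            return chunks;
        }
        String[] words = text.trim().split("\\s+");
        for (int start = 0; start < words.length; start += maxWords) {
            int end = Math.min(start + maxWords, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
        }
        return chunks;
    }

    // Runs the pipeline and returns the number of chunks handed to the writer.
    static int run(Supplier<List<String>> reader,
                   Function<String, List<String>> transformer,
                   Consumer<List<String>> writer) {
        List<String> chunks = new ArrayList<>();
        for (String document : reader.get()) {
            chunks.addAll(transformer.apply(document));
        }
        writer.accept(chunks);
        return chunks.size();
    }
}
```

Because ingestion is usually slow and repeatable, it fits naturally into a background job, which is presumably why the slide mentions JobRunr alongside the pipeline.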

Slide 21

Slide 21 text

Question Answering

Slide 22

Slide 22 text

Question Answering with Docs
Retrieval Augmented Generation
• Application → Embedding Model: "Melancholic instrument?" → [42…]
• Application → Vector Database: Get Similar Documents
• Application → Model Service: Question + Similar Documents
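The last step of the flow, sending the question together with the similar documents, is mostly prompt assembly. A minimal sketch of that step, assuming the documents have already been retrieved. The instruction wording and the class name are illustrative choices, not the talk's actual prompt.

```java
import java.util.List;

// Minimal sketch of the "Question + Similar Documents" step of retrieval augmented generation.
final class RagPrompt {

    private RagPrompt() {}

    // Builds a prompt that grounds the model's answer in the retrieved documents.
    static String build(String question, List<String> documents) {
        StringBuilder sb = new StringBuilder();
        sb.append("Answer the question using only the context below.\n");
        sb.append("If the context is not sufficient, say that you don't know.\n\n");
        sb.append("Context:\n");
        for (int i = 0; i < documents.size(); i++) {
            // Number each document so the answer can refer back to its source.
            sb.append('[').append(i + 1).append("] ").append(documents.get(i)).append('\n');
        }
        sb.append("\nQuestion: ").append(question).append('\n');
        return sb.toString();
    }
}
```

Telling the model to admit when the context is insufficient is one inexpensive way to reduce hallucinated answers, although it does not eliminate them.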

Slide 23

Slide 23 text

One Does Not Simply Test LLM Applications

Slide 24

Slide 24 text

Structured Data Extraction

Slide 25

Slide 25 text

Structured Data Extraction
From Text to JSON
• Application → Model Service: Text to Structured JSON
• Application → Database: Save Structured JSON

Slide 26

Slide 26 text

LLM Security Risks (2)
OWASP Top 10 for LLM
• Insecure Output Handling
• Excessive Agency
• Insecure Plugin Design
OWASP Top 10 LLM Applications and Generative AI: https://genai.owasp.org/

Slide 27

Slide 27 text

Hallucinations

Slide 28

Slide 28 text

RFBD
Randomly Failing By Design

Slide 29

Slide 29 text

Data Validation
Mitigating hallucination risks
• JSON Schema
• Humans in the Loop
• Optional Values
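These mitigations can be combined: validate what the model extracted, treat fields the text may not contain as optional, and send anything that fails validation to a person. A minimal sketch under those assumptions. The `title` and `year` fields and the accepted year range are hypothetical, and a real application would more likely validate against a JSON Schema than hand-code the rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Minimal sketch of validating model-extracted fields before saving them.
final class ExtractionValidator {

    private ExtractionValidator() {}

    // Returns the list of problems found; an empty list means the data is accepted.
    static List<String> validate(Map<String, String> fields) {
        List<String> problems = new ArrayList<>();

        // Required field: reject missing or blank values.
        String title = fields.get("title");
        if (title == null || title.isBlank()) {
            problems.add("title is required");
        }

        // Optional field: absence is fine, but a present value must be plausible.
        String rawYear = fields.get("year");
        if (rawYear != null && !rawYear.isBlank()) {
            try {
                int year = Integer.parseInt(rawYear.trim());
                if (year < 1000 || year > 2100) {
                    problems.add("year out of range: " + year);
                }
            } catch (NumberFormatException e) {
                problems.add("year is not a number: " + rawYear);
            }
        }
        return problems;
    }

    // Anything with problems goes to a human reviewer instead of the database.
    static boolean needsHumanReview(Map<String, String> fields) {
        return !validate(fields).isEmpty();
    }
}
```

Making a field optional matters because a model asked for a mandatory value it cannot find in the text tends to invent one.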

Slide 30

Slide 30 text

Speech Transcription

Slide 31

Slide 31 text

Speech Transcription
From Speech to Text
• Application → Audio Model: Audio to Text
• Application → Chat Model: Text to Structured JSON

Slide 32

Slide 32 text

Privacy

Slide 33

Slide 33 text

Build & Deploy
Going to Production
• Cloud Native Buildpacks
• Kubernetes Service Binding
• Native Executables with GraalVM

Slide 34

Slide 34 text

Service Bindings for Spring AI

Slide 35

Slide 35 text

Composer Assistant
https://github.com/ThomasVitale/concerto-for-java-and-ai

Slide 36

Slide 36 text

Concerto for Java and AI
Building Production-Ready LLM Applications
Thomas Vitale
thomasvitale.com
@vitalethomas