Slide 1

Slide 1 text

Concerto for Java and AI: Building Production-Ready LLM Applications
Thomas Vitale (@vitalethomas)
GOTO Copenhagen, Oct 2nd, 2024

Slide 2

Slide 2 text

Thomas Vitale
• Systematic
• Software Engineer
• CNCF Ambassador, Oracle ACE Pro, Testcontainers Community Champion
• Author of “Cloud Native Spring in Action” (Manning)
• OSS contributor (Java, Spring, Cloud Native Technologies)
thomasvitale.com
@vitalethomas

Slide 3

Slide 3 text

LLM • RAG • Prompting • Embeddings • Vector Stores • Hallucinations • Agents • Generative AI
Ph. Francisco Emilio Diaz

Slide 4

Slide 4 text

One Buzzword To Rule Them All

Slide 5

Slide 5 text

The WHY Factor

Slide 6

Slide 6 text

The WHY Factor
• What problem does it solve?
• How ready is it for production?
• Yeah, but how about the DevEx?

Slide 7

Slide 7 text

Large Language Models

Slide 8

Slide 8 text

Machine Learning: Subset of Artificial Intelligence
• Model Training
• Model Fine-Tuning
• Model Inference
• ML Engineers
• Platform/Infrastructure
• Platform Engineers
• HTTP API
• Application Developer

Slide 9

Slide 9 text

If you like it, you should put an API on it

Slide 10

Slide 10 text

Model Inference via HTTP APIs
• Application -> Model Inference Service (HTTP): Do you wanna build a snowman?
• Application -> Database Service (JDBC): DELETE * FROM HYPE;
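A minimal sketch of the point on this slide: from the application's side, calling a model inference service is an ordinary HTTP request, much like talking to a database over JDBC. The endpoint URL, the `prompt` JSON field, and the class name are assumptions made for illustration; real inference services define their own request formats. The request is only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical example: builds (but does not send) an HTTP request to a model inference service.
class InferenceRequestSketch {

    // Assumed endpoint; replace with the URL of an actual inference service.
    static final String ENDPOINT = "http://localhost:8080/v1/generate";

    // Escapes only backslashes and double quotes; a real application would use a JSON library.
    static String escapeJson(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // The "prompt" field name is an assumption, not a standard.
    static String toJsonBody(String prompt) {
        return "{\"prompt\": \"" + escapeJson(prompt) + "\"}";
    }

    static HttpRequest buildRequest(String prompt) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJsonBody(prompt)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("Do you wanna build a snowman?");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(toJsonBody("Do you wanna build a snowman?"));
    }
}
```

Sending the request would be a call to `java.net.http.HttpClient#send`; it is left out so the sketch runs without a model service.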

Slide 11

Slide 11 text

Text Classification

Slide 12

Slide 12 text

Text Classification
• Application -> Model Service: Classify -> HARMONY
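One way the classification step could look in application code, as a hedged sketch: the prompt restricts the model to a fixed label set, and the free-text answer is mapped to an enum with a safe fallback. Only HARMONY appears on the slide; the other labels, the prompt wording, and all names are hypothetical, and the model call itself is left out.

```java
import java.util.Locale;

// Hypothetical example: prompt construction and defensive parsing for text classification.
class TextClassificationSketch {

    // HARMONY comes from the slide; MELODY and RHYTHM are made-up labels for illustration.
    enum Category { HARMONY, MELODY, RHYTHM, UNKNOWN }

    // Builds a prompt that asks the model to answer with exactly one known label.
    static String buildPrompt(String text) {
        return "Classify the text into exactly one of: HARMONY, MELODY, RHYTHM.\n"
                + "Answer with the label only.\n"
                + "Text: " + text;
    }

    // Maps the raw model answer to a Category; anything unexpected becomes UNKNOWN.
    static Category parse(String modelResponse) {
        if (modelResponse == null) {
            return Category.UNKNOWN;
        }
        String normalized = modelResponse.trim().toUpperCase(Locale.ROOT);
        for (Category category : Category.values()) {
            if (category.name().equals(normalized)) {
                return category;
            }
        }
        return Category.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(buildPrompt("The chords resolve from dominant to tonic."));
        System.out.println(parse(" harmony\n"));
    }
}
```

Treating unrecognized answers as UNKNOWN keeps unexpected model output from propagating into the rest of the application.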

Slide 13

Slide 13 text

LLM Security Risks (1): OWASP Top 10 for LLM
• Prompt Injection
• Model Denial of Service
• Sensitive Information Disclosure
OWASP Top 10 LLM Applications and Generative AI: https://genai.owasp.org/

Slide 14

Slide 14 text

Resilience: LLM Applications
• Resilience4j
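To illustrate the kind of resilience pattern this slide points at, here is a plain-Java retry with exponential backoff. It is a stand-in written for this example and does not use the Resilience4j API; the method name, parameters, and the flaky call in `main` are all hypothetical.

```java
import java.util.function.Supplier;

// Hypothetical example: hand-rolled retry with exponential backoff (not the Resilience4j API).
class RetrySketch {

    // Calls the supplier up to maxAttempts times, doubling the delay after each failure.
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                lastFailure = e;
                if (attempt < maxAttempts) {
                    sleep(delay);
                    delay *= 2;
                }
            }
        }
        // All attempts failed: rethrow the most recent failure.
        throw lastFailure;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated model call that fails twice before succeeding.
        String answer = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("model service unavailable");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(answer + " after " + calls[0] + " calls");
    }
}
```

A library such as Resilience4j adds circuit breakers, rate limiters, and time limiters on top of this basic idea, which matters when each model call is slow and billed per token.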

Slide 15

Slide 15 text

Observability
• Distributed Tracing
• Request Rate
• Errors
• Duration
• Prompt Content
• Token Usage
• Context Window

Slide 16

Slide 16 text

Semantic Search

Slide 17

Slide 17 text

Semantic Search: From Keywords to Meaning
• Keywords: Application -> Database (LIKE '%melancholic%')
• Meaning: Application -> Embedding Model (Melancholic -> [42…]) -> Vector Store ([42…])
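The difference between keyword matching and semantic search can be sketched with cosine similarity over embedding vectors. The three-dimensional vectors below are invented for illustration (a real embedding model returns much longer vectors), and the in-memory map is a hypothetical stand-in for a vector store.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example: nearest-neighbour lookup by cosine similarity over toy embeddings.
class SemanticSearchSketch {

    // Cosine similarity of two vectors of equal length; assumes neither is the zero vector.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the key of the stored vector most similar to the query, or null if the store is empty.
    static String mostSimilar(double[] query, Map<String, double[]> store) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : store.entrySet()) {
            double score = cosineSimilarity(query, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up embeddings; neither text contains the word "melancholic".
        Map<String, double[]> store = new LinkedHashMap<>();
        store.put("sad cello piece", new double[] {0.9, 0.1, 0.0});
        store.put("upbeat trumpet fanfare", new double[] {0.1, 0.9, 0.2});

        double[] melancholic = {0.8, 0.2, 0.1}; // pretend embedding of the query "melancholic"
        System.out.println(mostSimilar(melancholic, store));
    }
}
```

A `LIKE '%melancholic%'` query would match neither entry, while the vector comparison can still rank the closer meaning first.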

Slide 18

Slide 18 text

Data Ingestion: LLM Applications
• Document Reader
• Document Transformer
• Document Writer
• JobRunr
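The reader, transformer, and writer stages named on this slide can be sketched as a small pipeline. This version uses plain `java.util.function` types as stand-ins, not the interfaces of any particular framework; the word-based chunking, the method names, and the in-memory list acting as the store are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical example: read -> transform (chunk) -> write, using plain functional interfaces.
class IngestionPipelineSketch {

    // Splits text on whitespace into chunks of at most maxWords words.
    static List<String> splitIntoChunks(String text, int maxWords) {
        if (maxWords < 1) {
            throw new IllegalArgumentException("maxWords must be at least 1");
        }
        List<String> chunks = new ArrayList<>();
        if (text.isBlank()) {
            return chunks;
        }
        String[] words = text.trim().split("\\s+");
        for (int start = 0; start < words.length; start += maxWords) {
            int end = Math.min(start + maxWords, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
        }
        return chunks;
    }

    // Transformer stage: chunks every document and flattens the result into one list.
    static List<String> chunkAll(List<String> documents, int maxWords) {
        List<String> chunks = new ArrayList<>();
        for (String document : documents) {
            chunks.addAll(splitIntoChunks(document, maxWords));
        }
        return chunks;
    }

    // Wires the three stages together.
    static void run(Supplier<List<String>> reader,
                    Function<List<String>, List<String>> transformer,
                    Consumer<List<String>> writer) {
        writer.accept(transformer.apply(reader.get()));
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>(); // stand-in for a vector store
        run(() -> List.of("The cello is often described as a melancholic instrument"),
            documents -> chunkAll(documents, 4),
            store::addAll);
        store.forEach(System.out::println);
    }
}
```

In a real ingestion job the writer would also compute an embedding for each chunk before storing it, and the whole run could be scheduled as a background job.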

Slide 19

Slide 19 text

Question Answering

Slide 20

Slide 20 text

Question Answering with Docs: Retrieval Augmented Generation
• Application -> Embedding Model: Melancholic instrument? -> [42…]
• Application -> Vector Database: Get Similar Documents
• Application -> Model Service: Question + Similar Documents
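The last step of the flow on this slide, sending the question together with the similar documents, amounts to prompt assembly. The sketch below assumes retrieval has already happened and takes the documents as a plain list; the prompt wording and all names are hypothetical.

```java
import java.util.List;

// Hypothetical example: builds the augmented prompt for Retrieval Augmented Generation.
class RagPromptSketch {

    // Combines retrieved documents and the user question into a single prompt.
    static String buildPrompt(String question, List<String> similarDocuments) {
        StringBuilder prompt = new StringBuilder();
        prompt.append("Answer the question using only the context below.\n");
        prompt.append("If the context is not sufficient, say that you do not know.\n\n");
        prompt.append("Context:\n");
        for (int i = 0; i < similarDocuments.size(); i++) {
            // Number each document so an answer could refer back to its source.
            prompt.append("[").append(i + 1).append("] ").append(similarDocuments.get(i)).append("\n");
        }
        prompt.append("\nQuestion: ").append(question);
        return prompt.toString();
    }

    public static void main(String[] args) {
        // In a real application these would come from a vector store similarity search.
        List<String> retrieved = List.of(
                "The cello has a warm, dark tone.",
                "The oboe is frequently used for plaintive solos.");
        System.out.println(buildPrompt("Melancholic instrument?", retrieved));
    }
}
```

Telling the model to admit when the context is insufficient is one common way to reduce made-up answers, though it does not eliminate them.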

Slide 21

Slide 21 text

One Does Not Simply Test LLM Applications

Slide 22

Slide 22 text

Structured Data Extraction

Slide 23

Slide 23 text

Structured Data Extraction: From Text to JSON
• Application (Text) -> Model Service: Text to Structured JSON
• Application -> Database: Save Structured JSON

Slide 24

Slide 24 text

LLM Security Risks (2): OWASP Top 10 for LLM
• Insecure Output Handling
• Excessive Agency
• Insecure Plugin Design
OWASP Top 10 LLM Applications and Generative AI: https://genai.owasp.org/

Slide 25

Slide 25 text

Hallucinations

Slide 26

Slide 26 text

Mitigating hallucination risks
• Data Validation
• JSON Schema
• Humans in the Loop
• Optional Values
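Two of the mitigations listed here, data validation and optional values, can be sketched with a Java record (Java 16 or later). The record, its fields, and the plausibility range are hypothetical; the idea is that extracted data is checked before it is trusted, and that the model is allowed to leave a field empty instead of inventing a value.

```java
import java.util.Optional;

// Hypothetical example: validating structured data extracted by a model.
class ExtractionValidationSketch {

    // Made-up target type; "key" is optional so a missing value is acceptable.
    record CompositionNote(String instrument, int tempoBpm, Optional<String> key) {
        CompositionNote {
            if (instrument == null || instrument.isBlank()) {
                throw new IllegalArgumentException("instrument is required");
            }
            // Assumed plausibility range, chosen only for this example.
            if (tempoBpm < 20 || tempoBpm > 300) {
                throw new IllegalArgumentException("tempoBpm out of plausible range: " + tempoBpm);
            }
            if (key == null) {
                key = Optional.empty();
            }
        }
    }

    public static void main(String[] args) {
        CompositionNote note = new CompositionNote("cello", 60, Optional.empty());
        System.out.println(note);

        try {
            new CompositionNote("cello", 9000, Optional.of("D minor"));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Values that fail validation are natural candidates for a retry or for review by a human in the loop.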

Slide 27

Slide 27 text

Speech Transcription

Slide 28

Slide 28 text

Speech Transcription: From Speech to Text
• Application (Audio) -> Audio Model: Audio to Text
• Application -> Chat Model: Text to Structured JSON

Slide 29

Slide 29 text

Privacy

Slide 30

Slide 30 text

Going to Production
• Build & Deploy
• Cloud Native Buildpacks
• Kubernetes Service Binding
• Native Executables with GraalVM

Slide 31

Slide 31 text

Service Bindings for Spring AI

Slide 32

Slide 32 text

Composer Assistant
https://github.com/ThomasVitale/concerto-for-java-and-ai
Ph. Francisco Emilio Diaz

Slide 33

Slide 33 text

Thomas Vitale @vitalethomas thomasvitale.com Concerto for Java and AI Building Production-Ready LLM Applications