Slide 1

Slide 1 text

Bringing your first LLM into production: beware of your boss’s nephew
Christian Hidber, bSquare
Oliver Zeigermann, TK
LLM Soirée, Azure Zurich User Group, November 2024

Slide 2

Slide 2 text

Boss: my nephew does GPT too…

Slide 3

Slide 3 text

LLM Intro

Slide 4

Slide 4 text

How does a Decoder Model work?
Q: what is pluvia?
«the model»
• Trained on huge datasets
• Does not change
• Same for all users
«the context»
• Depends on the user’s goal
• Unique for each chat & user
• Contains the chat history

Slide 5

Slide 5 text

How does a Decoder Model work?
Q: what is pluvia?
A: Pluvia
«the token»
• Single «word»
• Depends on context and model

Slide 6

Slide 6 text

How does a Decoder Model work?
Q: what is pluvia?
A: Pluvia is

Slide 7

Slide 7 text

Q: what is pluvia?
A: Pluvia is a Latin …

Slide 8

Slide 8 text

How does a Decoder Model work?
Q: what is pluvia?
A: Pluvia is a Latin word meaning rainfall. EOT
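
The build-up on the preceding slides can be read as a loop: a frozen model looks at the context, emits one token, the token is appended to the context, and generation stops at EOT. The following is a minimal sketch of that loop, assuming a toy lookup table in place of a real network. `TOY_MODEL`, `next_token` and `generate` are hypothetical names used only for illustration.

```python
# Toy stand-in for "the model": fixed after training, identical for all users.
# A real decoder predicts a probability distribution over tokens; this lookup
# table (an assumption for illustration) just maps the last token to the next.
TOY_MODEL = {
    "A:": "Pluvia",
    "Pluvia": "is",
    "is": "a",
    "a": "Latin",
    "Latin": "word",
    "word": "meaning",
    "meaning": "rainfall.",
    "rainfall.": "EOT",
}


def next_token(context: list[str]) -> str:
    """Return one token that depends on the model and the current context."""
    return TOY_MODEL.get(context[-1], "EOT")


def generate(question: str, max_tokens: int = 20) -> str:
    # "The context": unique per chat, holds the history, grows token by token.
    context = ["Q:", *question.split(), "A:"]
    answer = []
    for _ in range(max_tokens):
        token = next_token(context)
        if token == "EOT":  # the end-of-text token stops the generation
            break
        context.append(token)
        answer.append(token)
    return " ".join(answer)


print(generate("what is pluvia ?"))
```

The model never changes inside the loop; only the context does, which is why the same model can serve every user and every chat.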

Slide 9

Slide 9 text

Naïve Approach
User: Asking a Question
Hypothetical GeberitBot: Generating an Answer
You are an expert in …..
Q: What is Pluvia?
A: Pluvia is a Latin word meaning rainfall.
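
In code, the naïve approach amounts to gluing a fixed instruction to the user's question and sending that to the model. A minimal sketch, assuming a hypothetical helper `build_naive_prompt`; the instruction text is elided on the slide, so it is passed in as a parameter here instead of being guessed.

```python
def build_naive_prompt(system_text: str, question: str) -> str:
    """Concatenate a fixed instruction and the user question; no retrieval."""
    return f"{system_text}\nQ: {question}\nA:"


# The instruction is kept exactly as elided on the slide.
prompt = build_naive_prompt("You are an expert in …..", "What is Pluvia ?")
print(prompt)
```

With this prompt the answer can only draw on what the model learned in training plus the few lines above.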

Slide 10

Slide 10 text

RAG System Architecture
Components:
• Doc Loader: Image2Text
• Anonymizer: Enforcing Privacy
• Vector DB: Searching facts matching the question
• LLM: Generating an Answer
• User: Asking a Question
Data passed between components: chunks, question, answer
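
The components on this slide can be wired together roughly as follows. This is a sketch under stated assumptions, not the speakers' implementation: the doc loader is skipped (chunks are given as plain strings), the anonymizer only masks e-mail addresses, and the "embedding" is a bag of words instead of a real embedding model. All names (`anonymize`, `embed`, `ToyVectorDB`, `build_prompt`) and the sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def anonymize(text: str) -> str:
    """Mask e-mail addresses; a stand-in for a real privacy filter."""
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "[EMAIL]", text)


def embed(text: str) -> Counter:
    """Toy 'embedding': bag of lowercase words (a real system uses a model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorDB:
    def __init__(self) -> None:
        self.entries: list[tuple[Counter, str]] = []

    def add(self, chunk: str) -> None:
        self.entries.append((embed(chunk), chunk))

    def search(self, question: str, k: int = 2) -> list[str]:
        """Return the k chunks most similar to the question."""
        q = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


def build_prompt(question: str, chunks: list[str]) -> str:
    facts = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only these facts:\n{facts}\nQ: {question}\nA:"


# Ingestion: chunks -> anonymizer -> vector DB (illustrative sample chunks).
db = ToyVectorDB()
for chunk in [
    "Pluvia is a roof drainage system.",
    "Contact: jane.doe@example.com for support.",
    "Concrete surfaces have a runoff coefficient of 1.0.",
]:
    db.add(anonymize(chunk))

# Query time: question -> vector DB -> chunks -> prompt for the LLM.
question = "What is Pluvia ?"
prompt = build_prompt(question, db.search(question, k=1))
print(prompt)
```

Anonymizing before indexing means personal data never reaches the vector DB or the prompt.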

Slide 11

Slide 11 text

Demo: RAG Applications

Slide 12

Slide 12 text

Choosing an application

Slide 13

Slide 13 text

Low Risk, but nice benefit
Low Risk
● What is the worst thing that could happen, and how can it be mitigated?
● Low profile
● Failures should be ok
● Human in the loop
Nice Benefit
● Impossible for humans to do, or
● Something humans don’t like to do
● Let the whole organization learn
● Management likes it, but is afraid
● Can it be used for (internal) marketing?

Slide 14

Slide 14 text

From Prompt Hacking to Production

Slide 15

Slide 15 text

Writing a PoC vs. Engineering Task
Ad-hoc prompting is very different from writing a prompt for a service.
With ad-hoc prompting
• you can immediately see if it works
• there’s a high level of human oversight
• it only needs to work for a specific example
With prompting for a system
• it needs to generalize to all expected use cases
• there is little or no human supervision
• stability is expected

Slide 16

Slide 16 text

Evaluation

Slide 17

Slide 17 text

Evaluation on text results
User: Asking a Question
LLM: Generating an Answer
Question
• What is Pluvia?
Answer
• Pluvia is a Latin word meaning rainfall.
• The Latin word for rainfall.
• ….
=> equality is not an option
Human Eval

Slide 18

Slide 18 text

Evaluation on text results
User: Asking a Question
LLM: Generating an Answer
Evaluation Criteria:
• Correct
• Complete
• Concise
• Relevant
• Contradiction free
• Language
• Style
• …
• Generation successful
Statistics
Human Eval

Slide 19

Slide 19 text

Evaluation on text results
LLM as a Judge
User: Asking a Question
LLM: Generating an Answer
Evaluation Criteria:
• Correct
• Complete
• Concise
• Relevant
• Contradiction free
• Language
• Style
• …
• Generation successful
Statistics
Human Eval
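
An LLM-as-a-judge setup along these lines could look like the sketch below: a second model is asked to rate each answer per criterion, and the verdicts are aggregated into statistics. This is an assumption-laden illustration, not the evaluation notebook from the talk. The prompt wording, the JSON reply format, the subset of criteria, and the names `judge_prompt`, `judge`, `pass_rates` and `fake_llm` are all hypothetical; the judge model is passed in as a plain callable so that no specific API is implied.

```python
import json
from collections import Counter
from typing import Callable

CRITERIA = ["correct", "complete", "concise", "relevant"]  # subset of the slide's list


def judge_prompt(question: str, answer: str) -> str:
    keys = ", ".join(CRITERIA)
    return (
        "Rate the answer to the question. Reply with a JSON object whose keys "
        f"are {keys} and whose values are true or false.\n"
        f"Question: {question}\nAnswer: {answer}"
    )


def judge(question: str, answer: str, llm: Callable[[str], str]) -> dict[str, bool]:
    """Ask the judge model and parse its verdict; unparsable output fails all."""
    try:
        verdict = json.loads(llm(judge_prompt(question, answer)))
    except json.JSONDecodeError:
        verdict = None
    if not isinstance(verdict, dict):
        return {c: False for c in CRITERIA}
    return {c: verdict.get(c) is True for c in CRITERIA}


def pass_rates(verdicts: list[dict[str, bool]]) -> dict[str, float]:
    """Statistics: share of answers that pass each criterion."""
    if not verdicts:
        return {}
    passed = Counter(c for v in verdicts for c in CRITERIA if v[c])
    return {c: passed[c] / len(verdicts) for c in CRITERIA}


def fake_llm(prompt: str) -> str:
    """Fake judge for the demo: only answers of at most six words are 'concise'."""
    answer = prompt.rsplit("Answer: ", 1)[1]
    return json.dumps({
        "correct": True,
        "complete": True,
        "concise": len(answer.split()) <= 6,
        "relevant": True,
    })


answers = ["Pluvia is a Latin word meaning rainfall.", "The Latin word for rainfall."]
verdicts = [judge("What is Pluvia ?", a, fake_llm) for a in answers]
print(pass_rates(verdicts))
```

Because the judge is itself a model, its verdicts are worth spot-checking against human evaluation before the statistics are trusted.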

Slide 20

Slide 20 text

Demo: Evaluation Notebook

Slide 21

Slide 21 text

RAG System Architecture: Evaluation
Components:
• Doc Loader: Image2Text
• Anonymizer: Enforcing Privacy
• Vector DB: Searching facts matching the question
• LLM: Generating an Answer
• User: Asking a Question
Data passed between components: chunks, question, answer
SystemPrompt, Question, Chunks
Evaluation metrics:
• Contextual Relevance
• Faithfulness
• Answer Relevance
• Conciseness

Slide 22

Slide 22 text

Online Eval: Example

Slide 23

Slide 23 text

Online Eval: Example

Slide 24

Slide 24 text

PDF Tables

Slide 25

Slide 25 text

PDF & Tables

Slide 26

Slide 26 text

PDF Tables: simple chunking
Art der Flächen Spitzen-\nabfluss-\nbeiwert \nCSMittlerer \nAbfluss-\nbeiwert \nCM\nWasser- \nundurch-lässige Flächen, z. B. Flachdach (≤ 3°) 1,0 0,9\nBetonflächen 1,0 0,9\nRampen 1,0 1,0\nBefestigte Flächen mit Fugendichtung1,0 0,8\nSchwarzdecken (Asphalt) 1,0 0,9\nPflaster mit Fugenver- guss1,0 0,8\nKiesschüttdächer 0,8 0,8\nBegrünte Dach- flächenFür Intensivbegrünun-gen ab 30 cm Aufbau- dicke (≤ 5°)0,2 0,1\nFür Extensivbegrünun-gen ab 10 cm Aufbau-dicke (≤ 5°)0,4 0,2\nFür Extensivbegrünun- gen unter 10 cm Aufbau-dicke (≤ 5°)0,5 0,3\nFür Extensivbegrünung (> 5°)0,7 0,4QRArCS\uf0d7\uf0d7
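
One way to see why such a chunk is hard to use: a simple splitter knows nothing about rows or columns, so values can land in a chunk that no longer contains the header they belong to. The sketch below assumes a fixed-size character splitter (`chunk_fixed`, a hypothetical helper) and a shortened, cleaned-up sample in the spirit of the table above; the splitter actually used for the slide is not stated.

```python
def chunk_fixed(text: str, size: int) -> list[str]:
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


# Shortened sample: header line followed by three rows of the table.
table_text = (
    "Art der Flächen Spitzenabflussbeiwert Mittlerer Abflussbeiwert\n"
    "Betonflächen 1,0 0,9\n"
    "Rampen 1,0 1,0\n"
    "Kiesschüttdächer 0,8 0,8\n"
)

chunks = chunk_fixed(table_text, size=70)
for chunk in chunks:
    print(repr(chunk))
```

The second chunk contains numbers for ramps and gravel roofs, but nothing that says which number is which coefficient.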

Slide 27

Slide 27 text

PDF Tables: multi-vector
Table Caption: Tabelle 85: Dachaufbauten und Abflussbeiwerte….
Table Summary: Die Tabelle beschreibt…..
Table CSV:
,Art der Flächen,Spitzenabflussbeiwert,….
Wasser-undurchlässige Flächen, Flachdach (<=3),1.0,0.9
….
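
The multi-vector idea can be sketched as follows: each table is stored once, but several searchable texts (caption and summary) point to it, and a hit on either one returns the whole table as CSV. This is a toy illustration under stated assumptions, not the setup shown in the talk. The word-overlap similarity stands in for real embeddings, and `TableDoc`, `MultiVectorIndex` and the sample tables are hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class TableDoc:
    caption: str
    summary: str
    csv: str


def bag(text: str) -> Counter:
    """Toy stand-in for an embedding: bag of lowercase words."""
    return Counter(re.findall(r"\w+", text.lower()))


def overlap(a: Counter, b: Counter) -> int:
    """Number of shared word occurrences (toy similarity)."""
    return sum((a & b).values())


class MultiVectorIndex:
    def __init__(self) -> None:
        self.tables: list[TableDoc] = []
        self.vectors: list[tuple[Counter, int]] = []  # (vector, table id)

    def add(self, table: TableDoc) -> None:
        table_id = len(self.tables)
        self.tables.append(table)
        # Several vectors per table: caption and summary are the searchable text.
        for text in (table.caption, table.summary):
            self.vectors.append((bag(text), table_id))

    def search(self, question: str) -> TableDoc:
        """Return the table whose caption or summary best matches the question."""
        q = bag(question)
        _, best_id = max(self.vectors, key=lambda v: overlap(q, v[0]))
        return self.tables[best_id]


index = MultiVectorIndex()
index.add(TableDoc(  # illustrative sample data, not the table from the talk
    caption="Table 85: Roof structures and runoff coefficients",
    summary="The table lists peak and mean runoff coefficients per surface type.",
    csv="surface,peak,mean\nflat roof,1.0,0.9\nramps,1.0,1.0",
))
index.add(TableDoc(
    caption="Table 12: Pipe diameters",
    summary="The table lists available pipe diameters in millimetres.",
    csv="pipe,diameter_mm\nA,56\nB,75",
))

hit = index.search("Which runoff coefficients apply to a flat roof?")
print(hit.csv)
```

The question is matched against readable text, while the LLM receives the structured CSV, so the table's rows and columns stay intact.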

Slide 28

Slide 28 text

Demo: Azure Document Intelligence...

Slide 29

Slide 29 text

PDF Tables: multi-vector
Vector DB: Searching facts matching the question

Slide 30

Slide 30 text

Reranking

Slide 31

Slide 31 text

Reranking
Vector DB: Searching facts matching the question
… Rank the following documents according to their relevance for the given question.
Q: What is Pluvia?
Order from the vector DB: 1 2 3 4 5 6
Order after reranking: 6 1 2 5 3 4
LLM: Generating an Answer
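
A reranking step of this kind might be sketched as below: the retrieved documents are numbered, an LLM is asked to order them by relevance, and its reply is parsed back into a document list. The ranking instruction follows the slide; the added sentence about the reply format, the parsing rules, and the names `rerank_prompt`, `rerank` and `fake_llm` are assumptions for illustration. The model is passed in as a plain callable so that no particular API is implied.

```python
import re
from typing import Callable


def rerank_prompt(question: str, docs: list[str]) -> str:
    numbered = "\n".join(f"{i}: {d}" for i, d in enumerate(docs, start=1))
    return (
        "Rank the following documents according to their relevance for the "
        "given question. Reply with the document numbers, most relevant first.\n"
        f"Q: {question}\n{numbered}"
    )


def rerank(question: str, docs: list[str], llm: Callable[[str], str]) -> list[str]:
    """Reorder docs according to the numbers found in the model's reply."""
    reply = llm(rerank_prompt(question, docs))
    order: list[int] = []
    for n in map(int, re.findall(r"\d+", reply)):
        if 1 <= n <= len(docs) and n not in order:  # ignore junk and repeats
            order.append(n)
    # Documents the model left out keep their original order at the end.
    order += [n for n in range(1, len(docs) + 1) if n not in order]
    return [docs[n - 1] for n in order]


def fake_llm(prompt: str) -> str:
    """Fake model for the demo: always returns the order shown on the slide."""
    return "6 1 2 5 3 4"


docs = [f"doc {i}" for i in range(1, 7)]
print(rerank("What is Pluvia ?", docs, fake_llm))
```

Parsing defensively matters here: a model may repeat a number, skip one, or invent one, and the pipeline should still hand a complete list to the answer-generating LLM.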

Slide 32

Slide 32 text

Wrap Up

Slide 33

Slide 33 text

Wrap Up
• Scoping
• Prompting
• Evaluation
• Table Parsing
• Reranking

Slide 34

Slide 34 text

Nonetheless, beware of your boss’s nephew…

Slide 35

Slide 35 text

Thank you