Large Language Models: From Prototype to Production (EuroPython keynote)

Slide 1

Slide 1 text

LARGE LANGUAGE MODELS: FROM PROTOTYPE TO PRODUCTION. Ines Montani, Explosion. Keywords: CHATGPT, ARTIFICIAL INTELLIGENCE, MACHINE LEARNING, LLAMA, NATURAL LANGUAGE PROCESSING, OPEN SOURCE, PYTHON, PROMPT ENGINEERING, ZERO-SHOT LEARNING, GPT-4, EVALUATION, COPILOT, GENERATIVE AI

Slide 2

Slide 2 text

SPACY: Open-source library for industrial-strength Natural Language Processing. 150m+ downloads. SPACY.IO, @SPACY_IO, SPACY.TV, GITHUB.COM/EXPLOSION/SPACY

Slide 3

Slide 3 text

SPACY: Open-source library for industrial-strength Natural Language Processing. 150m+ downloads. ChatGPT can write spaCy code! SPACY.IO, @SPACY_IO, SPACY.TV, GITHUB.COM/EXPLOSION/SPACY

Slide 4

Slide 4 text

PRODIGY: Modern scriptable annotation tool for machine learning developers. 8k+ users, 700+ companies. PRODIGY.AI, GITHUB.COM/EXPLOSION/PRODIGY-RECIPES

Slide 5

Slide 5 text

PRODIGY: Modern scriptable annotation tool for machine learning developers. 8k+ users, 700+ companies. PRODIGY.AI, GITHUB.COM/EXPLOSION/PRODIGY-RECIPES

Slide 6

Slide 6 text

UNDERSTANDING NLP TASKS. Generative: single/multi-doc summarization, problem solving, paraphrasing, reasoning, style transfer, question answering. Predictive: text classification, entity recognition, relation extraction, grammar & morphology, semantic parsing, coreference resolution, discourse structure.

Slide 7

Slide 7 text

Slide 8

Slide 8 text

THE HISTORY OF FUTURE TECHNOLOGY

Slide 9

Slide 9 text

THE HISTORY OF FUTURE TECHNOLOGY: How people in 1900 imagined the year 2000

Slide 10

Slide 10 text

THE HISTORY OF FUTURE TECHNOLOGY: How people in 1900 imagined the year 2000

Slide 11

Slide 11 text

THE HISTORY OF FUTURE TECHNOLOGY

Slide 12

Slide 12 text

THE HISTORY OF FUTURE TECHNOLOGY: manual calculation vs. calculator

Slide 13

Slide 13 text

THE HISTORY OF FUTURE TECHNOLOGY: manual calculation vs. calculator

Slide 14

Slide 14 text

THE HISTORY OF FUTURE TECHNOLOGY: manual calculation vs. calculator; “knocker-uppers” vs. alarm clock

Slide 15

Slide 15 text

THE HISTORY OF FUTURE TECHNOLOGY

Slide 16

Slide 16 text

THE HISTORY OF FUTURE TECHNOLOGY: human assistant vs. calendar apps (Calendly, Fantastical)

Slide 17

Slide 17 text

THE HISTORY OF FUTURE TECHNOLOGY: human assistant vs. calendar apps (Calendly, Fantastical). WHAT’S NEXT?

Slide 18

Slide 18 text

NLP IN THE AGE OF LLMS

Slide 19

Slide 19 text

NLP IN THE AGE OF LLMS: SQL is all you need; dialogue is all you need

Slide 20

Slide 20 text

NLP IN THE AGE OF LLMS: SQL is all you need; dialogue is all you need; lots of humans is all you need; prompting is all you need

Slide 21

Slide 21 text

CLASSIC NLP PROBLEM: EXTRACT STRUCTURED DATA. Text: “Hooli raises $5m to revolutionize search, led by ACME Ventures”. Labels: COMPANY, COMPANY, MONEY, INVESTOR. Database: 5923214, 1681056.

Slide 22

Slide 22 text

CLASSIC NLP PROBLEM: EXTRACT STRUCTURED DATA. Text: “Hooli raises $5m to revolutionize search, led by ACME Ventures”. Labels: COMPANY, COMPANY, MONEY, INVESTOR. Database: 5923214, 1681056. Technique: named entity recognition.

Slide 23

Slide 23 text

CLASSIC NLP PROBLEM: EXTRACT STRUCTURED DATA. Text: “Hooli raises $5m to revolutionize search, led by ACME Ventures”. Labels: COMPANY, COMPANY, MONEY, INVESTOR. Database: 5923214, 1681056. Techniques: named entity recognition, entity disambiguation.
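
The two steps named on this slide can be sketched in a few lines of plain Python. This is only an illustration, not code from the talk: the hard-coded mention list stands in for a real named entity recognition model, the `KNOWLEDGE_BASE` dictionary is hypothetical, and the pairing of each database ID with a company is an assumption (the slide shows the two IDs but the transcript does not say which belongs to which). The rule that the first company is the funded one and the second is the investor is also a deliberate simplification.

```python
from dataclasses import dataclass
from typing import List, Optional

TEXT = "Hooli raises $5m to revolutionize search, led by ACME Ventures"

# Hypothetical knowledge base. Which ID belongs to which company is an assumption.
KNOWLEDGE_BASE = {"Hooli": 5923214, "ACME Ventures": 1681056}


@dataclass
class Entity:
    text: str
    label: str
    start: int  # character offset where the mention starts
    end: int  # character offset just past the mention
    kb_id: Optional[int] = None  # filled in by entity disambiguation


def recognize(text: str) -> List[Entity]:
    """Stand-in for an NER model: finds a fixed list of mentions in the text."""
    mentions = [("Hooli", "COMPANY"), ("$5m", "MONEY"), ("ACME Ventures", "COMPANY")]
    entities = []
    for mention, label in mentions:
        start = text.find(mention)
        if start != -1:
            entities.append(Entity(mention, label, start, start + len(mention)))
    return entities


def disambiguate(entities: List[Entity]) -> List[Entity]:
    """Entity disambiguation: map each mention to a database ID, if one is known."""
    for ent in entities:
        ent.kb_id = KNOWLEDGE_BASE.get(ent.text)
    return entities


def to_record(entities: List[Entity]) -> dict:
    """Naive assumption: first COMPANY is funded, second COMPANY is the investor."""
    companies = [e for e in entities if e.label == "COMPANY"]
    money = [e for e in entities if e.label == "MONEY"]
    return {
        "company": companies[0].kb_id,
        "investor": companies[1].kb_id,
        "amount": money[0].text,
    }


record = to_record(disambiguate(recognize(TEXT)))
print(record)
```

The point of the sketch is the shape of the output: free text goes in, and a row that a database can store comes out.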

Slide 24

Slide 24 text

Slide 25

Slide 25 text

Slide 26

Slide 26 text

Slide 27

Slide 27 text

VISION #1: dialogue is all you need. Diagram: LLM, user, actions or information, natural language input.

Slide 28

Slide 28 text

VISION #1: dialogue is all you need. Diagram: LLM, user, actions or information, natural language input. LLM is the system and needs to manage the whole interaction.

Slide 29

Slide 29 text

VISION #2: prompting is all you need. Diagram: LLM, text, prompt, system, user, structured data.

Slide 30

Slide 30 text

VISION #2: prompting is all you need. Diagram: LLM, text, prompt, system, user, structured data. LLM replaces the specific ML model.

Slide 31

Slide 31 text

VISION #3: modern practical NLP. Diagram: developer, code, LLM, training data, system, user, structured data, ML system.

Slide 32

Slide 32 text

VISION #3: modern practical NLP. Diagram: developer, code, LLM, training data, system, user, structured data, ML system. LLM helps with building the pipeline.

Slide 33

Slide 33 text

VISION #3: modern practical NLP. Diagram: developer, code, LLM, training data, system, user, structured data, ML system. LLM helps with building the pipeline.

Slide 34

Slide 34 text

VISION #3: modern practical NLP. Diagram: developer, code, LLM, training data, system, user, structured data, ML system. LLM helps with building the pipeline.

Slide 35

Slide 35 text

Slide 36

Slide 36 text

LLMS VS. TASK-SPECIFIC MODELS. Figure: text classification accuracy on % of examples for SST2, AG News and Banking77, with a GPT-3 baseline (accuracy axis 65 to 100, examples axis 1% to 100%). Explosion (2023), to be released.

Slide 37

Slide 37 text

LLMS VS. TASK-SPECIFIC MODELS. Figure: text classification accuracy on % of examples for SST2, AG News and Banking77, with a GPT-3 baseline (accuracy axis 65 to 100, examples axis 1% to 100%). Explosion (2023), to be released.

Slide 38

Slide 38 text

LLMS VS. TASK-SPECIFIC MODELS. Figure: text classification accuracy on % of examples for SST2, AG News and Banking77, with a GPT-3 baseline (accuracy axis 65 to 100, examples axis 1% to 100%). Explosion (2023), to be released.

Slide 39

Slide 39 text

LLMS VS. TASK-SPECIFIC MODELS. Figure: text classification accuracy on % of examples for SST2, AG News and Banking77, with a GPT-3 baseline (accuracy axis 65 to 100, examples axis 1% to 100%). Explosion (2023), to be released.

Slide 40

Slide 40 text

LLMS VS. TASK-SPECIFIC MODELS

CoNLL 2003 NER

Model            F-Score   Speed (words/s)
GPT-3.5 (1)      78.6      < 100
GPT-4 (1)        83.5      < 100
spaCy            91.6      4,000
Flair            93.1      1,000
SOTA 2023 (2)    94.6      1,000
SOTA 2003 (3)    88.8      > 20,000

Callouts on the slide: SOTA on few-shot prompting; RoBERTa-base.
1. Ashok and Lipton (2023), 2. Wang et al. (2021), 3. Florian et al. (2003)

Figure: text classification accuracy on % of examples for SST2, AG News and Banking77, with a GPT-3 baseline (accuracy axis 65 to 100, examples axis 1% to 100%). Explosion (2023), to be released.
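
To make the speed column concrete, the short calculation below converts words per second into processing time. The speeds come from the table; the 10-million-word corpus size is a made-up figure for illustration. Because the table gives "< 100" and "> 20,000" as bounds, the results for those rows are a lower and an upper bound on the time respectively, not exact values.

```python
# Speeds in words per second, taken from the table above.
# "< 100" and "> 20,000" are bounds, so 100 and 20000 are limiting values.
SPEEDS = {
    "GPT-3.5 / GPT-4 (at most)": 100,
    "spaCy": 4000,
    "Flair": 1000,
    "SOTA 2023": 1000,
    "SOTA 2003 (at least)": 20000,
}

CORPUS_WORDS = 10_000_000  # hypothetical corpus size, not from the talk


def hours_to_process(n_words: int, words_per_second: float) -> float:
    """Time in hours needed to process n_words at a constant throughput."""
    return n_words / words_per_second / 3600


for name, speed in SPEEDS.items():
    print(f"{name}: {hours_to_process(CORPUS_WORDS, speed):.2f} hours")
```

At the table's figures, the gap between the slowest and fastest rows is a factor of at least 200.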

Slide 41

Slide 41 text

Large Language Model (in-context learning): knows a lot about what the text means; doesn’t really know what you want it to do.

Slide 42

Slide 42 text

Task-Specific Model (fine-tuning BERT etc.): knows less about what the text means; can encode exactly what you want it to do. Large Language Model (in-context learning): knows a lot about what the text means; doesn’t really know what you want it to do.

Slide 43

Slide 43 text

Slide 44

Slide 44 text

Slide 45

Slide 45 text

Slide 46

Slide 46 text

Slide 47

Slide 47 text

Slide 48

Slide 48 text

Slide 49

Slide 49 text

Slide 50

Slide 50 text

NLP IN THE AGE OF LLMS: SQL is all you need; dialogue is all you need; lots of humans is all you need; prompting is all you need; modern practical NLP

Slide 51

Slide 51 text

NLP IN THE AGE OF LLMS: SQL is all you need; dialogue is all you need; lots of humans is all you need; prompting is all you need; modern practical NLP (structured data)

Slide 52

Slide 52 text

NLP IN THE AGE OF LLMS: SQL is all you need; dialogue is all you need; lots of humans is all you need; prompting is all you need; modern practical NLP (structured data, humans in the loop)

Slide 53

Slide 53 text

NLP IN THE AGE OF LLMS: SQL is all you need; dialogue is all you need; lots of humans is all you need; prompting is all you need; modern practical NLP (structured data, fast prototyping, humans in the loop)

Slide 54

Slide 54 text

Slide 55

Slide 55 text

Slide 56

Slide 56 text

LLM-POWERED NLP IN PRACTICE: LLM-powered collaborative data development environment.

Slide 57

Slide 57 text

LLM-POWERED NLP IN PRACTICE: LLM-powered collaborative data development environment. Assign labeling tasks to LLMs.

Slide 58

Slide 58 text

LLM-POWERED NLP IN PRACTICE: LLM-powered collaborative data development environment. Assign labeling tasks to LLMs. Review label decisions, correct errors.

Slide 59

Slide 59 text

LLM-POWERED NLP IN PRACTICE: LLM-powered collaborative data development environment. Assign labeling tasks to LLMs. Review label decisions, correct errors. Tune prompts and compare LLMs empirically.

Slide 60

Slide 60 text

Slide 61

Slide 61 text

PRODIGY.AI/FEATURES/LARGE-LANGUAGE-MODELS

Slide 62

Slide 62 text

PRODIGY.AI/FEATURES/LARGE-LANGUAGE-MODELS: correct mistakes

Slide 63

Slide 63 text

PRODIGY.AI/FEATURES/LARGE-LANGUAGE-MODELS: correct mistakes

Slide 64

Slide 64 text

PRODIGY.AI/FEATURES/LARGE-LANGUAGE-MODELS: correct mistakes; add correct answer to prompt to tune it
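
The idea of adding a corrected answer to the prompt can be sketched as a few-shot prompt builder. This is a generic illustration and not Prodigy's API: `build_prompt`, `add_correction`, the "Text:/Answer:" layout and the sentiment labels are all hypothetical choices made for this example.

```python
from typing import List, Tuple

Example = Tuple[str, str]  # (input text, correct answer)


def build_prompt(instructions: str, examples: List[Example], text: str) -> str:
    """Render instructions, the worked examples, and the new text as one prompt."""
    parts = [instructions, ""]
    for example_text, answer in examples:
        parts.append(f"Text: {example_text}")
        parts.append(f"Answer: {answer}")
        parts.append("")
    parts.append(f"Text: {text}")
    parts.append("Answer:")
    return "\n".join(parts)


def add_correction(examples: List[Example], text: str, corrected: str) -> List[Example]:
    """Return a new example list that includes a human-corrected answer."""
    return examples + [(text, corrected)]


instructions = "Classify the text as POSITIVE or NEGATIVE."
examples = [("I love this library", "POSITIVE")]

# Suppose a reviewer corrected a wrong model answer for this text.
examples = add_correction(examples, "The install broke again", "NEGATIVE")

print(build_prompt(instructions, examples, "Great keynote"))
```

Each correction made during review then serves two purposes: it fixes one label, and it becomes an example that steers later predictions.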

Slide 65

Slide 65 text

TOWARDS STRUCTURED DATA. Prompt Template, LLM. Text: London is bigger than Berlin. Response: LOCATION: London, Berlin. Entity label: LOCATION. GITHUB.COM/EXPLOSION/SPACY-LLM

Slide 66

Slide 66 text

TOWARDS STRUCTURED DATA. Prompt Template, LLM. Text: London is bigger than Berlin. Response: LOCATION: London, Berlin. Entity label: LOCATION. GITHUB.COM/EXPLOSION/SPACY-LLM
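
A minimal sketch of this round trip, in plain Python, is shown here. It is not the spacy-llm implementation: the template wording, `render_prompt` and `parse_response` are hypothetical, and no LLM is called. The response string is the one from the slide. The key step is turning the model's free-text answer back into labeled character offsets, and dropping anything that does not actually occur in the input.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end, label) as character offsets

# Hypothetical template wording, not taken from spacy-llm.
PROMPT_TEMPLATE = (
    "Extract entities of these types from the text: {labels}\n"
    "Answer with one line per type, formatted as TYPE: entity1, entity2\n\n"
    "Text: {text}\n"
)


def render_prompt(text: str, labels: List[str]) -> str:
    """Fill the template with the allowed labels and the input text."""
    return PROMPT_TEMPLATE.format(labels=", ".join(labels), text=text)


def parse_response(text: str, response: str, labels: List[str]) -> List[Span]:
    """Map 'LABEL: a, b' lines back to character spans in the original text."""
    spans = []
    for line in response.splitlines():
        label, sep, mentions = line.partition(":")
        label = label.strip().upper()
        if not sep or label not in labels:
            continue  # ignore lines that are not in the expected format
        for mention in mentions.split(","):
            mention = mention.strip()
            if not mention:
                continue
            # Mentions that do not occur in the text produce no span.
            for match in re.finditer(re.escape(mention), text):
                spans.append((match.start(), match.end(), label))
    return sorted(spans)


text = "London is bigger than Berlin"
response = "LOCATION: London, Berlin"  # response shown on the slide
print(render_prompt(text, ["LOCATION"]))
print(parse_response(text, response, ["LOCATION"]))
```

Because the parser only accepts known labels and mentions found in the text, a malformed or invented answer yields fewer spans and never invalid ones.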

Slide 67

Slide 67 text

Unstructured text input to structured Doc object. GITHUB.COM/EXPLOSION/SPACY-LLM

Slide 68

Slide 68 text

Unstructured text input to structured Doc object. Components: Named Entity Recognition, Text Classification, Relation Extraction, Lemmatization. GITHUB.COM/EXPLOSION/SPACY-LLM

Slide 69

Slide 69 text

Unstructured text input to structured Doc object. Components: Named Entity Recognition, Text Classification, Relation Extraction, Lemmatization. Techniques: LLM, Supervised Model, Rules. Mix, match and replace techniques. GITHUB.COM/EXPLOSION/SPACY-LLM
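
"Mix, match and replace" works when every component, whatever technique sits behind it, exposes the same interface. The sketch below illustrates that design in plain Python rather than with spaCy's actual pipeline API. All names are hypothetical, and both the "LLM" and the "supervised" component are stand-ins built from a simple name lookup, so that the example runs without a model.

```python
import re
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, label) as character offsets
Component = Callable[[str], List[Span]]  # shared interface for every technique


def money_rules(text: str) -> List[Span]:
    """Rule-based component: a regular expression for amounts such as $5m."""
    pattern = r"\$\d+(?:\.\d+)?[kmb]?"
    return [(m.start(), m.end(), "MONEY") for m in re.finditer(pattern, text)]


def make_lookup_component(names: List[str], label: str) -> Component:
    """Builds a stand-in component that labels known names (no real model)."""
    def component(text: str) -> List[Span]:
        spans = []
        for name in names:
            for m in re.finditer(re.escape(name), text):
                spans.append((m.start(), m.end(), label))
        return spans
    return component


# Stand-ins: in a real system one would call an LLM, the other a trained model.
llm_companies = make_lookup_component(["Hooli", "ACME Ventures"], "COMPANY")
supervised_companies = make_lookup_component(["Hooli", "ACME Ventures"], "COMPANY")


def run_pipeline(text: str, components: Dict[str, Component]) -> List[Span]:
    """Run every component on the text and merge their spans in text order."""
    spans: List[Span] = []
    for component in components.values():
        spans.extend(component(text))
    return sorted(spans)


TEXT = "Hooli raises $5m to revolutionize search, led by ACME Ventures"

# Prototype with rules plus an LLM-backed component.
prototype = {"money": money_rules, "companies": llm_companies}
# Later, replace one component without touching the rest of the pipeline.
production = dict(prototype, companies=supervised_companies)

print(run_pipeline(TEXT, prototype))
print(run_pipeline(TEXT, production))
```

Since the calling code only depends on the shared interface, swapping the LLM-backed component for a smaller supervised one is a one-line change.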

Slide 70

Slide 70 text

EASIER ISN'T AMBITIOUS ENOUGH. Let’s not settle for systems that are worse than what we’ve been building.

Slide 71

Slide 71 text

SPECIFIC IS BETTER. Task-specific models powered by LLMs.

Slide 72

Slide 72 text

SMALLER & FASTER IS BETTER. Task-specific models powered by LLMs.

Slide 73

Slide 73 text

PRIVATE IS BETTER. Task-specific models powered by LLMs.

Slide 74

Slide 74 text

BETTER IS BETTER. Task-specific models powered by LLMs.

Slide 75

Slide 75 text

THANK YOU! Explosion – explosion.ai, spaCy – spacy.io, Prodigy – prodigy.ai, Twitter – @_inesmontani, Mastodon – @[email protected], LinkedIn