Large Language Models: From Prototype to Production (EuroPython keynote)

Ines Montani, Explosion
Topics: ChatGPT, Artificial Intelligence, Machine Learning, LLaMA, Natural Language Processing, Open Source, Python, Prompt Engineering, Zero-Shot Learning, GPT-4, Evaluation, Copilot, Generative AI

SPACY
Open-source library for industrial-strength Natural Language Processing
150m+ downloads
ChatGPT can write spaCy code!
SPACY.IO, @SPACY_IO, SPACY.TV, GITHUB.COM/EXPLOSION/SPACY

PRODIGY
Modern scriptable annotation tool for machine learning developers
8k+ users, 700+ companies
PRODIGY.AI, GITHUB.COM/EXPLOSION/PRODIGY-RECIPES

UNDERSTANDING NLP TASKS
Generative (human-readable): single/multi-doc summarization, problem solving, paraphrasing, reasoning, style transfer, question answering
Predictive (machine-readable): text classification, entity recognition, relation extraction, grammar & morphology, semantic parsing, coreference resolution, discourse structure

THE HISTORY OF FUTURE TECHNOLOGY
How people in 1900 imagined the year 2000
manual calculation vs. calculator
“knocker-uppers” vs. alarm clock
human assistant vs. calendar apps (Calendly, Fantastical)
WHAT’S NEXT?

NLP IN THE AGE OF LLMS
SQL is all you need
dialogue is all you need
lots of humans is all you need
prompting is all you need

CLASSIC NLP PROBLEM: EXTRACT STRUCTURED DATA
“Hooli raises $5m to revolutionize search, led by ACME Ventures”
Labels: COMPANY, COMPANY, MONEY, INVESTOR
Database: 5923214, 1681056
Steps: named entity recognition, entity disambiguation, custom database lookup, currency normalization, entity relation extraction
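As a rough illustration of the currency normalization step, here is a minimal sketch in plain Python. The function name `normalize_money`, the suffix table and the symbol-to-code table are assumptions made for this example, not part of the talk or of any library. It only handles compact forms such as "$5m".

```python
import re

# Hypothetical suffix multipliers understood by this sketch.
_MULTIPLIERS = {"": 1, "k": 1_000, "m": 1_000_000, "b": 1_000_000_000}
# Hypothetical symbol-to-code table; a real system would need a fuller mapping.
_CURRENCIES = {"$": "USD", "€": "EUR", "£": "GBP"}

# Currency symbol, a number, and an optional k/m/b suffix, e.g. "$5m".
_MONEY_RE = re.compile(r"([$€£])(\d+(?:\.\d+)?)([kmb]?)\b", re.IGNORECASE)


def normalize_money(text):
    """Return (amount, currency_code) for the first money mention, or None."""
    match = _MONEY_RE.search(text)
    if match is None:
        return None
    symbol, number, suffix = match.groups()
    amount = float(number) * _MULTIPLIERS[suffix.lower()]
    return int(amount), _CURRENCIES[symbol]


headline = "Hooli raises $5m to revolutionize search, led by ACME Ventures"
print(normalize_money(headline))
```

Spelled-out forms such as "$5 million" are deliberately out of scope here and would need extra rules or a trained component.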

VISION #1: dialogue is all you need
Diagram elements: user, natural language input, LLM, actions or information
LLM is the system and needs to manage the whole interaction

VISION #2: prompting is all you need
Diagram elements: user, system, text, prompt, LLM, structured data
LLM replaces the specific ML model

VISION #3: modern practical NLP
Diagram elements: developer, code, LLM, training data, ML system, system, user, structured data
LLM helps with building the pipeline

LLMS VS. TASK-SPECIFIC MODELS

[Chart: text classification accuracy on % of examples (1%, 5%, 10%, 20%, 50%, 100%) for SST2, AG News and Banking77, compared with a GPT-3 baseline. Source: Explosion (2023), to be released]

CoNLL 2003 NER

Model           F-Score   Speed (words/s)
GPT-3.5 [1]     78.6      < 100
GPT-4 [1]       83.5      < 100
spaCy           91.6      4,000
Flair           93.1      1,000
SOTA 2023 [2]   94.6      1,000
SOTA 2003 [3]   88.8      > 20,000

Slide annotations: SOTA on few-shot prompting; RoBERTa-base
[1] Ashok and Lipton (2023), [2] Wang et al. (2021), [3] Florian et al. (2003)
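For readers unfamiliar with the F-Score column, the sketch below shows one common way to compute it, assuming exact-match, entity-level F1 over (start, end, label) spans, which is the usual convention for CoNLL 2003. The helper name `f_score` and the example spans are made up for illustration and do not reproduce the numbers on the slide.

```python
def f_score(gold, predicted):
    """Entity-level F1: a prediction counts only if span and label match exactly."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations for
# "Hooli raises $5m to revolutionize search, led by ACME Ventures"
gold = {(0, 5, "COMPANY"), (13, 16, "MONEY"), (49, 62, "COMPANY")}
# The last prediction only covers "ACME", so it does not count as correct.
predicted = {(0, 5, "COMPANY"), (13, 16, "MONEY"), (49, 53, "COMPANY")}

print(round(f_score(gold, predicted), 3))
```

Under exact matching, a partially correct span is both a false positive and a false negative, which is one reason small boundary errors can cost several points.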

Large Language Model (in-context learning): knows a lot about what the text means, doesn’t really know what you want it to do
Task-Specific Model (fine-tuning BERT etc.): knows less about what the text means, can encode exactly what you want it to do
Developer: prompt engineering, problem definition, data annotation, model training, evaluation, production

modern practical NLP: structured data, humans in the loop, fast prototyping, powered by open source, conversational and graphical interfaces

LLM-POWERED NLP IN PRACTICE
LLM-powered collaborative data development environment
Assign labeling tasks to LLMs
Review label decisions, correct errors
Tune prompts and compare LLMs empirically
Build data sets to train and evaluate efficient, production-ready pipelines
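One way to read "compare LLMs empirically" is to score each prompt or model against the labels that survived human review. The sketch below assumes predictions are stored as plain dictionaries; the texts, label names and prompt names are invented for illustration and do not come from Prodigy.

```python
def accuracy(predictions, reviewed):
    """Share of reviewed examples where the predicted label matches."""
    if not reviewed:
        raise ValueError("need at least one reviewed example")
    correct = sum(
        1 for text, label in reviewed.items() if predictions.get(text) == label
    )
    return correct / len(reviewed)


# Labels after a human has reviewed and corrected them (hypothetical data).
reviewed = {
    "I lost my card": "card_lost",
    "Why was I charged twice?": "double_charge",
    "How do I change my PIN?": "change_pin",
    "My transfer has not arrived": "transfer_delay",
}

# Outputs of two hypothetical prompt variants on the same texts.
prompt_a = {
    "I lost my card": "card_lost",
    "Why was I charged twice?": "card_lost",
    "How do I change my PIN?": "change_pin",
    "My transfer has not arrived": "double_charge",
}
prompt_b = {
    "I lost my card": "card_lost",
    "Why was I charged twice?": "double_charge",
    "How do I change my PIN?": "change_pin",
    "My transfer has not arrived": "double_charge",
}

candidates = {"prompt_a": prompt_a, "prompt_b": prompt_b}
scores = {name: accuracy(preds, reviewed) for name, preds in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

The same reviewed examples can later serve as evaluation data for a trained pipeline, so the comparison and the final evaluation use one yardstick.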

PRODIGY.AI/FEATURES/LARGE-LANGUAGE-MODELS
correct mistakes
add correct answer to prompt to tune it

TOWARDS STRUCTURED DATA
GITHUB.COM/EXPLOSION/SPACY-LLM
Text: “London is bigger than Berlin”
Prompt Template, LLM
LLM response: “LOCATION: London, Berlin”
Structured output: entities labeled LOCATION
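The step from a free-text LLM reply to structured data can be sketched as follows. This is not the spacy-llm implementation: the template wording and the names `build_prompt` and `parse_response` are assumptions for this example, and the LLM reply is hard-coded instead of coming from a real model.

```python
# Hypothetical prompt template; the wording is an assumption.
PROMPT_TEMPLATE = (
    "Extract the entities from the text below.\n"
    "Answer with one line per label in the form LABEL: entity, entity\n"
    "Labels: {labels}\n"
    "Text: {text}\n"
)


def build_prompt(text, labels):
    """Fill the template with the allowed labels and the input text."""
    return PROMPT_TEMPLATE.format(labels=", ".join(labels), text=text)


def parse_response(text, response, labels):
    """Turn 'LABEL: a, b' lines into sorted (start, end, label) character spans."""
    spans = []
    for line in response.splitlines():
        label, sep, values = line.partition(":")
        label = label.strip()
        if not sep or label not in labels:
            continue  # ignore anything that is not a known label line
        for value in values.split(","):
            value = value.strip()
            start = text.find(value) if value else -1
            if start != -1:  # drop entities that do not occur in the text
                spans.append((start, start + len(value), label))
    return sorted(spans)


text = "London is bigger than Berlin"
response = "LOCATION: London, Berlin"  # stands in for a real LLM reply
print(build_prompt(text, ["LOCATION"]))
print(parse_response(text, response, ["LOCATION"]))
```

Checking every returned entity against the original text is a cheap guard against replies that mention strings the input never contained. Note that `str.find` only locates the first occurrence, so repeated mentions would need extra handling.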

GITHUB.COM/EXPLOSION/SPACY-LLM
unstructured text input, structured Doc object
Components: Named Entity Recognition, Text Classification, Relation Extraction, Lemmatization
LLM, Supervised Model, Rules: mix, match and replace techniques
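The idea of mixing, matching and replacing techniques can be illustrated with a toy pipeline in which every component reads and writes the same document object. Everything below is a sketch under assumptions: the `Doc` class is a stand-in and not spaCy's `Doc`, and the two predict functions are placeholders for an LLM-backed and a supervised classifier.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Toy stand-in for a structured document; not spaCy's Doc class."""
    text: str
    ents: list = field(default_factory=list)  # (start, end, label) spans
    cats: dict = field(default_factory=dict)  # label -> score


def money_rules(doc):
    """Rule-based component: tag compact money mentions such as $5m."""
    for match in re.finditer(r"[$€£]\d+(?:\.\d+)?[kmb]?\b", doc.text):
        doc.ents.append((match.start(), match.end(), "MONEY"))
    return doc


def make_textcat(predict):
    """Wrap any text -> {label: score} function as a pipeline component."""
    def component(doc):
        doc.cats.update(predict(doc.text))
        return doc
    return component


def fake_llm_predict(text):
    # Hypothetical placeholder for a prompt-based classifier.
    return {"FUNDING": 1.0 if "raises" in text else 0.0}


def fake_supervised_predict(text):
    # Hypothetical placeholder for a trained task-specific model.
    return {"FUNDING": 0.9 if "raises" in text else 0.1}


def run(text, components):
    """Apply each component in order to a fresh Doc."""
    doc = Doc(text)
    for component in components:
        doc = component(doc)
    return doc


text = "Hooli raises $5m to revolutionize search, led by ACME Ventures"
prototype = [money_rules, make_textcat(fake_llm_predict)]
production = [money_rules, make_textcat(fake_supervised_predict)]
proto_doc = run(text, prototype)
prod_doc = run(text, production)
print(proto_doc.ents, proto_doc.cats, prod_doc.cats)
```

Because both pipelines fill the same fields, the classifier can be swapped without touching the rule-based step or any code that consumes the result.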

EASIER ISN'T AMBITIOUS ENOUGH. Let’s not settle for systems that
are worse than what we’ve been building.

SPECIFIC IS BETTER.
SMALLER & FASTER IS BETTER.
PRIVATE IS BETTER.
BETTER IS BETTER.
Task-specific models powered by LLMs

THANK YOU!
Explosion – explosion.ai
spaCy – spacy.io
Prodigy – prodigy.ai
Twitter – @_inesmontani
Mastodon – @[email protected]
LinkedIn
