Large Language Models: From Prototype to Production (PyData London keynote)

LARGE LANGUAGE MODELS: FROM PROTOTYPE TO PRODUCTION
Ines Montani, Explosion
Keywords on the title slide: ChatGPT, Artificial Intelligence, Machine Learning, LLaMA, Natural Language Processing, Open Source, Python, Prompt Engineering, Zero-Shot Learning, GPT-4, Evaluation, Copilot, Generative AI

SPACY
Open-source library for industrial-strength Natural Language Processing
SPACY.IO, @SPACY_IO, SPACY.TV, GITHUB.COM/EXPLOSION/SPACY
140m+ downloads
ChatGPT can write spaCy code!

PRODIGY
Modern scriptable annotation tool for machine learning developers
PRODIGY.AI, GITHUB.COM/EXPLOSION/PRODIGY-RECIPES
8k+ users, 700+ companies

THE HISTORY OF FUTURE TECHNOLOGY
How people in 1900 imagined the year 2000

“knocker-uppers” vs. alarm clock
manual calculation vs. calculator

human assistant vs. calendar apps (Calendly, Fantastical)
WHAT’S NEXT?

NLP IN THE AGE OF LLMS
SQL is all you need
dialogue is all you need
lots of humans is all you need
prompting is all you need

VISION #1: dialogue is all you need
Diagram labels: user, LLM, natural language input, actions or information
The LLM is the system and needs to manage the whole interaction.

VISION #2: prompting is all you need
Diagram labels: user, system, text, prompt, LLM, structured data
The LLM replaces the specific ML model.

VISION #3: modern practical NLP
Diagram labels: developer, code, LLM, training data, ML system, system, user, structured data
The LLM helps with building the pipeline.

CLASSIC NLP PROBLEM: EXTRACT STRUCTURED DATA
“Hooli raises $5m to revolutionize search, led by ACME Ventures”
Labels shown on the text: COMPANY, COMPANY, MONEY, INVESTOR
Database IDs shown: 5923214, 1681056

Steps from text to database:
named entity recognition
entity disambiguation
custom database lookup
currency normalization
entity relation extraction

Quick prototype (diagram labels: developer, LLM, structured data)
⚡ fast to develop, slow to run, hard to improve
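
The steps listed on this slide can be read as separate, swappable functions. The following is only a sketch under stated assumptions: the names `Entity`, `recognize_entities`, `normalize_money`, `lookup_id` and `extract_funding` are hypothetical, the regex-based recognizer stands in for a trained model or an LLM prompt, and which database ID belongs to which company is assumed for illustration.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entity:
    text: str
    label: str
    start: int
    end: int


# Toy in-memory "database". The IDs come from the slide; the assignment
# of each ID to a company is an assumption made for this example.
COMPANY_DB: Dict[str, int] = {"hooli": 5923214, "acme ventures": 1681056}

MONEY_RE = re.compile(r"\$(\d+(?:\.\d+)?)([kmb]?)", re.IGNORECASE)
MULTIPLIERS = {"": 1, "k": 1_000, "m": 1_000_000, "b": 1_000_000_000}


def recognize_entities(text: str) -> List[Entity]:
    """Placeholder NER: known company names plus a regex for money amounts."""
    ents = []
    for name in COMPANY_DB:
        match = re.search(re.escape(name), text, re.IGNORECASE)
        if match:
            ents.append(Entity(match.group(), "COMPANY", match.start(), match.end()))
    for match in MONEY_RE.finditer(text):
        ents.append(Entity(match.group(), "MONEY", match.start(), match.end()))
    return sorted(ents, key=lambda ent: ent.start)


def normalize_money(ent: Entity) -> int:
    """Currency normalization: '$5m' becomes 5000000."""
    amount, suffix = MONEY_RE.fullmatch(ent.text).groups()
    return int(float(amount) * MULTIPLIERS[suffix.lower()])


def lookup_id(ent: Entity) -> Optional[int]:
    """Disambiguation and database lookup, reduced to a dictionary access."""
    return COMPANY_DB.get(ent.text.lower())


def extract_funding(text: str) -> Dict[str, Optional[int]]:
    """Naive relation extraction: a company after 'led by' is the investor."""
    ents = recognize_entities(text)
    companies = [ent for ent in ents if ent.label == "COMPANY"]
    money = [ent for ent in ents if ent.label == "MONEY"]
    led_by = text.lower().find("led by")
    investor = next((c for c in companies if led_by != -1 and c.start > led_by), None)
    company = next((c for c in companies if c is not investor), None)
    return {
        "company_id": lookup_id(company) if company else None,
        "amount_usd": normalize_money(money[0]) if money else None,
        "investor_id": lookup_id(investor) if investor else None,
    }
```

Because each step is its own function, any one of them could be prototyped with an LLM first and replaced later without touching the others.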

MODERN PRACTICAL NLP
structured data
fast prototyping
humans in the loop
powered by open source
robust evaluation
conversational and graphical interfaces

A CASE FOR LLM PRAGMATISM
EXPLOSION.AI/BLOG/AGAINST-LLM-MAXIMALISM
Meme: “NOOO YOU CAN'T JUST MIX UP ALL THE STEPS OF YOUR TASK AND ASK AN LLM TO DO IT ALL. HOW WILL YOU EVER MAKE A RELIABLE AND EXTENSIBLE SYSTEM THAT WAY?” / “HAHA LLM GO BRRR”
avoid coupling prediction tasks to arbitrary business logic
design modular solutions
prototype modules with LLMs
evaluate alternatives
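
One way to read these four points in code is to put the prediction task behind a small interface, so the business logic never depends on how the label was produced. This is a sketch, not taken from the talk: `IntentClassifier`, `LLMIntentClassifier`, `KeywordIntentClassifier`, `route_message` and the `complete` callable (a stand-in for whatever LLM client is used) are all hypothetical names.

```python
from typing import Callable, Dict, List, Protocol


class IntentClassifier(Protocol):
    """The prediction task: text in, one label out."""

    def predict(self, text: str) -> str: ...


class LLMIntentClassifier:
    """Prototype module: asks an LLM through an injected `complete` function."""

    def __init__(self, complete: Callable[[str], str], labels: List[str]):
        self.complete = complete
        self.labels = labels

    def predict(self, text: str) -> str:
        prompt = (
            f"Classify the message into one of: {', '.join(self.labels)}.\n"
            f"Message: {text}\nLabel:"
        )
        answer = self.complete(prompt).strip().lower()
        # Guard against answers outside the label scheme.
        return answer if answer in self.labels else "unknown"


class KeywordIntentClassifier:
    """Alternative module with the same interface, here a trivial baseline."""

    def __init__(self, keywords: Dict[str, List[str]]):
        self.keywords = keywords

    def predict(self, text: str) -> str:
        lowered = text.lower()
        for label, words in self.keywords.items():
            if any(word in lowered for word in words):
                return label
        return "unknown"


def route_message(text: str, classifier: IntentClassifier) -> str:
    """Business logic: depends only on the predicted label, not on the model."""
    routes = {"card_lost": "fraud-team", "balance": "self-service"}
    return routes.get(classifier.predict(text), "human-agent")
```

Swapping `LLMIntentClassifier` for a trained model later only requires another class with a `predict` method, which also makes it straightforward to evaluate the alternatives side by side.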

TRADE-OFFS
ongoing experiments comparing LLMs to task-specific models

Task                                             Supervised (1)        LLM (2)
                                                 accuracy   words/s    accuracy   words/s
Textcat on SST2 (Stanford Sentiment Treebank)    0.9        4019       0.9        <100
Textcat on Banking77 (intent recognition)        0.9        3234       0.7        <100
NER on AnEm (anatomical entity mentions)         0.7        5146       0.1        <100

1. RoBERTa-base with spaCy, 2. text-davinci-003 zero-shot

Also shown: performance on the bar exam (kentlaw.iit.edu), OpenAI API latency (promptlayer.com)
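
The two columns in this comparison, accuracy and words per second, can be measured with the same harness for any model. A minimal sketch, assuming a hypothetical `evaluate` helper and whitespace tokenization for the word count; this is not the benchmark code behind the numbers above.

```python
import time
from typing import Callable, Dict, Sequence


def evaluate(
    predict: Callable[[str], str],
    texts: Sequence[str],
    gold: Sequence[str],
) -> Dict[str, float]:
    """Run `predict` over all texts and report accuracy and throughput."""
    start = time.perf_counter()
    predictions = [predict(text) for text in texts]
    # Avoid division by zero if the clock resolution is too coarse.
    elapsed = max(time.perf_counter() - start, 1e-9)

    n_words = sum(len(text.split()) for text in texts)
    n_correct = sum(pred == label for pred, label in zip(predictions, gold))
    return {
        "accuracy": n_correct / len(gold),
        "n_words": n_words,
        "words_per_s": n_words / elapsed,
    }
```

Passing a function that wraps a supervised pipeline and one that wraps an LLM call to the same `evaluate` gives directly comparable rows.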

LLM-POWERED NLP IN PRACTICE
LLM-powered collaborative data development environment

Assign labeling tasks to LLMs
Review label decisions, correct errors
Tune prompts and compare LLMs empirically
Build data sets to train and evaluate efficient, production-ready pipelines
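
The workflow above can be sketched as a small loop: the LLM suggests a label, a reviewer accepts or corrects it, and the reviewed records become training and evaluation data. This is an illustration only and not Prodigy's API; `collect_annotations`, `split_dataset`, `to_jsonl` and the `llm_label` and `review` callables are hypothetical.

```python
import json
import random
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]


def collect_annotations(
    texts: List[str],
    llm_label: Callable[[str], str],
    review: Callable[[str, str], str],
) -> List[Record]:
    """Let the LLM suggest a label, then let a reviewer accept or correct it."""
    records = []
    for text in texts:
        suggested = llm_label(text)
        final = review(text, suggested)
        records.append(
            {
                "text": text,
                "label": final,
                "llm_label": suggested,
                "corrected": final != suggested,
            }
        )
    return records


def split_dataset(
    records: List[Record], eval_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Record], List[Record]]:
    """Shuffle deterministically and hold out a fraction for evaluation."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


def to_jsonl(records: List[Record]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)
```

The share of records with `corrected` set to true gives a rough, empirical signal for comparing prompts or LLMs on the same texts.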

PRODIGY.AI/FEATURES/LARGE-LANGUAGE-MODELS
correct mistakes
add correct answer to prompt to tune it
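
Adding a corrected answer to the prompt amounts to growing a list of few-shot examples. A minimal sketch, assuming a simple "Text / Answer" prompt layout; `build_prompt` and `add_correction` are hypothetical helpers, not Prodigy functions.

```python
from typing import List, Tuple

Example = Tuple[str, str]  # (input text, corrected answer)


def build_prompt(instruction: str, examples: List[Example], text: str) -> str:
    """Render the instruction, the few-shot examples, then the new text."""
    parts = [instruction, ""]
    for example_text, example_answer in examples:
        parts += [f"Text: {example_text}", f"Answer: {example_answer}", ""]
    parts += [f"Text: {text}", "Answer:"]
    return "\n".join(parts)


def add_correction(
    examples: List[Example],
    text: str,
    corrected_answer: str,
    max_examples: int = 5,
) -> List[Example]:
    """Append a human-corrected example, keeping only the most recent ones."""
    updated = examples + [(text, corrected_answer)]
    return updated[-max_examples:]
```

Capping the number of examples keeps the prompt length, and with it cost and latency, bounded as corrections accumulate.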

PRODIGY.AI/FEATURES/LARGE-LANGUAGE-MODELS
query LLM and parse response
tune prompt if needed

TOWARDS STRUCTURED DATA
GITHUB.COM/EXPLOSION/SPACY-LLM
Input text: London is bigger than Berlin
Diagram labels: Prompt Template, LLM
LLM response: LOCATION: London, Berlin
Result: entities labeled LOCATION
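
Turning a free-text response such as "LOCATION: London, Berlin" into structured data means mapping each answer back to character offsets in the input. The sketch below assumes the "LABEL: item, item" response format shown on the slide; `parse_response` is a hypothetical helper and is not how spacy-llm implements its parsing.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, label)


def parse_response(response: str, text: str, labels: List[str]) -> List[Span]:
    """Map 'LABEL: item, item' lines onto character spans in `text`."""
    spans = set()
    for line in response.splitlines():
        if ":" not in line:
            continue
        label, _, values = line.partition(":")
        label = label.strip().upper()
        if label not in labels:
            continue  # ignore labels the prompt did not ask for
        for value in values.split(","):
            value = value.strip()
            if not value:
                continue
            # Only keep answers that literally occur in the text, which
            # drops entities the model may have made up.
            for match in re.finditer(re.escape(value), text):
                spans.add((match.start(), match.end(), label))
    return sorted(spans)
```

Character offsets like these are the kind of structured output that can then be stored, reviewed, or used as training data for a task-specific model.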

Task-specific models powered by LLMs
SPECIFIC IS BETTER.
SMALLER & FASTER IS BETTER.
PRIVATE IS BETTER.
BETTER IS BETTER.

THANK YOU!
Explosion – explosion.ai
spaCy – spacy.io
Prodigy – prodigy.ai
Twitter – @_inesmontani
Mastodon – @[email protected]
LinkedIn

Transcript