Slide 1

Slide 1 text

Ines Montani Explosion

Slide 2

Slide 2 text

Open-source library for industrial-strength natural language processing spacy.io 470m+ downloads

Slide 3

Slide 3 text

Open-source library for industrial-strength natural language processing spacy.io LLMs are really good at spaCy code! 470m+ downloads

Slide 4

Slide 4 text

12k+ users 1000+ companies Modern scriptable annotation tool for machine learning developers prodigy.ai

Slide 5

Slide 5 text

12k+ users 1000+ companies fully scriptable in Python Alex Smith Developer Kim Miller Analyst GPT-5 API Modern scriptable annotation tool for machine learning developers prodigy.ai

Slide 6

Slide 6 text

of coding assistants

Slide 7

Slide 7 text

help developer implement code for the given tools of coding assistants

Slide 8

Slide 8 text

help developer implement code for the given tools of coding assistants Jay Alammar: PyData London Keynote

Slide 9

Slide 9 text

help developer implement code for the given tools of coding assistants Jay Alammar: PyData London Keynote

Slide 10

Slide 10 text

help developer implement code for the given tools help developer pick the right tools and implement code of coding assistants

Slide 11

Slide 11 text

help developer implement code for the given tools help developer pick the right tools and implement code of coding assistants solve a business problem

Slide 12

Slide 12 text

help developer implement code for the given tools help developer pick the right tools and implement code of coding assistants solve a business problem “I need to analyze these company reports and create a table of the total spending on different types of IT services over time.” 2025.pdf 2024.pdf 2023.pdf

Slide 13

Slide 13 text

reverse these strings

Slide 14

Slide 14 text

reverse these strings write a script to reverse strings

Slide 15

Slide 15 text

reverse these strings write a script to reverse strings prompt program
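
To make the prompt vs. program contrast concrete, here is a minimal sketch of what the "program" side might look like. The function name `reverse_strings` and the sample inputs are illustrative assumptions, not taken from the talk.

```python
def reverse_strings(strings):
    """Return a new list with every input string reversed."""
    return [s[::-1] for s in strings]


if __name__ == "__main__":
    # Sample inputs are arbitrary; any list of strings works.
    print(reverse_strings(["spaCy", "Prodigy"]))
```

Once the script exists, it can be rerun on any number of inputs without calling a model again.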

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

reproducible

Slide 18

Slide 18 text

scalable reproducible

Slide 19

Slide 19 text

scalable maintainable reproducible

Slide 20

Slide 20 text

scalable maintainable reproducible extensible

Slide 21

Slide 21 text

scalable maintainable faster reproducible extensible

Slide 22

Slide 22 text

scalable maintainable cheaper faster reproducible extensible

Slide 23

Slide 23 text

scalable maintainable cheaper faster reproducible extensible OSS ecosystem

Slide 24

Slide 24 text

scalable maintainable cheaper faster reproducible extensible OSS ecosystem coding assistants

Slide 25

Slide 25 text

scalable maintainable cheaper faster reproducible extensible OSS ecosystem coding assistants experience

Slide 26

Slide 26 text

as the system

Slide 27

Slide 27 text

as the system to build

Slide 28

Slide 28 text

list all company names in the text write a script to extract company names from text

Slide 29

Slide 29 text

natural language structured data vs.

Slide 30

Slide 30 text

natural language structured data vs. consumed by humans

Slide 31

Slide 31 text

natural language structured data vs. consumed by humans consumed by machines

Slide 32

Slide 32 text

natural language structured data vs. consumed by humans consumed by machines “I need to analyze these company reports and create a table of the total spending on different types of IT services over time.”

Slide 33

Slide 33 text

natural language structured data vs. consumed by humans consumed by machines “I need to analyze these company reports and create a table of the total spending on different types of IT services over time.” parse PDFs

Slide 34

Slide 34 text

natural language structured data vs. consumed by humans consumed by machines “I need to analyze these company reports and create a table of the total spending on different types of IT services over time.” parse PDFs extract expenses

Slide 35

Slide 35 text

natural language structured data vs. consumed by humans consumed by machines “I need to analyze these company reports and create a table of the total spending on different types of IT services over time.” parse PDFs extract expenses classify expense type

Slide 36

Slide 36 text

natural language structured data vs. consumed by humans consumed by machines “I need to analyze these company reports and create a table of the total spending on different types of IT services over time.” parse PDFs extract expenses classify expense type do math

Slide 37

Slide 37 text

natural language structured data vs. consumed by humans consumed by machines “I need to analyze these company reports and create a table of the total spending on different types of IT services over time.” parse PDFs extract expenses classify expense type do math create table

Slide 38

Slide 38 text

natural language structured data vs. consumed by humans consumed by machines “I need to analyze these company reports and create a table of the total spending on different types of IT services over time.” parse PDFs extract expenses classify expense type do math create table

Slide 39

Slide 39 text

natural language structured data vs. consumed by humans consumed by machines Most industry applications of NLP are part of a larger system. “I need to analyze these company reports and create a table of the total spending on different types of IT services over time.” parse PDFs extract expenses classify expense type do math create table

Slide 40

Slide 40 text

natural language structured data vs. consumed by humans consumed by machines Most industry applications of NLP are part of a larger system. “I need to analyze these company reports and create a table of the total spending on different types of IT services over time.” parse PDFs extract expenses classify expense type do math create table “The results will then be added to our internal database so we can predict future spending.”

Slide 41

Slide 41 text

natural language structured data vs. consumed by humans consumed by machines Most industry applications of NLP are part of a larger system. “I need to analyze these company reports and create a table of the total spending on different types of IT services over time.” parse PDFs extract expenses classify expense type do math create table populate database “The results will then be added to our internal database so we can predict future spending.”

Slide 42

Slide 42 text

natural language structured data vs. consumed by humans consumed by machines Most industry applications of NLP are part of a larger system. “I need to analyze these company reports and create a table of the total spending on different types of IT services over time.” parse PDFs extract expenses classify expense type do math create table populate database “The results will then be added to our internal database so we can predict future spending.” model predictions
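
The decomposition on this slide (parse, extract, classify, do math, create table, populate database) could be sketched as a chain of small functions. This is only an illustration under loud assumptions: PDF parsing is replaced by plain "description: amount" text, the classifier is a keyword lookup standing in for a trained model, and all names, categories, and figures are invented for the example.

```python
import sqlite3
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Expense:
    year: int
    description: str
    amount: float


def parse_report(year, text):
    """Stand-in for PDF parsing and expense extraction.

    Assumes each line looks like 'description: amount'.
    """
    expenses = []
    for line in text.strip().splitlines():
        description, amount = line.rsplit(":", 1)
        expenses.append(Expense(year, description.strip(), float(amount)))
    return expenses


def classify_expense(expense):
    """Stand-in for a trained classifier: a hypothetical keyword lookup."""
    keywords = {"cloud": "hosting", "server": "hosting", "license": "software"}
    for keyword, label in keywords.items():
        if keyword in expense.description.lower():
            return label
    return "other"


def total_by_year_and_type(expenses):
    """The 'do math' step: deterministic aggregation in plain code."""
    totals = defaultdict(float)
    for expense in expenses:
        totals[(expense.year, classify_expense(expense))] += expense.amount
    return dict(totals)


def populate_database(conn, totals):
    """Create the table and store it for downstream consumers."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spending (year INTEGER, type TEXT, total REAL)"
    )
    rows = [(year, label, total) for (year, label), total in sorted(totals.items())]
    conn.executemany("INSERT INTO spending VALUES (?, ?, ?)", rows)
    conn.commit()


# Invented toy data in place of the parsed reports.
reports = {
    2024: "Cloud hosting: 1200\nSoftware license: 300",
    2025: "Server rental: 800\nCloud hosting: 1500\nCatering: 50",
}
expenses = [e for year, text in reports.items() for e in parse_report(year, text)]
totals = total_by_year_and_type(expenses)
conn = sqlite3.connect(":memory:")
populate_database(conn, totals)
```

Only the classification step needs a model here; the arithmetic and database steps stay ordinary, testable code.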

Slide 43

Slide 43 text

At their core, many NLP systems consist of flat classifications. You can shove them into a single prompt, or you can decompose them into smaller pieces. Many classification tasks are straightforward to solve nowadays – but they become vastly more complicated if one model needs to do them all at once. explosion.ai/blog/human-in-the-loop-distillation

Slide 44

Slide 44 text

explosion.ai/blog/human-in-the-loop-distillation LLM Human-in-the-loop

Slide 45

Slide 45 text

explosion.ai/blog/human-in-the-loop-distillation continuous evaluation baseline LLM Human-in-the-loop

Slide 46

Slide 46 text

explosion.ai/blog/human-in-the-loop-distillation continuous evaluation baseline LLM prompting Human-in-the-loop

Slide 47

Slide 47 text

explosion.ai/blog/human-in-the-loop-distillation continuous evaluation baseline LLM prompting Human-in-the-loop

Slide 48

Slide 48 text

explosion.ai/blog/human-in-the-loop-distillation continuous evaluation baseline LLM prompting transfer learning Human-in-the-loop

Slide 49

Slide 49 text

explosion.ai/blog/human-in-the-loop-distillation continuous evaluation baseline LLM prompting transfer learning distilled model Human-in-the-loop

Slide 50

Slide 50 text

explosion.ai/blog/human-in-the-loop-distillation continuous evaluation baseline LLM prompting transfer learning distilled model Human-in-the-loop production
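
One way to read "continuous evaluation" in this workflow is that every candidate, from the LLM baseline to the distilled model, is scored against the same human-reviewed examples. The sketch below assumes that reading. Both predictors are hypothetical placeholders (no real LLM or model is called), and the gold examples are invented, so the resulting scores carry no meaning beyond showing the mechanics.

```python
def accuracy(predict, examples):
    """Fraction of (text, label) examples the predictor gets right."""
    correct = sum(1 for text, label in examples if predict(text) == label)
    return correct / len(examples)


def compare(candidates, examples):
    """Score every candidate predictor on the same evaluation set."""
    return {name: accuracy(predict, examples) for name, predict in candidates.items()}


# Invented, human-reviewed evaluation examples.
gold = [
    ("Invoice for cloud hosting", "IT"),
    ("Team lunch receipt", "OTHER"),
    ("Annual software license", "IT"),
    ("Office plants", "OTHER"),
]


def llm_baseline(text):
    # Placeholder for a prompted LLM call.
    return "IT" if "cloud" in text.lower() else "OTHER"


def distilled_model(text):
    # Placeholder for a small task-specific model.
    lowered = text.lower()
    return "IT" if any(word in lowered for word in ("cloud", "software")) else "OTHER"


scores = compare({"baseline": llm_baseline, "distilled": distilled_model}, gold)
```

Keeping the evaluation set fixed is what makes the numbers comparable as the system moves from prompting toward production.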

Slide 51

Slide 51 text

sort these documents into custom categories create data and train a classifier for custom categories
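
As a rough illustration of the "train a classifier" side, the sketch below fits a tiny multinomial Naive Bayes model using only the standard library. Naive Bayes is a choice made for this example, not something named in the talk, and the labeled examples are invented. In practice they would come from LLM-assisted annotation reviewed by a human, and a real project would more likely use an established library.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def train(examples):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in tokenize(text):
            # Add-one smoothing so unseen words do not zero out a class.
            score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training examples for two custom categories.
training_data = [
    ("cloud hosting invoice", "IT"),
    ("software license renewal", "IT"),
    ("team lunch catering", "OTHER"),
    ("office furniture delivery", "OTHER"),
]
model = train(training_data)
```

Calling `predict(model, "cloud software invoice")` then classifies new text locally, with no model API involved at runtime.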

Slide 52

Slide 52 text

No content

Slide 53

Slide 53 text

Pareto Frontier for AI models Cost Accuracy
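
For readers unfamiliar with the term, a Pareto frontier over cost and accuracy keeps only the models that no other model beats on both axes at once. A minimal sketch follows. The candidate names and numbers are hypothetical and do not describe any real model.

```python
def pareto_frontier(models):
    """Return the (name, cost, accuracy) entries not dominated by another.

    A model is dominated if some other model is at least as cheap and at
    least as accurate, and strictly better on one of the two.
    """
    frontier = []
    for name, cost, acc in models:
        dominated = any(
            other_cost <= cost
            and other_acc >= acc
            and (other_cost < cost or other_acc > acc)
            for _, other_cost, other_acc in models
        )
        if not dominated:
            frontier.append((name, cost, acc))
    return sorted(frontier, key=lambda m: m[1])


# Hypothetical candidates: (name, cost per 1k documents, accuracy).
candidates = [
    ("A", 1.0, 0.70),
    ("B", 2.0, 0.80),
    ("C", 3.0, 0.75),
    ("D", 5.0, 0.90),
]
frontier = pareto_frontier(candidates)
```

Here `C` drops out because `B` is both cheaper and more accurate; the remaining models each represent a different cost/accuracy trade-off.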

Slide 54

Slide 54 text

Pareto Frontier for AI models Cost Accuracy LLMs as developer tools change the calculation!

Slide 55

Slide 55 text

Pareto Frontier for AI models Cost Accuracy LLMs as developer tools change the calculation! runtime → development use LLMs to create runtime system

Slide 56

Slide 56 text

Pareto Frontier for AI models Cost Accuracy LLMs as developer tools change the calculation! write code runtime → development use LLMs to create runtime system

Slide 57

Slide 57 text

Pareto Frontier for AI models Cost Accuracy LLMs as developer tools change the calculation! write code create data runtime → development use LLMs to create runtime system

Slide 58

Slide 58 text

Pareto Frontier for AI models Cost Accuracy LLMs as developer tools change the calculation! write code create data train classifiers runtime → development use LLMs to create runtime system

Slide 59

Slide 59 text

Pareto Frontier for AI models Cost Accuracy LLMs as developer tools change the calculation! write code create data train classifiers strategize runtime → development use LLMs to create runtime system

Slide 60

Slide 60 text

Use LLMs to build the system, not as the system.

Slide 61

Slide 61 text

Use LLMs to build the system, not as the system. There’s no need to compromise on development best practices or privacy.

Slide 62

Slide 62 text

Use LLMs to build the system, not as the system. There’s no need to compromise on development best practices or privacy. Code is more important than ever – not less!

Slide 63

Slide 63 text

Use LLMs to build the system, not as the system. There’s no need to compromise on development best practices or privacy. Code is more important than ever – not less!* * This includes the open-source ecosystem!

Slide 64

Slide 64 text

Explosion spaCy Prodigy Bluesky Mastodon explosion.ai spacy.io prodigy.ai @inesmontani.bsky.social @[email protected] LinkedIn