Slide 1

Slide 1 text

spaCy v3: Bringing state-of-the-art NLP from prototype to production @_inesmontani @honnibal

Slide 2

Slide 2 text

Open-source library for industrial-strength Natural Language Processing spacy.io

Slide 3

Slide 3 text

Open-source library for industrial-strength Natural Language Processing spacy.io 23m+ DOWNLOADS 17k+ GITHUB STARS

Slide 4

Slide 4 text

spacy.io cheap to run easy to use

Slide 5

Slide 5 text

spacy.io cheap to run easy to use big ecosystem

Slide 6

Slide 6 text

spacy.io cheap to run easy to use big ecosystem v3 now even more powerful

Slide 7

Slide 7 text

State-of-the-art transformer-based pipelines

Slide 8

Slide 8 text

Benchmarks (spacy.io/usage/facts-figures)

Pipeline           Parser   Tagger   NER    Speed
en_core_web_trf    95.8     98.1     90.6   4k wps    (NEW)
en_core_web_lg     92.0     97.2     87.0   30k wps
SOTA               96.2     98.3     89.7   -

Slide 9

Slide 9 text

spacy.io/usage/embeddings-transformers Entity Recognizer Dependency Parser Transformer ... packages with state-of-the-art trained pipelines en_core_web_trf

Slide 10

Slide 10 text

spacy.io/usage/embeddings-transformers Entity Recognizer Dependency Parser Transformer ... BERT DistilBERT RoBERTa ... easily use any transformer in your pipeline packages with state-of-the-art trained pipelines en_core_web_trf

Slide 11

Slide 11 text

spacy.io/usage/embeddings-transformers Entity Recognizer Dependency Parser Transformer ... share one transformer across your whole pipeline BERT DistilBERT RoBERTa ... easily use any transformer in your pipeline packages with state-of-the-art trained pipelines en_core_web_trf

Slide 12

Slide 12 text

spacy.io/usage/embeddings-transformers Entity Recognizer Dependency Parser Transformer ... share one transformer across your whole pipeline BERT DistilBERT RoBERTa ... easily use any transformer in your pipeline update one transformer from multiple components packages with state-of-the-art trained pipelines en_core_web_trf
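
The idea of sharing one transformer across the pipeline, and updating it from several components, can be illustrated with a toy sketch. This is not spaCy's implementation: `SharedEncoder`, `run_pipeline` and the placeholder "features" are hypothetical names chosen for illustration, and the gradients are plain numbers.

```python
# Toy illustration only; all class and function names are hypothetical.
from typing import Callable, Dict, List

Features = List[float]


class SharedEncoder:
    """Stands in for a transformer that is expensive to run."""

    def __init__(self) -> None:
        self.calls = 0
        self.pending: List[float] = []

    def encode(self, text: str) -> Features:
        self.calls += 1
        # Placeholder features: one number per whitespace-separated token.
        return [float(len(token)) for token in text.split()]

    def receive_gradient(self, gradient: float) -> None:
        # Each downstream component sends its gradient back here.
        self.pending.append(gradient)

    def finish_update(self) -> float:
        # Combine all gradients into one update instead of one per component.
        total = sum(self.pending)
        self.pending.clear()
        return total


def run_pipeline(
    text: str,
    encoder: SharedEncoder,
    components: Dict[str, Callable[[Features], float]],
) -> Dict[str, float]:
    features = encoder.encode(text)  # computed once, shared by all components
    return {name: component(features) for name, component in components.items()}


encoder = SharedEncoder()
components = {
    "parser": lambda features: len(features),
    "ner": lambda features: max(features),
}
outputs = run_pipeline("share one transformer", encoder, components)

# Two components report gradients; the encoder applies them together.
encoder.receive_gradient(0.25)
encoder.receive_gradient(0.5)
combined = encoder.finish_update()
```

The point of the design is that the expensive encoder runs once per text no matter how many components consume its output.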

Slide 13

Slide 13 text

Declarative configuration system

Slide 14

Slide 14 text

CONFIGURE ALL THE THINGS!

Slide 15

Slide 15 text

CONFIGURE ALL THE THINGS! MODEL ARCHITECTURES!

Slide 16

Slide 16 text

CONFIGURE ALL THE THINGS! MODEL ARCHITECTURES! HYPERPARAMETERS!

Slide 17

Slide 17 text

CONFIGURE ALL THE THINGS! MODEL ARCHITECTURES! HYPERPARAMETERS! TRAINING SETTINGS!

Slide 18

Slide 18 text

CONFIGURE ALL THE THINGS! MODEL ARCHITECTURES! HYPERPARAMETERS! TRAINING SETTINGS! LOADING!

Slide 19

Slide 19 text

CONFIGURE ALL THE THINGS! MODEL ARCHITECTURES! HYPERPARAMETERS! TRAINING SETTINGS! LOADING! METRICS!

Slide 20

Slide 20 text

problems.py

Slide 21

Slide 21 text

problem #1: nested defaults become hidden defaults problems.py

Slide 22

Slide 22 text

problem #1: nested defaults become hidden defaults problem #2: defaults will conflict problems.py

Slide 23

Slide 23 text

problem #1: nested defaults become hidden defaults problem #2: defaults will conflict problem #3: difficult to iterate, swap, mix and match problems.py
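
The slide's problems.py is not included in this transcript. A minimal sketch of the kind of code that shows these three problems might look like the following; every function name and default value here is invented for illustration.

```python
# Hypothetical example; none of these functions come from the talk.
from typing import Any, Dict, Optional


def make_optimizer(learn_rate: float = 0.001) -> Dict[str, Any]:
    return {"learn_rate": learn_rate}


def make_model(
    width: int = 96, optimizer: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    # Problem #1: callers of make_model() never see learn_rate=0.001,
    # because the default is buried one level down.
    if optimizer is None:
        optimizer = make_optimizer()
    return {"width": width, "optimizer": optimizer}


def train_small() -> Dict[str, Any]:
    return make_model()


def train_large() -> Dict[str, Any]:
    # Problem #2: this entry point quietly disagrees about the width default.
    return make_model(width=300)


# Problem #3: swapping the optimizer means threading a new argument through
# every function in between, so experiments are hard to mix and match.
small = train_small()
large = train_large()
```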

Slide 24

Slide 24 text

config.cfg

Slide 25

Slide 25 text

config.cfg structured sections

Slide 26

Slide 26 text

config.cfg registered functions structured sections

Slide 27

Slide 27 text

config.cfg registered functions resolved bottom-up structured sections

Slide 28

Slide 28 text

config.cfg registered functions resolved bottom-up variable interpolation structured sections
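
To make "registered functions", "resolved bottom-up" and "variable interpolation" concrete, here is a small stdlib-only sketch. It is not spaCy's config system: the `@function` key, the `REGISTRY` dict and the `${section.key}` handling are simplified assumptions, and the config is a nested dict rather than a parsed file.

```python
# Simplified illustration; the "@function" key and helper names are assumptions.
from typing import Any, Callable, Dict

REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        REGISTRY[name] = func
        return func
    return decorator


@register("toy_optimizer.v1")
def make_optimizer(learn_rate: float) -> Dict[str, Any]:
    return {"learn_rate": learn_rate}


@register("toy_trainer.v1")
def make_trainer(optimizer: Dict[str, Any], max_steps: int) -> Dict[str, Any]:
    return {"optimizer": optimizer, "max_steps": max_steps}


def interpolate(value: Any, root: Dict[str, Any]) -> Any:
    # Replace a "${section.key}" string with the value it points to.
    if isinstance(value, str) and value.startswith("${") and value.endswith("}"):
        node: Any = root
        for key in value[2:-1].split("."):
            node = node[key]
        return node
    return value


def resolve(block: Dict[str, Any], root: Dict[str, Any]) -> Any:
    # Bottom-up: resolve nested blocks first, then call this block's function.
    resolved: Dict[str, Any] = {}
    for key, value in block.items():
        if isinstance(value, dict):
            resolved[key] = resolve(value, root)
        else:
            resolved[key] = interpolate(value, root)
    if "@function" in resolved:
        func = REGISTRY[resolved.pop("@function")]
        return func(**resolved)
    return resolved


CONFIG = {
    "vars": {"learn_rate": 0.001},
    "training": {
        "@function": "toy_trainer.v1",
        "max_steps": 1000,
        "optimizer": {
            "@function": "toy_optimizer.v1",
            "learn_rate": "${vars.learn_rate}",
        },
    },
}

resolved = resolve(CONFIG, CONFIG)
```

Because the optimizer block is resolved before the trainer block that contains it, the trainer function receives a finished optimizer object rather than raw settings, and every value it depends on is visible in one place.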

Slide 29

Slide 29 text

code.py config.cfg

Slide 30

Slide 30 text

code.py config.cfg custom functions

Slide 31

Slide 31 text

code.py config.cfg type-based validation custom functions
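
Type-based validation of custom functions can be sketched with nothing but the standard library: read the function's type hints and check the config values against them before calling it. The helper `validate_and_call` and the `make_batcher` function below are hypothetical, and the check only handles plain classes such as `int` and `bool`; it is a stand-in for the idea, not the validation spaCy performs.

```python
# Hypothetical helper; handles only simple class annotations like int or bool.
from typing import Any, Callable, Dict, get_type_hints


def validate_and_call(func: Callable[..., Any], settings: Dict[str, Any]) -> Any:
    hints = get_type_hints(func)
    hints.pop("return", None)  # only arguments are validated
    for name, expected in hints.items():
        if name not in settings:
            raise TypeError(f"missing setting: {name}")
        value = settings[name]
        if not isinstance(value, expected):
            raise TypeError(
                f"{name} should be {expected.__name__}, got {type(value).__name__}"
            )
    return func(**settings)


def make_batcher(size: int, shuffle: bool) -> Dict[str, Any]:
    return {"size": size, "shuffle": shuffle}


# Valid settings are passed through to the function.
batcher = validate_and_call(make_batcher, {"size": 32, "shuffle": True})

# A string where an int is expected is rejected before the function runs.
try:
    validate_and_call(make_batcher, {"size": "32", "shuffle": True})
    error = ""
except TypeError as exc:
    error = str(exc)
```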

Slide 32

Slide 32 text

spacy.io/usage/training

Slide 33

Slide 33 text

spacy.io/usage/training training config

Slide 34

Slide 34 text

custom registered functions and code spacy.io/usage/training training config

Slide 35

Slide 35 text

custom registered functions and code config overrides (data paths) spacy.io/usage/training training config
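
Config overrides such as data paths are given as dotted names (for example `paths.train`) that point into the config's sections. A minimal sketch of how such overrides could be applied to a nested config follows; `apply_overrides` is a hypothetical helper and the config keys are only examples.

```python
# Hypothetical helper showing how dotted overrides map onto nested sections.
import copy
from typing import Any, Dict


def apply_overrides(
    config: Dict[str, Any], overrides: Dict[str, Any]
) -> Dict[str, Any]:
    result = copy.deepcopy(config)  # leave the base config untouched
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return result


base_config = {
    "paths": {"train": None, "dev": None},
    "training": {"max_steps": 1000},
}
overridden = apply_overrides(
    base_config,
    {"paths.train": "./train.spacy", "paths.dev": "./dev.spacy"},
)
```

Keeping machine-specific values like paths out of the config file means the same config can be reused unchanged across machines and experiments.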

Slide 36

Slide 36 text

spacy.io/usage/training

Slide 37

Slide 37 text

spacy.io/usage/training generate starter config

Slide 38

Slide 38 text

components to train spacy.io/usage/training generate starter config

Slide 39

Slide 39 text

optimize model settings for accuracy or efficiency components to train spacy.io/usage/training generate starter config

Slide 40

Slide 40 text

step #1: train and package pipeline

Slide 41

Slide 41 text

step #1: train and package pipeline step #2: get pip-installable Python package

Slide 42

Slide 42 text

step #1: train and package pipeline step #2: get pip-installable Python package en_your_pipeline

Slide 43

Slide 43 text

step #1: train and package pipeline step #2: get pip-installable Python package en_your_pipeline step #3: ship and use pipeline

Slide 44

Slide 44 text

Workflows for end-to-end projects

Slide 45

Slide 45 text

spacy.io/usage/projects

Slide 46

Slide 46 text

spacy.io/usage/projects project CLI

Slide 47

Slide 47 text

spacy.io/usage/projects project CLI

Slide 48

Slide 48 text

project.yml

Slide 49

Slide 49 text

project.yml data assets

Slide 50

Slide 50 text

project.yml data assets named workflows

Slide 51

Slide 51 text

project.yml data assets named workflows commands

Slide 52

Slide 52 text

project.yml data assets named workflows commands file dependencies
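
Declaring file dependencies for each command makes it possible to skip a step when nothing it depends on has changed. The sketch below illustrates that idea with content hashes; `ToyCommand` and `file_hash` are hypothetical names, and this is not how the project CLI is implemented.

```python
# Hypothetical sketch: re-run a command only when its dependencies change.
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Optional


def file_hash(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


class ToyCommand:
    def __init__(
        self, name: str, deps: List[Path], action: Callable[[], None]
    ) -> None:
        self.name = name
        self.deps = deps
        self.action = action
        self.last_hashes: Optional[Dict[str, str]] = None

    def run_if_needed(self) -> bool:
        current = {str(dep): file_hash(dep) for dep in self.deps}
        if current == self.last_hashes:
            return False  # dependencies unchanged, skip the command
        self.action()
        self.last_hashes = current
        return True


history: List[bool] = []
with tempfile.TemporaryDirectory() as tmp:
    corpus = Path(tmp) / "train.txt"
    corpus.write_text("first version")
    train = ToyCommand("train", [corpus], lambda: None)
    history.append(train.run_if_needed())  # first run
    history.append(train.run_if_needed())  # nothing changed
    corpus.write_text("second version")
    history.append(train.run_if_needed())  # dependency changed
```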

Slide 53

Slide 53 text

github.com/explosion/projects

Slide 54

Slide 54 text

config.cfg built-in logger for tracking experiments

Slide 55

Slide 55 text

config.cfg built-in logger for tracking experiments

Slide 56

Slide 56 text

config.cfg built-in logger for tracking experiments log all config settings & discover correlations

Slide 57

Slide 57 text

parallel & distributed training

Slide 58

Slide 58 text

parallel & distributed training asynchronous stochastic gradient descent

Slide 59

Slide 59 text

build interactive spaCy apps

Slide 60

Slide 60 text

build interactive spaCy apps visualizer.py

Slide 61

Slide 61 text

build interactive spaCy apps visualizer.py

Slide 62

Slide 62 text

Trainable & rule-based components

Slide 63

Slide 63 text

Doc Text

Slide 64

Slide 64 text

Doc Dependency Parser Text

Slide 65

Slide 65 text

Doc Part-of-speech Tagger Dependency Parser Text

Slide 66

Slide 66 text

Doc Part-of-speech Tagger Dependency Parser Entity Recognizer Text

Slide 67

Slide 67 text

Doc Part-of-speech Tagger Dependency Parser Entity Recognizer Text Categorizer Text

Slide 68

Slide 68 text

Doc Part-of-speech Tagger Dependency Parser Entity Recognizer Text Categorizer Entity Linker Text

Slide 69

Slide 69 text

Doc Part-of-speech Tagger Dependency Parser Entity Recognizer Lemmatizer Text Categorizer Entity Linker Text

Slide 70

Slide 70 text

Doc Part-of-speech Tagger Dependency Parser Entity Recognizer Lemmatizer Text Categorizer Entity Linker Entity Ruler Text

Slide 71

Slide 71 text

Doc Part-of-speech Tagger Dependency Parser Entity Recognizer Lemmatizer Text Categorizer Entity Linker Sentence Recognizer Entity Ruler Text

Slide 72

Slide 72 text

Doc Part-of-speech Tagger Dependency Parser Entity Recognizer Lemmatizer Text Categorizer Entity Linker Sentence Recognizer Entity Ruler ... Text
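
The diagram built up on these slides shows text becoming a Doc that is passed through a sequence of components, each adding its own annotations. A toy version of that flow is sketched below; `ToyDoc`, `toy_tagger` and `toy_entity_ruler` are hypothetical stand-ins, not spaCy classes.

```python
# Toy pipeline; all names are hypothetical stand-ins for the real components.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ToyDoc:
    words: List[str]
    annotations: Dict[str, Any] = field(default_factory=dict)


Component = Callable[[ToyDoc], ToyDoc]


def toy_tagger(doc: ToyDoc) -> ToyDoc:
    doc.annotations["tags"] = [
        "NUM" if word.isdigit() else "WORD" for word in doc.words
    ]
    return doc


def toy_entity_ruler(doc: ToyDoc) -> ToyDoc:
    # Rule-based: reuses the tags set by the earlier component.
    tags = doc.annotations["tags"]
    doc.annotations["ents"] = [
        word for word, tag in zip(doc.words, tags) if tag == "NUM"
    ]
    return doc


def run_pipeline(text: str, pipeline: List[Tuple[str, Component]]) -> ToyDoc:
    doc = ToyDoc(words=text.split())
    for _name, component in pipeline:
        doc = component(doc)  # each component receives and returns the doc
    return doc


doc = run_pipeline(
    "order 66 shipped in 2 days",
    [("tagger", toy_tagger), ("entity_ruler", toy_entity_ruler)],
)
```

Because every component has the same signature, trainable and rule-based components can be mixed in one pipeline and reordered freely, as long as each one finds the annotations it relies on.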

Slide 73

Slide 73 text

config.cfg customize settings, hyperparameters and architectures

Slide 74

Slide 74 text

config.cfg

Slide 75

Slide 75 text

config.cfg source components from trained pipelines

Slide 76

Slide 76 text

config.cfg choose which components to update during training source components from trained pipelines

Slide 77

Slide 77 text

spacy.io/usage/processing-pipelines custom pipeline component

Slide 78

Slide 78 text

spacy.io/usage/processing-pipelines custom pipeline component

Slide 79

Slide 79 text

spacy.io/usage/processing-pipelines custom pipeline component config.cfg

Slide 80

Slide 80 text

custom pipeline component factory

Slide 81

Slide 81 text

config.cfg custom pipeline component factory

Slide 82

Slide 82 text

config.cfg factory function custom pipeline component factory

Slide 83

Slide 83 text

config.cfg factory function factory arguments custom pipeline component factory
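
The relationship between a factory function, its arguments and the matching config block can be sketched without spaCy. In this illustration the `FACTORIES` registry, the `factory` decorator and the `toy_length_flagger` component are all hypothetical; the config block is a plain dict whose `factory` entry names the function and whose remaining entries become its arguments.

```python
# Hypothetical factory registry; not spaCy's API.
from typing import Any, Callable, Dict, List

ComponentFn = Callable[[List[str]], Dict[str, Any]]
FACTORIES: Dict[str, Callable[..., ComponentFn]] = {}


def factory(name: str) -> Callable[[Callable[..., ComponentFn]], Callable[..., ComponentFn]]:
    def decorator(func: Callable[..., ComponentFn]) -> Callable[..., ComponentFn]:
        FACTORIES[name] = func
        return func
    return decorator


@factory("toy_length_flagger")
def create_length_flagger(name: str, max_words: int) -> ComponentFn:
    # The factory receives its settings once and returns the component.
    def component(words: List[str]) -> Dict[str, Any]:
        return {"component": name, "too_long": len(words) > max_words}
    return component


def build_component(name: str, block: Dict[str, Any]) -> ComponentFn:
    settings = dict(block)
    create = FACTORIES[settings.pop("factory")]
    return create(name, **settings)  # remaining entries are factory arguments


flagger = build_component(
    "flagger", {"factory": "toy_length_flagger", "max_words": 3}
)
short_result = flagger(["a", "b"])
long_result = flagger(["a", "b", "c", "d"])
```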

Slide 84

Slide 84 text

Custom models in any framework

Slide 85

Slide 85 text

thinc.ai type-checked functional programming API for composing models

Slide 86

Slide 86 text

thinc.ai type-checked functional programming API for composing models wrap layers defined in any framework
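
A functional style of composing models can be sketched as follows: each layer is a function that returns its output together with a callback for the backward pass, and a combinator chains layers together. This is a stdlib-only illustration under that assumption; `scale`, `relu` and `chain` here are toy functions written for this sketch, not Thinc's API.

```python
# Toy functional layers: forward returns (output, backprop callback).
from typing import Callable, List, Tuple

Vector = List[float]
Backprop = Callable[[Vector], Vector]
Layer = Callable[[Vector], Tuple[Vector, Backprop]]


def scale(factor: float) -> Layer:
    def forward(X: Vector) -> Tuple[Vector, Backprop]:
        Y = [factor * x for x in X]

        def backprop(dY: Vector) -> Vector:
            return [factor * dy for dy in dY]

        return Y, backprop
    return forward


def relu() -> Layer:
    def forward(X: Vector) -> Tuple[Vector, Backprop]:
        Y = [max(0.0, x) for x in X]

        def backprop(dY: Vector) -> Vector:
            # Gradient passes through only where the input was positive.
            return [dy if x > 0 else 0.0 for dy, x in zip(dY, X)]

        return Y, backprop
    return forward


def chain(*layers: Layer) -> Layer:
    def forward(X: Vector) -> Tuple[Vector, Backprop]:
        callbacks: List[Backprop] = []
        for layer in layers:
            X, callback = layer(X)
            callbacks.append(callback)

        def backprop_all(dY: Vector) -> Vector:
            # Run the callbacks in reverse order of the forward pass.
            for callback in reversed(callbacks):
                dY = callback(dY)
            return dY

        return X, backprop_all
    return forward


model = chain(scale(2.0), relu())
Y, backprop = model([1.0, -3.0])
dX = backprop([1.0, 1.0])
```

Since a layer is just a function with this signature, a layer defined in another framework could in principle be wrapped by writing a forward function that delegates to it.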

Slide 87

Slide 87 text

spacy.io/usage/layers-architectures wrap custom PyTorch model config.cfg set parameters in config

Slide 88

Slide 88 text

Type hints and type-based validation

Slide 89

Slide 89 text

Model

Slide 90

Slide 90 text

Model Model[InputT, OutputT] generic types

Slide 91

Slide 91 text

Model custom array types Floats2d Model[InputT, OutputT] generic types

Slide 92

Slide 92 text

Model custom array types Floats2d Ints1d ... Padded Ragged Model[InputT, OutputT] generic types
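
The value of a model type that is generic over its input and output can be shown with `typing.Generic`. The `ToyModel` class and `chain` function below are hypothetical and use ordinary Python types in place of array types such as Floats2d; they are a sketch of the idea, not Thinc's `Model` class.

```python
# Hypothetical generic model type; uses plain Python types instead of arrays.
from typing import Callable, Generic, List, TypeVar

InT = TypeVar("InT")
MidT = TypeVar("MidT")
OutT = TypeVar("OutT")


class ToyModel(Generic[InT, OutT]):
    def __init__(self, name: str, forward: Callable[[InT], OutT]) -> None:
        self.name = name
        self.forward = forward

    def predict(self, X: InT) -> OutT:
        return self.forward(X)


def chain(
    first: "ToyModel[InT, MidT]", second: "ToyModel[MidT, OutT]"
) -> "ToyModel[InT, OutT]":
    # The output type of the first model must match the input of the second.
    return ToyModel(
        f"{first.name}>>{second.name}",
        lambda X: second.predict(first.predict(X)),
    )


tokenize: ToyModel[str, List[str]] = ToyModel("tokenize", str.split)
count: ToyModel[List[str], int] = ToyModel("count", len)

pipeline = chain(tokenize, count)
n_tokens = pipeline.predict("catch errors as you type")
```

With annotations like these, a static type checker such as mypy would be expected to flag `chain(count, tokenize)`, because `int` does not match the `str` input that `tokenize` declares, even though nothing fails until the code runs.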

Slide 93

Slide 93 text

No content

Slide 94

Slide 94 text

expected return types

Slide 95

Slide 95 text

Y: Floats3d Incompatible return value type (got "Tuple[Floats3d, Callable[[Any], Any]]", expected return types

Slide 96

Slide 96 text

Y: Floats3d Incompatible return value type (got "Tuple[Floats3d, Callable[[Any], Any]]", expected return types static analysis: catch errors as you type

Slide 97

Slide 97 text

No content

Slide 98

Slide 98 text

mypy.ini optional mypy plugin for more checks

Slide 99

Slide 99 text

Relu: Relu Layer outputs type (thinc.types.Floats2d) but the next layer expects (thinc.types.Ragged) as an input mypy.ini optional mypy plugin for more checks

Slide 100

Slide 100 text

Relu: Relu Layer outputs type (thinc.types.Floats2d) but the next layer expects (thinc.types.Ragged) as an input static analysis: catch errors as you type mypy.ini optional mypy plugin for more checks

Slide 101

Slide 101 text

spacy.io

Slide 102

Slide 102 text

Base support & trained pipelines for many languages spacy.io

Slide 103

Slide 103 text

Base support & trained pipelines for many languages Multi-task learning with transformers like BERT spacy.io

Slide 104

Slide 104 text

Base support & trained pipelines for many languages Multi-task learning with transformers like BERT State-of-the-art speed spacy.io

Slide 105

Slide 105 text

Base support & trained pipelines for many languages Multi-task learning with transformers like BERT State-of-the-art speed Components for NER, tagging, parsing, text classification, entity linking & more spacy.io

Slide 106

Slide 106 text

Base support & trained pipelines for many languages Multi-task learning with transformers like BERT State-of-the-art speed Custom trainable & rule-based components Components for NER, tagging, parsing, text classification, entity linking & more spacy.io

Slide 107

Slide 107 text

Base support & trained pipelines for many languages Multi-task learning with transformers like BERT State-of-the-art speed Custom trainable & rule-based components Components for NER, tagging, parsing, text classification, entity linking & more Production-ready training system, model packaging & workflow management spacy.io

Slide 108

Slide 108 text

Base support & trained pipelines for many languages Awesome ecosystem Multi-task learning with transformers like BERT State-of-the-art speed Custom trainable & rule-based components Components for NER, tagging, parsing, text classification, entity linking & more Production-ready training system, model packaging & workflow management spacy.io

Slide 109

Slide 109 text

spacy.io/usage/v3 @spacy_io @_inesmontani @honnibal

Slide 110

Slide 110 text

spacy.io/usage/v3 @spacy_io @_inesmontani @honnibal install spaCy v3 from pip or conda

Slide 111

Slide 111 text

spacy.io/usage/v3 @spacy_io @_inesmontani @honnibal documentation and quickstart install spaCy v3 from pip or conda

Slide 112

Slide 112 text

spacy.io/usage/v3 @spacy_io @_inesmontani @honnibal documentation and quickstart install spaCy v3 from pip or conda thank you!

Slide 113

Slide 113 text

No content