Huge transformer models like BERT, GPT-2 and XLNet have set a new standard for accuracy on almost every Natural Language Processing leaderboard. However, these models are very new, and most of the software ecosystem surrounding them is oriented towards the many opportunities for further research that they provide. In this talk, I’ll describe how you can now use these models in spaCy, a popular library for putting Natural Language Processing to work on real problems. I’ll also discuss the many opportunities that new transfer learning technologies can offer production NLP, regardless of which specific software packages you choose to get the job done.
PhD in Computer Science in 2009.
10 years publishing research on state-of-the-art natural language understanding systems.
Left academia in 2014 to develop spaCy.
Programmer and front-end developer with a degree in media science and linguistics.
Has been working on spaCy since its first release.
Lead developer of Prodigy.
spaCy: 100k+ users worldwide, 15k stars on GitHub, 60+ extension packages
Prodigy: 2500+ users, including 1200+ forum members
spaCy’s NLP pipeline [diagrams: the pipeline with shared representations vs. without shared representations]
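To make the pipeline concrete, here is a minimal sketch using spaCy’s standard API; it assumes the en_core_web_sm model package is installed, and the component names shown are those of the v2 English models.

```python
import spacy

# Load a standard English pipeline. After the tokenizer, each component
# (tagger, parser, NER) reads and annotates the same Doc object in turn.
nlp = spacy.load("en_core_web_sm")
print(nlp.pipe_names)  # ['tagger', 'parser', 'ner'] in the v2 models

doc = nlp("spaCy is a library for Natural Language Processing in Python.")

# Each component has written its annotations onto the shared Doc.
for token in doc[:4]:
    print(token.text, token.pos_, token.dep_)
print([(ent.text, ent.label_) for ent in doc.ents])
```

The contrast in the diagrams is about where the token representations come from: with sharing, every component reuses the output of one encoder (such as a transformer); without sharing, each component computes its own representations independently.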
• Functions should be small
• Avoid state and side-effects
• Lots of systems from
• Small functions make you
• Without state, models can’t learn
• ML models aren’t really functions (see the sketch below)
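To make that contrast concrete, here is a small illustrative sketch (the function and class names are hypothetical, not from the talk): engineering best practice favors small, stateless functions, but a trained model is essentially a bundle of learned state.

```python
# Classic engineering style: a small, pure function. Same input,
# same output, nothing hidden.
def normalize(text: str) -> str:
    return " ".join(text.lower().split())

# A trained model is different: its behavior lives in learned weights,
# i.e. state that is invisible in the function signature.
class SentimentModel:
    def __init__(self, weights: dict):
        self.weights = weights  # learned state baked in during training

    def predict(self, text: str) -> float:
        return sum(self.weights.get(tok, 0.0) for tok in normalize(text).split())

model = SentimentModel({"great": 1.0, "terrible": -1.0})
print(model.predict("A GREAT movie"))    # 1.0
print(model.predict("Terrible pacing"))  # -1.0
```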
What you can do
Pros:
• Easy network design
• Great accuracy
• Need few annotated examples
Cons:
• Slow / expensive
• Need large batches (sketch below)
• Bleeding edge
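Because transformers are slow and expensive per call, throughput in practice comes from batching. A minimal sketch using spaCy’s standard nlp.pipe API; the model name and batch size are arbitrary stand-ins:

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # stand-in; the point matters even more for transformer pipelines

texts = ["Example document number %d." % i for i in range(1000)]

# nlp.pipe streams texts through the pipeline in batches, which keeps
# the model fed with large batches instead of one document at a time.
for doc in nlp.pipe(texts, batch_size=256):
    _ = doc.ents  # consume the annotations
```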
• pip install spacy-transformers
• Supports textcat, aligned tokenization, custom models
• Coming soon: NER, tagging, dependency parsing
• Coming soon: RPC for the transformer components
• Coming soon: Transformers support in Prodigy
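As a usage sketch for the package above: this assumes the en_trf_bertbaseuncased_lg model package from the spacy-transformers 0.x releases, whose extension attribute names are shown as documented at the time and may change in later versions.

```python
import spacy

# Assumes: pip install spacy-transformers
#          python -m spacy download en_trf_bertbaseuncased_lg
nlp = spacy.load("en_trf_bertbaseuncased_lg")

doc = nlp("Transfer learning is changing production NLP.")

# The transformer output is exposed through extension attributes and is
# aligned back to spaCy's linguistic tokenization.
print(doc._.trf_word_pieces_)             # wordpiece strings from the transformer's tokenizer
print(doc._.trf_last_hidden_state.shape)  # one row per wordpiece
```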
Follow us on Twitter