Huge transformer models like BERT, GPT-2 and XLNet have set a new standard for accuracy on almost every Natural Language Processing leaderboard. However, these models are very new, and most of the software ecosystem surrounding them is oriented towards the many opportunities for further research that they provide. In this talk, I’ll describe how you can now use these models in spaCy, a popular library for putting Natural Language Processing to work on real problems. I’ll also discuss the many opportunities that new transfer learning technologies can offer production NLP, regardless of which specific software packages you choose to get the job done.
PhD in Computer Science in 2009.
10 years publishing research on state-of-the-art natural language understanding systems.
Left academia in 2014 to develop spaCy.
Programmer and front-end developer with a degree in media science and linguistics.
Has been working on spaCy since its first release.
Lead developer of Prodigy.
spaCy: 100k+ users worldwide, 15k stars on GitHub, 60+ extension packages
Prodigy: 2500+ users, including 1200+ forum members
spaCy’s NLP pipeline [diagrams: the pipeline with shared representations vs. without shared representations]
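To make the pipeline concrete, here is a minimal sketch using spaCy’s standard API; it assumes the en_core_web_sm model package is installed, and the component names shown are those of the v2 English models.

```python
import spacy

# Load a standard English pipeline. After the tokenizer, each component
# (tagger, parser, NER) reads and annotates the same Doc object in turn.
nlp = spacy.load("en_core_web_sm")
print(nlp.pipe_names)  # ['tagger', 'parser', 'ner'] in the v2 models

doc = nlp("spaCy is a library for Natural Language Processing in Python.")

# Each component has written its annotations onto the shared Doc.
for token in doc[:4]:
    print(token.text, token.pos_, token.dep_)
print([(ent.text, ent.label_) for ent in doc.ents])
```

The contrast in the diagrams is about where the token representations come from: with sharing, every component reuses the output of one encoder (such as a transformer); without sharing, each component computes its own representations independently.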
• Functions should be small
• Avoid state and side-effects
• Lots of systems from
• Small functions make you
• Without state, models can’t learn
• ML models aren’t really functions (see the sketch below)
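To make that contrast concrete, here is a small illustrative sketch (the function and class names are hypothetical, not from the talk): engineering best practice favors small, stateless functions, but a trained model is essentially a bundle of learned state.

```python
# Classic engineering style: a small, pure function. Same input,
# same output, nothing hidden.
def normalize(text: str) -> str:
    return " ".join(text.lower().split())

# A trained model is different: its behavior lives in learned weights,
# i.e. state that is invisible in the function signature.
class SentimentModel:
    def __init__(self, weights: dict):
        self.weights = weights  # learned state baked in during training

    def predict(self, text: str) -> float:
        return sum(self.weights.get(tok, 0.0) for tok in normalize(text).split())

model = SentimentModel({"great": 1.0, "terrible": -1.0})
print(model.predict("A GREAT movie"))    # 1.0
print(model.predict("Terrible pacing"))  # -1.0
```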
What you can do
Pros:
• Easy network design
• Great accuracy
• Need few annotated examples
Cons:
• Slow / expensive
• Need large batches (sketch below)
• Bleeding edge
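Because transformers are slow and expensive per call, throughput in practice comes from batching. A minimal sketch using spaCy’s standard nlp.pipe API; the model name and batch size are arbitrary stand-ins:

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # stand-in; the point matters even more for transformer pipelines

texts = ["Example document number %d." % i for i in range(1000)]

# nlp.pipe streams texts through the pipeline in batches, which keeps
# the model fed with large batches instead of one document at a time.
for doc in nlp.pipe(texts, batch_size=256):
    _ = doc.ents  # consume the annotations
```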
• pip install spacy-transformers
• Supports textcat, aligned tokenization, custom models
• Coming soon: NER, tagging, dependency parsing
• Coming soon: RPC for the transformer components
• Coming soon: Transformers support in Prodigy
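As a usage sketch for the package above: this assumes the en_trf_bertbaseuncased_lg model package from the spacy-transformers 0.x releases, whose extension attribute names are shown as documented at the time and may change in later versions.

```python
import spacy

# Assumes: pip install spacy-transformers
#          python -m spacy download en_trf_bertbaseuncased_lg
nlp = spacy.load("en_trf_bertbaseuncased_lg")

doc = nlp("Transfer learning is changing production NLP.")

# The transformer output is exposed through extension attributes and is
# aligned back to spaCy's linguistic tokenization.
print(doc._.trf_word_pieces_)             # wordpiece strings from the transformer's tokenizer
print(doc._.trf_last_hidden_state.shape)  # one row per wordpiece
```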
Follow us on Twitter