Slide 1

Slide 1 text

Hello!
I am Olga Minguett.
I am a Data Scientist.
I work with Optum.
I am a Masters in AI student.

Slide 2

Slide 2 text

© 2021 Optum, Inc. All rights reserved. Confidential property of Optum. Do not distribute or reproduce without express permission from Optum. 2

Slide 3

Slide 3 text

Text Classification using Huggingface Transformers
Olga Minguett
17 August 2021

Slide 4

Slide 4 text

1. Theory: Huggingface Transformers

Slide 5

Slide 5 text

Key terms
Natural Language Processing (NLP): The ability of a computer program to process and understand human language, as it is spoken and written.
Natural Language Understanding (NLU): Enables computers to interpret human language using syntactic and semantic analysis of text and speech.
Natural Language Generation (NLG): Generates written or spoken human language as a response, starting from structured data produced by the system.
Examples:
● Customer feedback analysis
● Automatic translation
● Email classification
● Text summarisation

Slide 6

Slide 6 text

What is … Huggingface Transformers?
A library that contains state-of-the-art pretrained models for NLP to perform tasks on texts such as:
• Feature Extraction
• Fill-Mask
• Named Entity Recognition
• Question Answering
• Sentiment Analysis
• Summarisation
• Text Generation
• Translation
Key features:
• APIs to download and use those pretrained models on a given text, and to fine-tune them on your own datasets.
• Each module defining an architecture is fully standalone and can be modified to enable research experiments.
• Integration with Jax, PyTorch and TensorFlow.
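
The "download and use" API on this slide can be sketched with the library's `pipeline` helper. This is a minimal illustration, not part of the original deck: the `PIPELINE_TASKS` mapping and the `classify` function are hypothetical names introduced here, and the translation entry picks one language pair purely as an example. It assumes `transformers` and a backend such as PyTorch are installed, and that the default checkpoint can be downloaded on first use.

```python
# Slide task name -> task identifier accepted by transformers.pipeline().
PIPELINE_TASKS = {
    "Feature Extraction": "feature-extraction",
    "Fill-Mask": "fill-mask",
    "Named Entity Recognition": "ner",
    "Question Answering": "question-answering",
    "Sentiment Analysis": "sentiment-analysis",
    "Summarisation": "summarization",
    "Text Generation": "text-generation",
    "Translation": "translation_en_to_fr",  # one language pair, chosen for illustration
}


def classify(texts, task_name="Sentiment Analysis"):
    """Run a pretrained text classifier over a list of strings.

    Assumes `transformers` plus PyTorch or TensorFlow are installed, and that
    the default checkpoint for the task can be downloaded on first use.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    classifier = pipeline(PIPELINE_TASKS[task_name])
    # Each result is a dict with a "label" and a "score".
    return classifier(texts)


# Example call (downloads a model the first time it runs):
# classify(["I love this library!", "This is confusing."])
```

Without an explicit model name, `pipeline` falls back to a default checkpoint for the task; a specific model from the hub can be selected with the `model=` argument.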

Slide 7

Slide 7 text

What is … Huggingface Transformers?
Overview of the pretrained Transformer models:
• GPT-like (also called auto-regressive Transformer models)
• BERT-like (also called auto-encoding Transformer models)
• BART/T5-like (also called sequence-to-sequence Transformer models)
Transformers are language models:
• A general (pretrained) model is trained on large amounts of raw text in a self-supervised fashion, where the objective is automatically computed from the inputs of the model, without human-annotated labels.
• A specific model goes through a process of transfer learning, where the pretrained model is fine-tuned in a supervised fashion, with human-annotated labels.
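
To make "the objective is automatically computed from the inputs" concrete, the sketch below derives training targets from raw tokens with no human labels. It is a simplified, standard-library-only illustration: the function names are hypothetical, whitespace splitting stands in for a real tokenizer, and the masking rule is a simplification of what BERT-style models actually use.

```python
import random

MASK = "[MASK]"


def causal_lm_example(tokens):
    """GPT-like objective: predict each token from the tokens before it."""
    inputs = tokens[:-1]   # everything except the last token
    targets = tokens[1:]   # the same sequence shifted left by one
    return inputs, targets


def masked_lm_example(tokens, mask_prob=0.15, seed=0):
    """BERT-like objective: hide some tokens and predict the hidden ones."""
    rng = random.Random(seed)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)    # the model must recover this token
        else:
            inputs.append(tok)
            targets.append(None)   # position ignored by the loss
    return inputs, targets


tokens = "the cat sat on the mat".split()
print(causal_lm_example(tokens))
print(masked_lm_example(tokens, mask_prob=0.3))
```

In both cases the targets come from the text itself. Fine-tuning differs in that each example is paired with a label supplied by a person, such as a sentiment class.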

Slide 8

Slide 8 text

What is … Huggingface Transformers?
• Decoders (GPT-2):
  • Tasks:
    • Causal Language Modeling
    • Natural Language Generation
• Encoders (BERT):
  • Tasks:
    • Sequence Classification
    • Question Answering
    • Masked Language Modeling
    • Natural Language Understanding
• Sequence-to-Sequence (BART/T5):
  • Tasks:
    • Translation
    • Summarization
    • NLU / NLG
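
One way to see how these families map onto the library is through its `AutoModelFor...` classes. The sketch below is illustrative only: the `FAMILIES` table, the task keys, and both helper functions are hypothetical names introduced here, and the checkpoints are common public ones chosen as examples. Loading a model assumes `transformers` and a backend are installed and the checkpoint can be downloaded.

```python
# Architecture family -> (example checkpoint, {task: Auto class name}).
FAMILIES = {
    "decoder": ("gpt2", {"causal-lm": "AutoModelForCausalLM"}),
    "encoder": ("bert-base-uncased", {
        "sequence-classification": "AutoModelForSequenceClassification",
        "question-answering": "AutoModelForQuestionAnswering",
        "masked-lm": "AutoModelForMaskedLM",
    }),
    "seq2seq": ("facebook/bart-base", {"seq2seq-lm": "AutoModelForSeq2SeqLM"}),
}


def auto_class_name(family, task):
    """Return (checkpoint, Auto class name) for a family/task pair."""
    checkpoint, heads = FAMILIES[family]
    if task not in heads:
        raise ValueError(f"{family!r} is not set up for task {task!r}")
    return checkpoint, heads[task]


def load(family, task, **kwargs):
    """Download the tokenizer and model for a family/task pair."""
    import transformers  # imported lazily: heavy dependency

    checkpoint, class_name = auto_class_name(family, task)
    model_class = getattr(transformers, class_name)
    tokenizer = transformers.AutoTokenizer.from_pretrained(checkpoint)
    model = model_class.from_pretrained(checkpoint, **kwargs)
    return tokenizer, model


# Example: BERT with a new two-class head, ready for fine-tuning.
# tokenizer, model = load("encoder", "sequence-classification", num_labels=2)
```

For text classification, the encoder row is the relevant one: the pretrained body is reused and only the classification head starts from random weights.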

Slide 9

Slide 9 text

Pre-Requisites
• Task (22 Tasks): Describes the use cases that can be performed with different model configurations.
• Libraries (22 Libraries): Collection of resources used to optimize tasks.
• Dataset (1178 Datasets): Collection of data used for the task and to fine-tune the models.
• Model (13524 Models): Pre-trained transformer language models.

Slide 10

Slide 10 text


Slide 11

Slide 11 text


Slide 12

Slide 12 text

2. Practice: Huggingface Transformers

Slide 13

Slide 13 text

Google Colab

Slide 14

Slide 14 text

Any questions?
You can find me at:
• Linkedin.com/in/olgaminguett/
• [email protected]
Thanks!

Slide 15

Slide 15 text

References
• Website: https://huggingface.co/
• GitHub: https://github.com/huggingface
• Attention Is All You Need: https://arxiv.org/abs/1706.03762

Slide 16

Slide 16 text

Credits • Photographs by Paramount Pictures and DreamWorks