Slide 1

BERT for Text Classification with Keras/TensorFlow 2
Galuh Sahid
Data Scientist, Gojek / ML GDE

Slide 2

What will we do today?

Slide 3

"This movie is awesome!" → Positive

Slide 4

"This movie is thrilling!" → Positive
"Such a disappointing ending." → Negative

Slide 5

"This movie is thrilling!" → Model → Positive

Slide 6

Ways to do training
1. Train everything from scratch
2. Use a pre-trained model

Slide 7

Transfer learning
A deep learning model is trained on a large dataset, then used to perform a similar task on another dataset (e.g. text classification)
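
As a sketch of what the pre-trained route looks like in Keras/TensorFlow 2: a BERT encoder from TF Hub is wrapped in a Keras model and a small classification head is fine-tuned on the new dataset. The TF Hub handles below are one possible choice (an assumption, not the only option); any matching preprocessor/encoder pair works the same way.

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # registers the ops the preprocessing model needs

# Assumed TF Hub handles; swap in any compatible preprocessor/encoder pair
preprocess = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4",
    trainable=True)  # fine-tune the pre-trained weights, don't freeze them

text_input = tf.keras.layers.Input(shape=(), dtype=tf.string)
encoder_outputs = encoder(preprocess(text_input))
pooled = encoder_outputs["pooled_output"]        # one vector per sentence
x = tf.keras.layers.Dropout(0.1)(pooled)
output = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # Positive/Negative

model = tf.keras.Model(text_input, output)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
              loss="binary_crossentropy", metrics=["accuracy"])

A small learning rate such as 2e-5 is typical when fine-tuning, so the pre-trained weights are only gently adjusted rather than overwritten.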

Slide 8

What is BERT?

Slide 9

BERT: Bidirectional Encoder Representations from Transformers

Slide 10

“...we train a general-purpose ‘language understanding’ model on a large text corpus (like Wikipedia), and then use that model for downstream NLP tasks that we care about (like question answering)” https://github.com/google-research/bert

Slide 11

“BERT outperforms previous methods because it is the first unsupervised, deeply bidirectional system for pre-training NLP.” https://github.com/google-research/bert

Slide 12

Unsupervised
BERT was trained using only a plain text corpus

Slide 13

Bidirectional
● Pre-trained representations can be either context-free or contextual
● Context-free: "bank" gets the same representation in "bank deposit" and "river bank"
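
A minimal sketch of the context-free side of this distinction, using a toy three-word vocabulary: a lookup table always returns the same vector for "bank", no matter what surrounds it, whereas a contextual encoder such as BERT produces a different vector for each sentence.

import tensorflow as tf

# Context-free: one fixed vector per vocabulary id, neighbors are ignored
vocab = {"bank": 0, "deposit": 1, "river": 2}
table = tf.keras.layers.Embedding(input_dim=3, output_dim=4)

v1 = table(tf.constant(vocab["bank"]))  # "bank" in "bank deposit"
v2 = table(tf.constant(vocab["bank"]))  # "bank" in "river bank"
# v1 and v2 are identical: the representation never sees the context.
# A contextual model instead encodes the whole sentence at once, so the
# vector it produces for "bank" differs between the two sentences.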

Slide 14

Bidirectional
● Contextual representations can further be unidirectional or bidirectional
● In "I made a bank deposit", a unidirectional model represents "bank" using only "I made a", while a bidirectional model also uses "deposit"
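
One way to picture the difference is the attention mask, sketched below: a unidirectional (left-to-right) model uses a causal, lower-triangular mask, so each token only sees what came before it, while a bidirectional model lets every token attend to every other token.

import tensorflow as tf

seq_len = 5  # "I made a bank deposit"

# Unidirectional: token i attends only to tokens 0..i, so "bank"
# (position 3) sees "I made a" but never "deposit".
causal_mask = tf.linalg.band_part(tf.ones((seq_len, seq_len)), -1, 0)

# Bidirectional (BERT): every token attends to every other token,
# so "bank" sees both "I made a" and "deposit".
full_mask = tf.ones((seq_len, seq_len))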

Slide 15

Deeply bidirectional
● BERT uses both left and right context starting from the very bottom of a deep neural network

Slide 16

BERT Training Strategies

Slide 17

"This movie is thrilling!" → Positive
"Such a disappointing ending." → Negative

Slide 18

Training strategies
● Masked language model
● Next sentence prediction

Slide 19

Masked language model
Input: the man went to the [MASK1] . he bought a [MASK2] of milk.
Labels: [MASK1] = store; [MASK2] = gallon
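
A simplified sketch of how such training examples can be generated. Real BERT masks about 15% of tokens, and of those replaces only 80% with [MASK] (10% become random words, 10% stay unchanged); this sketch keeps just the basic idea.

import random

def mask_tokens(tokens, mask_rate=0.15):
    # Randomly hide ~15% of tokens; the originals become the labels
    masked, labels = [], {}
    for i, token in enumerate(tokens):
        if random.random() < mask_rate:
            labels[i] = token        # the model must predict this token
            masked.append("[MASK]")
        else:
            masked.append(token)
    return masked, labels

tokens = "the man went to the store . he bought a gallon of milk .".split()
masked, labels = mask_tokens(tokens)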

Slide 20

Next sentence prediction
Sentence A: the man went to the store .
Sentence B: he bought a gallon of milk .
Label: IsNextSentence

Slide 21

Next sentence prediction
Sentence A: the man went to the store .
Sentence B: penguins are flightless .
Label: NotNextSentence
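
A sketch of how these sentence pairs can be generated from a corpus: half the time sentence B is the actual next sentence, half the time it is a random sentence from elsewhere.

import random

def make_nsp_example(doc_sentences, all_sentences):
    # doc_sentences: consecutive sentences from one document (needs >= 2)
    # all_sentences: pool of sentences drawn from the whole corpus
    i = random.randrange(len(doc_sentences) - 1)
    sentence_a = doc_sentences[i]
    if random.random() < 0.5:
        return sentence_a, doc_sentences[i + 1], "IsNextSentence"
    return sentence_a, random.choice(all_sentences), "NotNextSentence"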

Slide 22

References
● https://github.com/google-research/bert

Slide 23

Hands-on Practice

Slide 24

bit.ly/wtm-bert-colab