Slide 1

BERT for Text Classification with Keras/TensorFlow 2
Galuh Sahid
Data Scientist, Gojek / ML GDE

Slide 2

What will we do today?

Slide 3

"This movie is awesome!" → Positive

Slide 4

"This movie is thrilling!" → Positive
"Such a disappointing ending." → Negative

Slide 5

"This movie is thrilling!" → Model → Positive

Slide 6

Ways to do training
1. Train everything from scratch
2. Use a pre-trained model

Slide 7

Transfer learning
A deep learning model is trained on a large dataset, then used to perform a similar task on another dataset (e.g. text classification)
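
As a sketch of what the pre-trained route looks like in Keras/TensorFlow 2: a BERT encoder from TF Hub is wrapped in a Keras model and a small classification head is fine-tuned on the new dataset. The TF Hub handles below are one possible choice (an assumption, not the only option); any matching preprocessor/encoder pair works the same way.

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # registers the ops the preprocessing model needs

# Assumed TF Hub handles; swap in any compatible preprocessor/encoder pair
preprocess = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4",
    trainable=True)  # fine-tune the pre-trained weights, don't freeze them

text_input = tf.keras.layers.Input(shape=(), dtype=tf.string)
encoder_outputs = encoder(preprocess(text_input))
pooled = encoder_outputs["pooled_output"]        # one vector per sentence
x = tf.keras.layers.Dropout(0.1)(pooled)
output = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # Positive/Negative

model = tf.keras.Model(text_input, output)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
              loss="binary_crossentropy", metrics=["accuracy"])

A small learning rate such as 2e-5 is typical when fine-tuning, so the pre-trained weights are only gently adjusted rather than overwritten.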

Slide 8

What is BERT?

Slide 9

BERT: Bidirectional Encoder Representations from Transformers

Slide 10

“...we train a general-purpose ‘language understanding’ model on a large text corpus (like Wikipedia), and then use that model for downstream NLP tasks that we care about (like question answering)” https://github.com/google-research/bert

Slide 11

“BERT outperforms previous methods because it is the first unsupervised, deeply bidirectional system for pre-training NLP.” https://github.com/google-research/bert

Slide 12

Unsupervised
BERT was trained using only a plain text corpus

Slide 13

Bidirectional
● Pre-trained representations can be either context-free or contextual
● Context-free: "bank" gets the same representation in "bank deposit" and "river bank"
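
A minimal sketch of the context-free side of this distinction, using a toy three-word vocabulary: a lookup table always returns the same vector for "bank", no matter what surrounds it, whereas a contextual encoder such as BERT produces a different vector for each sentence.

import tensorflow as tf

# Context-free: one fixed vector per vocabulary id, neighbors are ignored
vocab = {"bank": 0, "deposit": 1, "river": 2}
table = tf.keras.layers.Embedding(input_dim=3, output_dim=4)

v1 = table(tf.constant(vocab["bank"]))  # "bank" in "bank deposit"
v2 = table(tf.constant(vocab["bank"]))  # "bank" in "river bank"
# v1 and v2 are identical: the representation never sees the context.
# A contextual model instead encodes the whole sentence at once, so the
# vector it produces for "bank" differs between the two sentences.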

Slide 14

Bidirectional
● Contextual representations can further be unidirectional or bidirectional
● In "I made a bank deposit", a unidirectional model represents "bank" using only "I made a", while a bidirectional model also uses "deposit"
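
One way to picture the difference is the attention mask, sketched below: a unidirectional (left-to-right) model uses a causal, lower-triangular mask, so each token only sees what came before it, while a bidirectional model lets every token attend to every other token.

import tensorflow as tf

seq_len = 5  # "I made a bank deposit"

# Unidirectional: token i attends only to tokens 0..i, so "bank"
# (position 3) sees "I made a" but never "deposit".
causal_mask = tf.linalg.band_part(tf.ones((seq_len, seq_len)), -1, 0)

# Bidirectional (BERT): every token attends to every other token,
# so "bank" sees both "I made a" and "deposit".
full_mask = tf.ones((seq_len, seq_len))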

Slide 15

Deeply bidirectional
● BERT uses both left and right context starting from the very bottom of a deep neural network

Slide 16

BERT Training Strategies

Slide 17

"This movie is thrilling!" → Positive
"Such a disappointing ending." → Negative

Slide 18

Training strategies
● Masked language model
● Next sentence prediction

Slide 19

Masked language model
Input: the man went to the [MASK1] . he bought a [MASK2] of milk.
Labels: [MASK1] = store; [MASK2] = gallon
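
A simplified sketch of how such training examples can be generated. Real BERT masks about 15% of tokens, and of those replaces only 80% with [MASK] (10% become random words, 10% stay unchanged); this sketch keeps just the basic idea.

import random

def mask_tokens(tokens, mask_rate=0.15):
    # Randomly hide ~15% of tokens; the originals become the labels
    masked, labels = [], {}
    for i, token in enumerate(tokens):
        if random.random() < mask_rate:
            labels[i] = token        # the model must predict this token
            masked.append("[MASK]")
        else:
            masked.append(token)
    return masked, labels

tokens = "the man went to the store . he bought a gallon of milk .".split()
masked, labels = mask_tokens(tokens)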

Slide 20

Next sentence prediction
Sentence A: the man went to the store .
Sentence B: he bought a gallon of milk .
Label: IsNextSentence

Slide 21

Next sentence prediction
Sentence A: the man went to the store .
Sentence B: penguins are flightless .
Label: NotNextSentence
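
A sketch of how these sentence pairs can be generated from a corpus: half the time sentence B is the actual next sentence, half the time it is a random sentence from elsewhere.

import random

def make_nsp_example(doc_sentences, all_sentences):
    # doc_sentences: consecutive sentences from one document (needs >= 2)
    # all_sentences: pool of sentences drawn from the whole corpus
    i = random.randrange(len(doc_sentences) - 1)
    sentence_a = doc_sentences[i]
    if random.random() < 0.5:
        return sentence_a, doc_sentences[i + 1], "IsNextSentence"
    return sentence_a, random.choice(all_sentences), "NotNextSentence"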

Slide 22

References
● https://github.com/google-research/bert

Slide 23

Hands-on Practice

Slide 24

bit.ly/wtm-bert-colab