Slide 44
実践者向けディープラーニング勉強会 第3回 - 15/May/2019 Kazuki Motohashi - Skymind K.K.
[Figure 3 (diagram): four task-specific BERT architectures. (a) Sentence-pair classification — Sentence 1 / Sentence 2 → Class Label; (b) single-sentence classification — [CLS] → Class Label; (c) question answering — Question / Paragraph → Start/End Span; (d) single-sentence tagging — per-token labels such as B-PER, O.]
Figure 3: Our task-specific models are formed by incorporating BERT with one additional output layer, so a minimal number of parameters need to be learned from scratch. Among the tasks, (a) and (b) are sequence-level tasks while (c) and (d) are token-level tasks. In the figure, E represents the input embedding, T_i represents the contextual representation of token i, [CLS] is the special symbol for classification output, and [SEP] is the special symbol to separate non-consecutive token sequences.
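The "one additional output layer" in the caption can be sketched in a few lines. Below, BERT itself is treated as a black box: `c` stands in for C, the final hidden state of the [CLS] token (hidden size 768 in BERT-Base), and the weight matrix `W` is the only task-specific parameter set learned from scratch. The sizes and names here are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size, num_labels = 768, 3   # BERT-Base hidden size; 3-way classification assumed

# c: stand-in for C, the [CLS] contextual representation produced by BERT
c = rng.standard_normal(hidden_size)

# The single task-specific output layer: these are the only new parameters
W = rng.standard_normal((num_labels, hidden_size)) * 0.02
b = np.zeros(num_labels)

logits = W @ c + b
probs = np.exp(logits - logits.max())
probs /= probs.sum()               # softmax over class labels

print(probs.shape)                 # (3,)
```

For token-level tasks such as (c) and (d), the same idea applies per position: the layer is applied to every T_i instead of only to C, yielding a label (or a start/end score) for each token.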
QNLI: Question Natural Language Inference is a task in which the goal is to predict whether an English sentence