Surprising prior work
On the Cross-lingual Transferability of Monolingual Representations
(Artetxe et al., 2020)
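[Figure: four stages, each pairing an encoder with an embedding layer (❄ marks a frozen component).
1. L1 Pretraining (🇬🇧): encoder + L1 embeddings
2. L2 Pretraining (🇪🇸): encoder (❄) + L2 embeddings
3. L1 Fine-tuning (🇬🇧): encoder + L1 embeddings (❄)
4. L2 Evaluation (🇪🇸): encoder + L2 embeddings]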
The encoder weights are only ever updated on English, yet the model can solve Spanish tasks.
Slide 10
Surprising prior work
Using Transfer to Study Linguistic Structure in Language Models
(Papadimitriou and Jurafsky, 2020)
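[Figure: three stages, each pairing an encoder with an embedding layer (❄ marks a frozen component).
1. L1 Pretraining (♪): encoder + L1 embeddings
2. L2 Training (🇪🇸): encoder (❄) + L2 embeddings
3. L2 Evaluation (🇪🇸): encoder + L2 embeddings]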
An encoder trained on musical score data is useful, to some extent, for modeling Spanish.
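Artificial language
'1539', '3283', '2412', '6587', '5401', '26', '9138', '3192', '904', '7458'
• Consists of sequences of numbers and symbols instead of words.
• Not grounded in any semantics; it only has structure.
• Sentences of the artificial language are generated by sampling.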
Slide 15
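Sampling sentences of an artificial language
$l \sim p_{\mathrm{len}}(l)$
• First, sample the sentence length $l$ from some distribution.
• Then sample that many tokens. The algorithm used in this step is what characterizes the artificial language.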
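The two-step procedure above (draw a length, then draw that many tokens) can be sketched as follows. This is only an illustration: the vocabulary size, the maximum length, and the uniform choices for both the length and the tokens are hypothetical placeholders, not values from the slides. Tokens are written as numeric strings to match the example sentence shown earlier.

```python
import random

# Hypothetical settings, chosen only for illustration.
VOCAB_SIZE = 10_000
MAX_LEN = 20


def sample_sentence(sample_length, sample_token, rng):
    """Draw a length l ~ p_len, then draw l tokens with the given sampler."""
    length = sample_length(rng)
    return [sample_token(rng) for _ in range(length)]


def uniform_length(rng):
    # Placeholder for p_len: uniform over 1..MAX_LEN (an assumption).
    return rng.randint(1, MAX_LEN)


def uniform_token(rng):
    # Placeholder token sampler; swapping this function changes the language.
    return str(rng.randrange(VOCAB_SIZE))


rng = random.Random(0)
sentence = sample_sentence(uniform_length, uniform_token, rng)
print(sentence)
```

Passing a different `sample_token` function is the only change needed to define a different artificial language.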
Slide 16
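Words in natural language do not appear at random.
Modeling the word distribution
• Word frequencies are skewed, and…
• Words within a sentence are related to each other in some way.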
“A dog and cat are fighting over food.”
Slide 17
Uniform Language
$p(w) = \frac{1}{|\mathcal{V}|}$
Words are sampled from a uniform distribution.
This is just a baseline.
Slide 18
Zipf Language
$p(w) \propto \frac{1}{\mathrm{rank}(w)}$
Words are sampled from a Zipf distribution.
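A minimal sketch of sampling from the Zipf language, assuming token ids double as frequency ranks (id 0 has rank 1). The vocabulary size and sample count are hypothetical, and the exponent is fixed at 1 to match the formula above.

```python
import random


def zipf_weights(vocab_size):
    """Unnormalized weights p(w) ∝ 1 / rank(w) for ranks 1..vocab_size."""
    return [1.0 / rank for rank in range(1, vocab_size + 1)]


def sample_zipf_tokens(n_tokens, vocab_size, rng):
    """Sample token ids; random.choices normalizes the weights internally."""
    weights = zipf_weights(vocab_size)
    return rng.choices(range(vocab_size), weights=weights, k=n_tokens)


rng = random.Random(0)
tokens = sample_zipf_tokens(1000, vocab_size=100, rng=rng)  # hypothetical sizes
print(tokens[:10])
```

Under the uniform baseline every id would have probability 1/100 here; with these weights the lowest-ranked ids dominate the sample.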
Slide 19
Log-Linear Language
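Words are sampled from a distribution that differs for each sentence.
$p(w \mid s) \propto \exp(\vec{c}_s \cdot \vec{v}_w)$
$\vec{c}_s$ (discourse vector): sampled at random from a normal distribution for each sentence.
$\vec{v}_w$ (word vectors): each word has its own vector, sampled at random from a normal distribution.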
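The log-linear language can be sketched as below. The slides say only that both kinds of vector come from a normal distribution; the standard normal (mean 0, variance 1), the vector dimension, and the vocabulary size used here are assumptions made for illustration.

```python
import math
import random


def gaussian_vector(dim, rng):
    # Assumption: standard normal entries; the slides give no parameters.
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def sentence_distribution(discourse, word_vectors):
    """Softmax over dot products: p(w|s) ∝ exp(c_s · v_w)."""
    scores = [sum(c * v for c, v in zip(discourse, vec)) for vec in word_vectors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_log_linear_sentence(length, word_vectors, rng):
    """Draw a fresh discourse vector, then sample all tokens from p(w|s)."""
    discourse = gaussian_vector(len(word_vectors[0]), rng)
    probs = sentence_distribution(discourse, word_vectors)
    return rng.choices(range(len(word_vectors)), weights=probs, k=length)


rng = random.Random(0)
VOCAB_SIZE, DIM = 50, 8  # hypothetical sizes
word_vectors = [gaussian_vector(DIM, rng) for _ in range(VOCAB_SIZE)]
sentence = sample_log_linear_sentence(10, word_vectors, rng)
print(sentence)
```

The word vectors are drawn once and shared across sentences, while the discourse vector is redrawn per sentence, so each sentence favors a different subset of the vocabulary.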
Slide 20
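Words in a sentence are arranged according to certain rules.
Modeling sentence structure
[Figure: dependency tree for "I saw a dog" with relations nsubj, obj, det and part-of-speech labels VERB, PRO, NOUN, DET.]
• Here we build something that mimics dependency structure.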
Experimental setup
Model
• Transformer (300 dim, 3 layers)
(We also tried an LSTM; the results were roughly the same.)
Pretraining data
12.8M sentences are sampled for each language.
• Artificial languages
• Natural languages (Wikipedia dumps of en, es, ja)
Evaluation task data (fine-tuning and test)
• the Penn Treebank Corpus
Slide 29
Effect of the token distribution?
• With a word distribution like that of the Log-linear Language, the encoder becomes reasonably usable.