
Japanese Zero Anaphora Resolution Can Benefit from Parallel Texts Through Neural Transfer Learning

Slides for a paper accepted to Findings of EMNLP 2021

Masato Umakoshi

November 12, 2021

Transcript

  1. Japanese Zero Anaphora Resolution Can Benefit from Parallel Texts Through Neural Transfer Learning
     Masato Umakoshi, Yugo Murawaki and Sadao Kurohashi, Kyoto University
  2. Zero Anaphora Resolution (ZAR)
     • Detect omitted expressions, i.e., zero pronouns (ZPs)
     • Identify their antecedents
     • SOTA: BERT-based multi-task learning [Ueda+, 2020]
     Example (a toy rendition as task input/output follows below):
     妻が 息子に いくつか おもちゃを 買ってあげた。 ϕᵢ=NOM 赤い 車を 特に 気に入っている。
     wife=NOM son=DAT several toy=ACC buy.GER=give.PST / ϕᵢ=NOM red car=ACC especially like.GER=be.NPST
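To make the two subtasks concrete, here is a toy rendition of the slide's example. The field names and structure are hypothetical illustrations, not the paper's actual data format:

```python
# Toy rendition of the ZAR task on the slide's example.
# Field names are hypothetical, not the paper's data format.
example = {
    "text": "妻が 息子に いくつか おもちゃを 買ってあげた。ϕ 赤い 車を 特に 気に入っている。",
    # Step 1: detect that the NOM argument of 気に入っている is omitted (the ZP).
    "zero_pronoun": {"predicate": "気に入っている", "case": "NOM"},
    # Step 2: identify its antecedent, here 息子 ("son").
    "antecedent": "息子",
}
```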
  3. Parallel Texts as Implicit Annotations for ZAR
     • Some expressions omitted in Japanese are obligatory in English
     妻が 息子に いくつか おもちゃを 買ってあげた。 ϕᵢ=NOM 赤い 車を 特に 気に入っている。
     wife=NOM son=DAT several toy=ACC buy.GER=give.PST / ϕᵢ=NOM red car=ACC especially like.GER=be.NPST
     My wife got my son several toys. He especially likes the red car.
     Here the omitted ϕᵢ surfaces as the obligatory English pronoun "He".
  4. Existing Approaches
     • A line of research applied rule-based methods to annotate translation pairs with ZAR annotations (Nakaiwa, 1999; Furukawa+, 2017)
     • These rely on word alignment, dependency parsing, and coreference resolution
     • However, they hit a limit due to error propagation
     [Figure: word alignment, dependency (nsubj), and coreference links between the Japanese example and "My wife got my son several toys. He especially likes the red car."]
  5. Idea: Neural Transfer Learning
     • We propose neural transfer learning from MT
     • By generating the English sentence, the MT model is forced to implicitly recover ZPs
     • To bridge the discrepancy between MT (encoder-decoder) and ZAR (encoder-only), we adopt a BERT-based method
     • To mitigate catastrophic forgetting, we test incorporating masked language modeling (MLM) during MT training (see the sketch below)
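A minimal sketch of this MT stage, assuming a pretrained Japanese BERT encoder feeding a Transformer decoder. The class name `MtWithMlm`, the checkpoint, and the 1:1 loss weighting are illustrative assumptions, not the paper's exact setup:

```python
# Sketch of MT training with an auxiliary MLM loss (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel

class MtWithMlm(nn.Module):  # hypothetical name
    def __init__(self, bert_name="cl-tohoku/bert-base-japanese", tgt_vocab=32000):
        super().__init__()
        # BERT encoder: later reused for ZAR fine-tuning (the transfer target).
        self.encoder = AutoModel.from_pretrained(bert_name)
        d = self.encoder.config.hidden_size
        layer = nn.TransformerDecoderLayer(d_model=d, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=6)
        self.tgt_embed = nn.Embedding(tgt_vocab, d)
        self.mt_head = nn.Linear(d, tgt_vocab)                        # English tokens
        self.mlm_head = nn.Linear(d, self.encoder.config.vocab_size)  # masked Japanese tokens

    def forward(self, src_ids, src_mask, tgt_in, tgt_out, mlm_labels):
        # Encode the (partially masked) Japanese source.
        h = self.encoder(input_ids=src_ids, attention_mask=src_mask).last_hidden_state
        # MT loss: generating the English side forces the encoder to recover ZPs.
        causal = nn.Transformer.generate_square_subsequent_mask(
            tgt_in.size(1)).to(tgt_in.device)
        dec = self.decoder(self.tgt_embed(tgt_in), h, tgt_mask=causal)
        mt_loss = F.cross_entropy(self.mt_head(dec).transpose(1, 2), tgt_out,
                                  ignore_index=-100)
        # MLM loss on the source side mitigates catastrophic forgetting.
        mlm_loss = F.cross_entropy(self.mlm_head(h).transpose(1, 2), mlm_labels,
                                   ignore_index=-100)
        return mt_loss + mlm_loss  # equal weighting is an assumption
```

The point of the design is that only the encoder is carried over to ZAR, while the MLM term keeps it close to its pretrained distribution during MT training.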
  6. Baseline Method
     • Argument selection, which is analogous to head selection in dependency parsing [Ueda+, 2020] (a sketch follows below)
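A minimal sketch of argument selection, under the assumption that each case slot of a predicate selects one token from the encoded sequence, mirroring head selection in dependency parsing. The class name, the bilinear scorer, and the case inventory are illustrative, not necessarily Ueda et al.'s exact architecture:

```python
# Sketch of argument selection over BERT encoder states (assumptions noted above).
import torch
import torch.nn as nn

class ArgumentSelector(nn.Module):  # hypothetical name
    def __init__(self, hidden=768, cases=("NOM", "ACC", "DAT")):
        super().__init__()
        # One bilinear scorer per case slot, scoring (predicate, candidate) pairs.
        self.scorers = nn.ModuleDict({c: nn.Bilinear(hidden, hidden, 1) for c in cases})

    def forward(self, enc, pred_idx):
        """enc: (L, H) encoder states for one sentence; pred_idx: predicate position.
        Returns, per case, a distribution over all L tokens as the argument;
        special [NULL]/exophora candidates are assumed to be part of the input."""
        L, H = enc.shape
        pred = enc[pred_idx].expand(L, H)  # pair the predicate with every candidate
        return {case: scorer(pred, enc).squeeze(-1).softmax(dim=-1)
                for case, scorer in self.scorers.items()}
```

Training then reduces to cross-entropy against the gold antecedent position, just as head-selection parsers train a softmax over candidate heads.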
  7. Results
                             Web    News
     Ueda et al. (2020)      70.3   56.7
     + MT                    70.5   57.7
     + MT w/ MLM             71.9   58.3
     • MT contributes to ZAR
     • Incorporating MLM leads to further gains
  8. Example
     The ZP was translated correctly ("the school") and its antecedent was successfully identified.
     Source text:
     第七十四回 全国 高校 ラグビー フットボール 大会 準決勝の 五日、近鉄花園ラグビー場の スタンドでは 大阪 朝鮮 高級学校 ラグビー 部員 二十五人が 青い ウインドブレーカー 姿で 観戦した。 ϕᵢ=NOM 同 ラグビー 場から わずか 一・五キロの 至近距離に あり ながら、..
     (74 ORD CLF national high.school rugby football tournament semifinal=GEN 5day Kintetsu Hanazono rugby field=GEN stands=LOC=TOP Osaka Korea high.school rugby club.member 25 CLF=NOM blue windbreaker appearance=INS watch=do.PST / ϕᵢ=NOM same rugby field=ABL only 1.5km=GEN distance=DAT be.GER but ..)
     Translation by the model:
     "Twenty-two students of Osaka Korean pro-Pyongyang Korean high school students watched the final at Kintetsu National High School's Hanazono Stadium on Sunday. Although the school is only about five kilometers away from the stadium, .."
     [Figure legend: OURS / BASELINE / GOLD antecedent annotations]
  9. Conclusion
     • We propose neural transfer learning from MT
     • Our experiments show that ZAR can benefit from parallel texts, thanks to the flexibility of neural networks
     • Our framework revives the old idea (Nakaiwa, 1999!)
     • In addition, we find that further gains can be obtained by incorporating MLM