Slide 23
Slide 23 text
Proposed Model
[Figure 3 diagram: question and context pass through word/char embeddings and RNN encoders into bi-attention (Context2Query and Query2Context); a control unit (with softmax, linear layers W,b, and the previous control state) biases the Query2Context attention; self-attention, bridge-entity supervision, and further RNN layers produce the start-index and end-index predictions.]
Figure 3: A 2-hop bi-attention model with a control unit. The Context2Query attention is modeled as in Seo et al.
(2017). The output distribution cv of the control unit is used to bias the Query2Context attention.
where $W_1$, $W_2$ and $W_3$ are trainable parameters, and $\odot$ is element-wise multiplication. Then the query-to-context attention vector is derived as:
$$m_j = \max_{1 \le s \le S} M_{s,j}, \qquad p_j = \frac{\exp(m_j)}{\sum_{j=1}^{J} \exp(m_j)}, \qquad q_c = \sum_{j=1}^{J} p_j h_j \tag{2}$$
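As a minimal NumPy sketch of Eq. (2), assuming $M$ is the $(S \times J)$ question-context similarity matrix and $h$ holds the contextualized context word vectors (the function and variable names are my own, not the paper's):

```python
import numpy as np

def query2context_attention(M, h):
    """Query2Context attention vector q_c as in Eq. (2).

    M: (S, J) similarity matrix (question length S, context length J).
    h: (J, d) contextualized context word representations.
    """
    m = M.max(axis=0)                    # m_j = max over question positions s
    p = np.exp(m - m.max())              # numerically stabilized softmax over j
    p = p / p.sum()                      # p_j = exp(m_j) / sum_j exp(m_j)
    return (p[:, None] * h).sum(axis=0)  # q_c = sum_j p_j * h_j
```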
We then obtain the question-aware context representation and pass it through another layer of BiLSTM:
$$h'_j = [h_j;\; cq_j;\; h_j \odot cq_j;\; cq_j \odot q_c], \qquad h^1 = \mathrm{BiLSTM}(h') \tag{3}$$
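A sketch of the fusion step in Eq. (3), before the BiLSTM, under the assumption that $cq$ holds the per-position Context2Query attention outputs (names are illustrative):

```python
import numpy as np

def fuse_attention(h, cq, qc):
    """Question-aware context representation h' of Eq. (3), before the BiLSTM.

    h:  (J, d) context representations.
    cq: (J, d) Context2Query attention output for each context word.
    qc: (d,)   Query2Context attention vector from Eq. (2).
    """
    qc_rep = np.broadcast_to(qc, h.shape)  # tile q_c across all J positions
    # [h_j; cq_j; h_j (*) cq_j; cq_j (*) q_c]  ->  shape (J, 4d)
    return np.concatenate([h, cq, h * cq, cq * qc_rep], axis=-1)
```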
where ';' denotes concatenation. Self-attention is modeled upon $h^1$ as $\mathrm{BiAttn}(h^1, h^1)$ to produce $h^2$. Then we apply a linear projection to $h^2$ to get the start-index logits for span prediction, and the end-index logits are modeled as $h^3 = \mathrm{BiLSTM}(h^2)$ followed by a linear projection. Furthermore, the model uses a 3-way classifier on $h^3$ to predict the answer as 'yes', 'no', or a text span.
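The prediction heads might look as follows in PyTorch. This is only a sketch: the hidden size $d$ (assumed even) and the max-pooling used to feed the 3-way classifier are my own assumptions, not the paper's stated choices.

```python
import torch
import torch.nn as nn

class AnswerHeads(nn.Module):
    """Start/end span logits and 3-way answer-type classifier (sketch)."""
    def __init__(self, d):
        super().__init__()
        self.start = nn.Linear(d, 1)
        self.end_lstm = nn.LSTM(d, d // 2, bidirectional=True, batch_first=True)
        self.end = nn.Linear(d, 1)
        self.answer_type = nn.Linear(d, 3)  # 'yes' / 'no' / text span

    def forward(self, h2):
        start_logits = self.start(h2).squeeze(-1)   # (B, J) start-index logits
        h3, _ = self.end_lstm(h2)                   # h3 = BiLSTM(h2)
        end_logits = self.end(h3).squeeze(-1)       # (B, J) end-index logits
        pooled = h3.max(dim=1).values               # pooling choice is assumed
        return start_logits, end_logits, self.answer_type(pooled)
```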
control unit imitates human’s behavior when an-
swering a question that requires multiple reason-
ing steps. For the example in Fig. 1, a human
reader would first look for the name of “Kasper
Schmeichel’s father”. Then s/he can locate the
correct answer by finding what “Peter Schme-
ichel” (the answer to the first reasoning hop) was
“voted to be by the IFFHS in 1992”. Recall
that S, J are the lengths of the question and con-
text. At each hop i, given the recurrent control
state ci 1, contextualized question representation
u, and question’s vector representation q, the con-
trol unit outputs a distribution cv over all words in
the question and updates the state ci:
$$cq_i = \mathrm{Proj}[c_{i-1};\, q], \quad ca_{i,s} = \mathrm{Proj}(cq_i \odot u_s), \quad cv_{i,s} = \mathrm{softmax}(ca_{i,s}), \quad c_i = \sum_{s=1}^{S} cv_{i,s} \cdot u_s \tag{4}$$
where Proj is a linear projection layer. The output distribution $cv$ of the control unit is then used to bias the Query2Context attention.
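A minimal NumPy sketch of one control-unit hop per Eq. (4), with the two Proj layers represented as plain weight matrices (bias terms omitted; all names and shapes are assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def control_unit_step(c_prev, q, u, W_cq, W_ca):
    """One hop i of the control unit, Eq. (4).

    c_prev: (d,)    previous control state c_{i-1}.
    q:      (d,)    question vector representation.
    u:      (S, d)  contextualized question word representations u_s.
    W_cq:   (d, 2d) projection for cq_i = Proj[c_{i-1}; q].
    W_ca:   (1, d)  projection reducing each cq_i (*) u_s to a scalar logit.
    """
    cq = W_cq @ np.concatenate([c_prev, q])  # cq_i = Proj[c_{i-1}; q]
    ca = (W_ca @ (cq * u).T).ravel()         # ca_{i,s} = Proj(cq_i (*) u_s), shape (S,)
    cv = softmax(ca)                         # cv_{i,s}: attention over question words
    c_new = cv @ u                           # c_i = sum_s cv_{i,s} * u_s
    return cv, c_new
```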
The details are omitted here, but at hop $i$ the control unit, taking the context into account, adjusts which part of the question to focus on.
Figure annotations:
Sentence-level supporting facts prediction: predict whether each sentence is a supporting fact or not.
Text span prediction: predict the answer.
Bridge-entity supervision: predict the entity that connects the supporting facts.
Figure quoted from https://www.aclweb.org/anthology/P19-1262/