
Practical and Interpretable Deep Learning Techniques in Our Iyatomi’s Lab


The slides for "The 1st Univ. Carthage - Hosei International Joint Webinar
with honorable support by the Embassy of the Republic of Tunisia in Japan.
Recent Issues in Intelligent Robotics, Machine Learning and Distributed System", Mar. 17th, 2021


Shunsuke KITADA

March 17, 2021

Transcript

  1. Practical and Interpretable Deep Learning Techniques in Our Iyatomi’s Lab

    Shunsuke Kitada, first-year Ph.D. student, Major in Applied Informatics, Graduate School of Science and Engineering, Hosei University. The 1st Univ. Carthage - Hosei International Joint Webinar with honorable support by the Embassy of the Republic of Tunisia in Japan: Recent Issues in Intelligent Robotics, Machine Learning and Distributed System, Mar. 17th, 2021. The figures and formulas in this presentation are borrowed from the cited papers.
  2. Self-introduction Shunsuke Kitada • First-year Ph.D. student at Hosei

    Univ. • JSPS Research Fellow DC2 Research Interests: • Natural Language Processing (NLP) ◦ Learning character-level compositionality ▪ From Kanji [Kitada+ AIPRW’18, Aoki+ AACL SRW’20] ▪ From Arabic [Daif+ ACL SRW’20] ◦ Developing perturbation-robust and interpretable deep learning models [Kitada+ IEEE Access’21, Kitada+ CoRR’21] • Medical image processing ◦ Recognizing skin cancer from skin images [Kitada+ CoRR’18] • Computational advertising ◦ Supporting the creation of effective ad creatives [Kitada+ KDD’19] 2 HP: shunk031.me GitHub
  3. About our Iyatomi’s lab 3 Automatic plant disease diagnosis Cybersecurity

    CBIR on MRI Skin cancer Natural language processing (NLP)
  5. Natural language processing with Deep Learning Models 5 • Natural

    language processing (NLP) ◦ A field of AI that gives machines the ability to read, understand, and derive meaning from human languages ◦ Deep learning models have provided excellent prediction performance in this field as well (figure: 猫, “cat” in Japanese) However, the models generally become black boxes whose predictions are difficult to interpret ➜ In recent years, more emphasis has been placed on the interpretability and robustness of deep learning models The key ideas for improving these two aspects are Attention Mechanisms and Adversarial Training Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion
  6. Attention Mechanisms in NLP 6 Attention mechanisms [Bahdanau+’14] • learn

    conditional distributions over input units to compose a weighted context vector • significantly contribute to improving the performance of NLP tasks, e.g., text classification [Lin+’17], question answering [Golub+’16], natural language inference [Parikh+’16] Image from Bahdanau+’14 Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion Interpretability through the mechanisms • Attention weights are often claimed to afford insights into the “inner workings” of models ➜ “Attention provides an important way to explain the workings of neural models” [Li+’16] • Claims that attention provides interpretability are common in the literature [Xu+’15, Choi+’16, Xie+’17, Lin+’17] Attention heatmap of 5-star Yelp reviews Image from Lin+’17
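The additive attention described above can be sketched in a few lines. This is a minimal illustration, not the exact formulation from the cited papers; `H`, `W`, and `v` are hypothetical names for the hidden states and the learned scoring parameters:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(H, W, v):
    """Bahdanau-style additive attention (simplified sketch).

    H: (T, d) hidden states, W: (d, k) projection, v: (k,) scoring vector.
    Returns the attention weights and the weighted context vector.
    """
    scores = np.tanh(H @ W) @ v   # one unnormalized score per position
    alpha = softmax(scores)       # weights form a distribution over positions
    context = alpha @ H           # (d,) weighted sum of hidden states
    return alpha, context

rng = np.random.default_rng(0)
alpha, context = additive_attention(rng.normal(size=(5, 8)),
                                    rng.normal(size=(8, 4)),
                                    rng.normal(size=4))
```

The weights `alpha` sum to one, which is what makes them readable as per-word importance scores in the heatmaps above.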
  7. Attention Mechanisms in NLP 7 Attention mechanisms [Bahdanau+’14] • learn

    conditional distributions over input units to compose a weighted context vector • significantly contribute to improving the performance of NLP tasks, e.g., text classification [Lin+’17], question answering [Golub+’16], natural language inference [Parikh+’16] Image from Bahdanau+’14 Interpretability through the mechanisms • Attention weights are often claimed to afford insights into the “inner workings” of models ➜ “Attention provides an important way to explain the workings of neural models” [Li+’16] • Claims that attention provides interpretability are common in the literature [Xu+’15, Choi+’16, Xie+’17, Lin+’17] Attention heatmap of 5-star Yelp reviews Image from Lin+’17 However, it has been pointed out that DNN models tend to be locally unstable: even tiny perturbations to the original inputs [Szegedy+’13] or attention mechanisms [Jain+’19] can mislead the models ➜ Such malicious perturbations are called adversarial examples or adversarial perturbations Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion
  8. AT is widely used in various NLP fields: •

    Text classification [Miyato+’16, Sato+’18] • Part-of-speech tagging [Yasunaga+’18] • Relation extraction [Wang+’18] Overcoming the vulnerability to adversarial examples: Adversarial Training 8 Adversarial Training (AT) [Goodfellow+’14] • aims to improve the robustness of a model to input perturbations by training on adversarial examples • has been primarily explored in the image recognition field, where it demonstrated enhanced robustness [Shaham+’18] However, in the context of attention mechanisms in NLP, the specific robustness effects of AT remain unclear. Image from Goodfellow+’14 Image from Miyato+’16 Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion
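The core AT step can be illustrated on a toy model (a single-example logistic loss, not any of the cited models): take the gradient of the loss with respect to the input and move a distance ε along it, which is the worst-case direction to first order:

```python
import numpy as np

def logistic_loss(x, w, y):
    """Toy single-example logistic loss; x is the input we perturb."""
    p = 1.0 / (1.0 + np.exp(-w @ x))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def adversarial_example(x, w, y, eps):
    """L2-normalized worst-case perturbation of size eps on the input."""
    p = 1.0 / (1.0 + np.exp(-w @ x))
    grad = (p - y) * w                                # d(loss)/dx for logistic loss
    r_adv = eps * grad / (np.linalg.norm(grad) + 1e-12)
    return x + r_adv                                  # AT trains on this perturbed input

x = np.array([1.0, -0.5, 0.3])
w = np.array([0.8, 0.4, -0.2])
x_adv = adversarial_example(x, w, y=1.0, eps=0.1)
```

By construction the perturbed input incurs a higher loss than the clean one; adversarial training adds such examples to the training objective.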
  9. The attention weight of each word is considered an indicator

    of the importance of each word ➜ In terms of interpretability, the weights are considered a higher-order feature than the word embeddings, so AT applied to attention mechanisms is expected to be more effective Adversarial training in NLP 9 Adversarial perturbation to word embeddings • AT for word embeddings ◦ Improving text classification performance by applying AT in the word embedding space [Miyato+’16] • Interpretable AT (iAT) for word embeddings ◦ Restricting the direction of the perturbations toward existing words in the word embedding space [Sato+’18] Image from Sato+’18 AT [Miyato+’16] iAT [Sato+’18] Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion
  10. Main contribution of adversarial training for attention mechanisms (from my

    recent work [Kitada+ IEEE Access’21]) 10 Investigating the idea of employing AT for attention mechanisms, we obtained the following findings: AT for attention mechanisms • improves the prediction performance of various NLP tasks • helps the model learn cleaner attention • is much less sensitive to the perturbation size Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion Image from Kitada+’21
  11. Brief introduction of Adversarial Training for Attention Mechanisms 11 Base

    model Following [Jain+’19], a 1-layer bi-LSTM with an additive attention mechanism was used as the base model Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion Image from Kitada+’21 • Input layer ◦ Word embeddings • Intermediate layer ◦ Additive attention mechanism ▪ The AT for attention mechanisms is applied to this layer • Output layer ◦ Prediction for the target task
  12. Attention AT: Adversarial Training for Attention Mechanisms 12 The main

    idea is to apply AT to the attention score ã: • The adversarial perturbation is defined as the worst-case perturbation of size ε that maximizes the loss function of the current model Input word sequence with perturbed attention scores Perturbation Ground Truth • The adversarial perturbation is constructed as ã adv • Train the model with the adversarial examples Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion
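A minimal sketch of that step, with the gradient supplied from outside rather than computed by a real model: normalize the gradient of the loss with respect to the pre-softmax attention scores, scale it to size ε, and recompute the attention weights:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_at(scores, grad_scores, eps):
    """Adversarially perturb attention scores (illustrative sketch).

    scores: pre-softmax attention scores for one sentence.
    grad_scores: gradient of the task loss w.r.t. those scores.
    """
    r_adv = eps * grad_scores / (np.linalg.norm(grad_scores) + 1e-12)
    return softmax(scores + r_adv)   # attention weights under the worst-case shift

scores = np.array([0.2, 1.5, -0.3, 0.7])
grad = np.array([0.1, -0.4, 0.2, 0.1])   # placeholder gradient values
alpha_adv = attention_at(scores, grad, eps=0.5)
```

Training then minimizes the loss computed with these perturbed weights alongside the clean loss, so the model cannot rely on fragile attention.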
  13. Attention iAT: Interpretable Adversarial Training for Attention Mechanisms 13 Attention

    iAT enhances the difference in attention; this difference leads to clear and interpretable attention. • defines the normalized difference vector as the normalized difference between the attention scores of the words in a sentence: Input word sequence with perturbed attention scores Perturbation Ground Truth • defines the perturbation for attention with trainable parameters Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion where, where, • seeks the worst-case weights of the difference vectors that maximize the loss function
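The following is only a loose sketch of that idea under simplifying assumptions (the exact normalization and parameterization are in the paper and are not reproduced here): build normalized vectors of pairwise score differences and combine them with trainable weights into a size-ε perturbation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def iat_perturbation(a, w, eps):
    """Sketch of an interpretable perturbation for attention scores.

    a: attention scores (T,); w: trainable combination weights (T,).
    Row i of D holds the differences of every score from a_i; the
    perturbation is a size-eps combination of those difference vectors.
    """
    D = a[None, :] - a[:, None]                                # (T, T) pairwise differences
    D = D / (np.linalg.norm(D, axis=1, keepdims=True) + 1e-12) # normalize each row
    r = softmax(w) @ D                                         # trainable combination
    return eps * r / (np.linalg.norm(r) + 1e-12)

a = np.array([0.1, 0.5, 0.4])
w = np.array([1.0, 0.0, 0.0])   # placeholder trainable weights
r_adv = iat_perturbation(a, w, eps=0.05)
```

Because the perturbation is built from differences between word scores, pushing against it forces the model to sharpen genuine contrasts in word importance rather than arbitrary directions.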
  14. Experiments | Task & Model settings 14 • Task and

    Dataset ◦ Binary classification (BC): 4 datasets ▪ Stanford Sentiment Treebank (SST) [Socher+’13], IMDB Movie Review Corpus [Maas+’11], 20Newsgroups Corpus [Lang+’95], AgNews Corpus [Zhang+’15] ◦ Question answering (QA): 2 datasets ▪ CNN news [Hermann+’15], bAbI task 1, 2, 3 [Weston+’16] ◦ Natural language inference (NLI): 2 datasets ▪ SNLI [Bowman+’15], MultiNLI [Williams+’17] • Model Settings ◦ Vanilla model (described in basemodel section) [Jain+’19] ◦ Word AT [Miyato+’16]: apply AT for word embedding ◦ Word iAT [Sato+’18]: apply iAT for word embedding ◦ Attention RP: apply random perturbation for attention ◦ Attention AT (proposed): apply AT for attention ◦ Attention iAT (proposed): apply iAT for attention Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion
  15. Evaluation Criteria 15 • Prediction performance (following [Jain+’19]) ◦

    F1 score, accuracy, micro-F1 for BC, QA, NLI • Correlation with word importance ◦ How well the attention weights obtained by the proposed methods agree with the importance of words calculated from the gradients [Simonyan+’13] • Effects of perturbation size ◦ Randomly chose the value of ε in the 0–30 range and ran the training 100 times Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion The movie was pretty good The movie was pretty good Word importance obtained from backprop. gradients Learned attention weights Agreement measured by Pearson’s correlation
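The agreement check can be reproduced on a toy example (the scores below are made up for illustration, loosely matching the "The movie was pretty good" example): compute Pearson's correlation between the attention weights and gradient-based importance for the same words:

```python
import numpy as np

# Hypothetical scores for the five words "The movie was pretty good"
attention = np.array([0.05, 0.10, 0.05, 0.30, 0.50])        # learned attention weights
grad_importance = np.array([0.10, 0.20, 0.10, 0.50, 0.90])  # |gradient|-based importance

# Pearson's correlation between the two importance rankings
r = np.corrcoef(attention, grad_importance)[0, 1]
```

A value of `r` near 1 means the attention weights agree with what the gradients say matters, which is the sense in which "cleaner" attention is also more faithful.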
  16. Results | Binary classification task 16 • Prediction performance ◦

    Attention AT/iAT showed a clear advantage over the model without AT as well as the other AT-based techniques • Correlation with word importance ◦ The attention to words obtained with Attention AT/iAT notably correlated with the importance of the words as determined by the gradients Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion
  17. Results | QA and NLI tasks 17 Introduction > Contribution

    > Basemodel > Methods > Experiments > Conclusion
  18. Results | QA and NLI tasks 18 We observed similar

    trends in the other datasets/tasks. The details of the results are shown in [Kitada+’20] Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion
  19. 19 Visualization of learned attention weights for

    each word (Vanilla, Attention AT, Attention iAT) Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion
  20. 20 (figure: attention heatmaps for Vanilla, Attention AT, and Attention iAT) Introduction > Contribution >

    Basemodel > Methods > Experiments > Conclusion
  21. 21 Attention AT yielded clearer attention compared to the Vanilla

    model or Attention iAT ➜ Attention AT tended to strongly focus attention on a few words (figure: Attention AT, Vanilla, Attention iAT) Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion
  23. 23 (figure: attention heatmaps for Vanilla, Attention AT, and Attention iAT) Introduction > Contribution >

    Basemodel > Methods > Experiments > Conclusion
  24. 24 (figure: Attention AT, Vanilla, Attention iAT) In terms of the

    correlation between attention-based word importance and gradient-based word importance: Attention iAT demonstrated higher similarity than the other models. Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion
  26. Effects of perturbation size ε 26 • The performances of

    the conventional Word AT/iAT deteriorated as the perturbation size increased. • Attention AT/iAT maintained almost the same prediction performance Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion
  27. Conclusion | Adversarial training for attention mechanisms 27 • The

    key ideas for improving the interpretability and prediction performance of deep learning models: ◦ Attention mechanisms and Adversarial training • My recent work proposed Attention AT and Attention iAT, training techniques for robust and interpretable attention mechanisms that exploit adversarial training ◦ They achieve better performance than techniques applying AT to word embeddings • Attention iAT introduced adversarial perturbations that ◦ emphasized differences in the importance of words ◦ combined high accuracy with clear attention that strongly correlated with gradient-based word importance Introduction > Contribution > Basemodel > Methods > Experiments > Conclusion Thank you for your kind attention :) shunsuke.kitada.8y@stu.hosei.ac.jp HP: shunk031.me Feel free to contact me!