
Implicit Emotion Classification with Deep Contextualized Word Representations

Jorge Balazs
October 31, 2018


Presentation given at WASSA 2018 (@ EMNLP2018) as part of the Implicit Emotion Shared Task (IEST).

You can read the paper here: https://arxiv.org/abs/1808.08672

And find the source code for the slides and paper (along with the solution to the task itself) here: https://github.com/jabalazs/implicit_emotion



Transcript

  1. IIIDYT AT IEST 2018: IMPLICIT EMOTION CLASSIFICATION WITH DEEP CONTEXTUALIZED WORD REPRESENTATIONS

    Jorge A. Balazs, Edison Marrese-Taylor, Yutaka Matsuo. https://arxiv.org/abs/1808.08672
  2-5. PREPROCESSING

    We wanted a single format for special tokens. The replacements were chosen arbitrarily; shorter replacements did not impact performance significantly. Completely removing [#TRIGGERWORD#] had a negative impact on our best model. We tokenized the data using an emoji-aware modification of the twokenize.py script.
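The token normalization described above can be sketched as follows. The placeholder strings and the exact set of special tokens here are assumptions for illustration; the actual pipeline used an emoji-aware modification of twokenize.py.

```python
import re

# Hypothetical placeholders; the paper notes the real replacements
# were chosen arbitrarily as well.
REPLACEMENTS = [
    (re.compile(r"\[#TRIGGERWORD#\]"), "__TRIGGERWORD__"),
    (re.compile(r"@USERNAME"), "__USERNAME__"),
    (re.compile(r"\[NEWLINE\]"), "__NEWLINE__"),
]

def normalize(tweet):
    """Map every special token to a single placeholder format."""
    for pattern, replacement in REPLACEMENTS:
        tweet = pattern.sub(replacement, tweet)
    return tweet

print(normalize("I feel so [#TRIGGERWORD#] today @USERNAME"))
# I feel so __TRIGGERWORD__ today __USERNAME__
```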
  6. HYPERPARAMETERS

    ELMo layer: official implementation with default parameters.
    Dimensionalities: ELMo output; BiLSTM output (for each direction); sentence vector representation; fully-connected (FC) layer input, hidden, and output.
    Loss function: cross-entropy.
    Optimizer: default Adam.
    Learning rate: slanted triangular schedule (Howard and Ruder 2018).
    Regularization: dropout (after the ELMo layer and the FC hidden layer; after the max-pooling layer).
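The slanted triangular schedule of Howard and Ruder (2018) is a short linear warm-up followed by a long linear decay. A minimal sketch, using the default hyperparameter values from their paper (not necessarily the values used in this work):

```python
def slanted_triangular_lr(t, T, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Slanted triangular LR (Howard and Ruder, 2018).

    t: current iteration, T: total iterations.
    Linearly increases the LR for the first cut_frac fraction of
    training, then linearly decays it; ratio controls how much
    smaller the lowest LR is than lr_max.
    """
    cut = int(T * cut_frac)
    if t < cut:
        p = t / cut
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))
    return lr_max * (1 + p * (ratio - 1)) / ratio

T = 1000
lrs = [slanted_triangular_lr(t, T) for t in range(T)]
print(max(lrs))  # peak lr_max = 0.01, reached at the end of the warm-up
```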
  7. ENSEMBLES

    We tried combinations of 9 trained models initialized with different random seeds. Similar to Bonab and Can (2016), we found that ensembling 6 models yielded the best results.
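One common way to ensemble independently seeded classifiers is to average their class probabilities and take the per-example argmax; a minimal numpy sketch with random stand-ins for the real model outputs (the averaging strategy itself is an assumption, the slide does not specify the combination rule):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for softmax outputs of 9 seeded models on 5 examples.
n_models, n_examples, n_classes = 9, 5, 8
probs = rng.dirichlet(np.ones(n_classes), size=(n_models, n_examples))

def ensemble_predict(probs, k):
    """Average the class probabilities of the first k models,
    then take the argmax class per example."""
    return probs[:k].mean(axis=0).argmax(axis=1)

# The paper found that ensembling 6 of the 9 models worked best.
print(ensemble_predict(probs, 6))
```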
  8-12. ABLATION STUDY

    ELMo provided the biggest boost in performance. Emoji also helped (see analysis). Concat pooling (Howard and Ruder 2018) did not help. Different BiLSTM sizes did not improve results. POS tag embeddings of dimension 50 slightly helped. An SGD optimizer with a simpler LR schedule (Conneau et al. 2017) did not help.
  13. ABLATION STUDY: DROPOUT

    The best dropout configurations concentrated around high values for word-level representations and low values for sentence-level representations.
  14-15. ERROR ANALYSIS

    Confusion matrix and classification report: anger was the hardest class to predict; joy was the easiest one (probably due to an annotation artifact).
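The per-class figures in a classification report can be read straight off a confusion matrix. The counts below are made up for illustration (the real matrix is in the paper); only the six IEST emotion classes are taken from the source.

```python
import numpy as np

classes = ["anger", "disgust", "fear", "joy", "sad", "surprise"]

# Made-up confusion matrix: rows = true class, columns = predicted class.
cm = np.array([
    [50, 10,  8,  5, 12, 15],
    [ 9, 60,  7,  4, 10, 10],
    [ 8,  7, 65,  3,  9,  8],
    [ 4,  3,  2, 85,  3,  3],
    [10,  9,  8,  4, 60,  9],
    [12,  8,  7,  4,  8, 61],
])

recall = cm.diagonal() / cm.sum(axis=1)     # fraction of each true class recovered
precision = cm.diagonal() / cm.sum(axis=0)  # fraction of each prediction correct
f1 = 2 * precision * recall / (precision + recall)

for name, p, r, f in zip(classes, precision, recall, f1):
    print(f"{name:8s} P={p:.2f} R={r:.2f} F1={f:.2f}")
```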
  16. ERROR ANALYSIS

    PCA projection of the test sentence representations: the separate joy cluster corresponds to the sentences containing the “un[#TRIGGERWORD#]” pattern.
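A 2-D PCA projection of sentence representations can be computed with plain numpy via SVD on the mean-centered matrix; random vectors stand in for the actual representations here.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for test-set sentence representations (n examples x d dims).
reps = rng.normal(size=(200, 128))

def pca_project(X, k=2):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                        # mean-center columns
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                           # (n, k) coordinates

coords = pca_project(reps)
print(coords.shape)  # (200, 2)
```

Clusters such as the separate joy group become visible when these 2-D coordinates are scatter-plotted and colored by gold label.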
  17. AMOUNT OF TRAINING DATA

    The upward trend suggests that the model is expressive enough to learn from new data and is not overfitting the training set.
  18-19. EMOJI & HASHTAGS

    Number of examples with and without emoji and hashtags; numbers in parentheses correspond to the percentage of examples classified correctly. Emoji, and hashtags to a lesser extent, seem to be good discriminating features.
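That table reduces to grouping predictions by whether each tweet contains emoji or hashtags. A minimal sketch; the emoji test is a crude codepoint-range check, an assumption rather than the authors' method.

```python
def has_hashtag(tweet):
    return any(tok.startswith("#") for tok in tweet.split())

def has_emoji(tweet):
    # Crude check covering a few common emoji codepoint blocks.
    return any(0x1F300 <= ord(ch) <= 0x1FAFF or 0x2600 <= ord(ch) <= 0x27BF
               for ch in tweet)

def accuracy_by_group(tweets, correct, predicate):
    """Count examples, and the percentage classified correctly,
    split by whether `predicate` holds for the tweet."""
    groups = {True: [0, 0], False: [0, 0]}  # has_feature -> [n, n_correct]
    for tweet, ok in zip(tweets, correct):
        g = groups[predicate(tweet)]
        g[0] += 1
        g[1] += ok
    return {k: (n, 100.0 * c / n if n else 0.0) for k, (n, c) in groups.items()}

tweets = ["so happy 😀 today", "bad day #fml", "just tired", "love this 😀 #yes"]
correct = [1, 0, 1, 1]  # 1 = model classified this example correctly
print(accuracy_by_group(tweets, correct, has_emoji))
# {True: (2, 100.0), False: (2, 50.0)}
```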
  20-22. EMOJI & HASHTAGS

    ❤, rage, mask, and cry were the most informative emoji. Counterintuitively, sob was less informative than cry, despite representing a stronger emotion. Removing sweat_smile and confused improved results.
  23-24. CONCLUSIONS

    We obtained competitive results with simple preprocessing, almost no external data dependencies (save for the pretrained ELMo language model), and a simple architecture.
  25-27. CONCLUSIONS

    We showed that the “un[#TRIGGERWORD#]” artifact had a significant impact on the final example representations (as shown by the PCA projection), which in turn made the model better at classifying joy examples, and that emoji and hashtags were good features for implicit emotion classification.
  28-29. FUTURE WORK

    Ensemble models with added POS tag features. Perform fine-grained hashtag analysis. Implement architectural improvements.
  30. REFERENCES

    Bonab, Hamed R., and Fazli Can. 2016. “A Theoretical Framework on the Ideal Number of Classifiers for Online Ensembles in Data Streams.” In Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM ’16), 2053–56. New York, NY, USA: ACM. https://doi.org/10.1145/2983323.2983907
    Conneau, Alexis, Douwe Kiela, Holger Schwenk, Loïc Barrault, and Antoine Bordes. 2017. “Supervised Learning of Universal Sentence Representations from Natural Language Inference Data.” In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 670–80. Copenhagen, Denmark: Association for Computational Linguistics. https://www.aclweb.org/anthology/D17-1070
    Howard, Jeremy, and Sebastian Ruder. 2018. “Universal Language Model Fine-tuning for Text Classification.” arXiv preprint. http://arxiv.org/abs/1801.06146