Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Using millions of emoji occurrences to learn an...
Search
Yuto Kamiwaki
December 16, 2018
Research
0
110
Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm
2018/12/17 文献紹介の発表内容
Yuto Kamiwaki
December 16, 2018
Tweet
Share
More Decks by Yuto Kamiwaki
See All by Yuto Kamiwaki
Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training
yuto_kamiwaki
0
120
Modeling Naive Psychology of Characters in Simple Commonsense Stories
yuto_kamiwaki
1
210
Epita at SemEval-2018 Task 1: Sentiment Analysis Using Transfer Learning Approach
yuto_kamiwaki
0
130
Tensor Fusion Network for Multimodal Sentiment Analysis
yuto_kamiwaki
0
260
Sentiment Analysis: It’s Complicated!
yuto_kamiwaki
0
81
ADAPT at IJCNLP-2017 Task 4: A Multinomial Naive Bayes Classification Approach for Customer Feedback Analysis task
yuto_kamiwaki
0
160
EmoWordNet: Automatic Expansion of Emotion Lexicon Using English WordNet
yuto_kamiwaki
0
110
ATTENTION-BASED LSTM FOR PSYCHOLOGICAL STRESS DETECTION FROM SPOKEN LANGUAGE USING DISTANT SUPERVISION
yuto_kamiwaki
0
150
BB_twtr at SemEval-2017 Task 4: Twitter Sentiment Analysis with CNNs and LSTMs
yuto_kamiwaki
0
250
Other Decks in Research
See All in Research
投資戦略202508
pw
0
560
ストレス計測方法の確立に向けたマルチモーダルデータの活用
yurikomium
0
1.5k
問いを起点に、社会と共鳴する知を育む場へ
matsumoto_r
PRO
0
610
Stealing LUKS Keys via TPM and UUID Spoofing in 10 Minutes - BSides 2025
anykeyshik
0
120
2025年度人工知能学会全国大会チュートリアル講演「深層基盤モデルの数理」
taiji_suzuki
25
18k
IMC の細かすぎる話 2025
smly
2
630
論文読み会 SNLP2025 Learning Dynamics of LLM Finetuning. In: ICLR 2025
s_mizuki_nlp
0
210
EcoWikiRS: Learning Ecological Representation of Satellite Images from Weak Supervision with Species Observation and Wikipedia
satai
3
140
[輪講] SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
nk35jk
2
1k
Submeter-level land cover mapping of Japan
satai
3
300
MIRU2025 チュートリアル講演「ロボット基盤モデルの最前線」
haraduka
15
8k
数理最適化に基づく制御
mickey_kubo
6
730
Featured
See All Featured
Building a Scalable Design System with Sketch
lauravandoore
462
33k
For a Future-Friendly Web
brad_frost
180
9.9k
A designer walks into a library…
pauljervisheath
207
24k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
12
1.1k
The Invisible Side of Design
smashingmag
301
51k
KATA
mclloyd
32
14k
Embracing the Ebb and Flow
colly
87
4.8k
Fireside Chat
paigeccino
39
3.6k
Building Better People: How to give real-time feedback that sticks.
wjessup
368
19k
Why You Should Never Use an ORM
jnunemaker
PRO
59
9.5k
Gamification - CAS2011
davidbonilla
81
5.4k
Why Our Code Smells
bkeepers
PRO
339
57k
Transcript
Using millions of emoji occurrences to learn any-domain representations for
detecting sentiment, emotion and sarcasm Nagaoka University of Technology Yuto Kamiwaki Literature Review
Literature • Using millions of emoji occurrences to learn any-domain
representations for detecting sentiment, emotion and sarcasm • Bjarke Felbo, Alan Mislove, Anders Søgaard, Iyad Rahwan, Sune Lehmann • EMNLP 2017 2
Abstract • sentiment analysis, emotion analysis and sarcasm classificationにおける8つのbenchmarkでSoTA達成 •
感情ラベルの多様性が以前のdistant supervisonのアプ ローチよりもパフォーマンスの向上をもたらすことを確認 3
Introduction • NLPのタスクでは,アノテーション済み(感情が付与された)の データは少ない. • Distant supervisionを用いてSoTAを達成している研究があ る. Distant supervision
: (http://web.stanford.edu/~jurafsky/mintz.pdf) ラベル付きデータの情報を手がかりに全く別のラベルなしデータからラベル付きの学 習データを生成し、モデルを学習する手法 4
Related work • Ekman, Plutchikなどの感情の理論を用いて手作業によって 分類 ◦ 感情の理解が難しく,時間がかかる. • official
emoji tables (Eisner et al., 2016)からembeddingす る手法 ◦ emojiの使われ方を考慮しない. • マルチタスク学習 ◦ データストレージの観点から問題あり. 5
Pretraining • 2013年1月から2017年6月までのTweet data(emojiあり) • Only English tweets without URL’s
are used for the pretraining dataset. • All tweets are tokenized on a word-by-word basis. 6
Model 7
Transfer Learning(ChainThaw) 8
Emoji Prediction 9
Benchmarking 10 8 Benchmarks(3tasks,5domains)
Benchmarking 11
Importance of emoji diversity 12 Pos/Neg Emoji:8 types DeepMoji:64 types
感情ラベルの多様性が重要 64種類のemojiの細かい ニュアンスを学習できている. (次ページの図を参照)
Importance of emoji diversity 13
Model architecture 14 Pretraining時点では,差がない benchmark時点では,Attention ありの方が精度が高い 低層の特徴へのアクセスが簡単 勾配消失がなく,学習可能
Analyzing the effect of pretraining 15 Pretraining+chainthawで語彙が 増加 ->word coverageが改善
Comparing with human-level agreement 16 Human:76.1% Deepmoji:82.4% Deepmojiの方が,精度 が高い (実験内容については,論文
を参照)
Conclusion • sentiment analysis, emotion analysis and sarcasm classificationにおける8つのbenchmarkでSoTA達成 •
感情ラベルの多様性が以前のdistant supervisonのアプ ローチよりもパフォーマンスの向上をもたらすことを確認 • Pretraining済みモデルを公開 ◦ (Demo : https://deepmoji.mit.edu/) 17