8th Cutting-Edge NLP Study Group (第8回最先端NLP勉強会): Introducing 'Sarcastic or Not' (Ghosh+, 2015)

Shuntaro Yada
September 10, 2016

Sarcastic or Not: Word Embeddings to Predict the
Literal or Sarcastic Meaning of Words
Debanjan Ghosh, Weiwei Guo, and Smaranda Muresan
EMNLP2015

Transcript

  1. Sarcastic or Not: Word Embeddings to Predict the Literal or Sarcastic Meaning of Words

    Debanjan Ghosh, Weiwei Guo, and Smaranda Muresan, EMNLP2015
    8th Cutting-Edge NLP Study Group (第8回最先端NLP勉強会), 2016/09/10
    Presenter: Shuntaro Yada (University of Tokyo, Kageura Lab, D1)
  2. Abstract

    • Recasts sarcasm detection as a Literal/Sarcastic Sense Disambiguation (LSSD) task
    • Moves the unit of judgment from the whole input text to the individual word
    • Tackles two problems:
      1. How to collect words that can be used sarcastically (=: target words)
      2. Given a text and its target words, how to decide whether each target word is used in its literal or its sarcastic sense
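  3. 2 Collection of Target Words

    1. Use crowdsourcing to build a corpus in which sarcastic text is rewritten as its straightforward (literal) expression
    2. Apply unsupervised alignment (word alignment) to the resulting parallel corpus to collect antonym pairs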
  4. 2 Collection of Target Words

    (1) Crowdsourcing
    • 1000 tweets from searching for #sarcasm/#sarcastic
    • 5000 tweets by 5 Turkers
    Example: "I love going to the dentist" → "I hate going to the dentist"
    1. Turkers rephrase the tweet
    2. Unsupervised alignment finds the word pair
    3. The word that appeared in the original tweet is added to the target words
  5. 2 Collection of Target Words

    (2) Unsupervised alignment
    1. Co-training algorithm for paraphrase detection (Barzilay and McKeown, 2001)
    2. Statistical machine translation alignment (IBM Model 4 with HMM alignment in GIZA++; Och and Ney, 2000)
    → 367 pairs → 70 pairs (obtained # of t, φ ≥ 0.8)
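  6. 3 Literal/Sarcastic Sense Disambiguation

    • Given an input text and a specified target word (t), estimate whether t is used in its sarcastic (S) or literal (L) sense
    • Training requires, for each t, example sentences in which t is used in the S sense and in the L sense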
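  7. 3.1 Data Collection

    • Tweets containing the target words were retrieved; those containing #sarcasm/#sarcastic were labeled S, and the rest L
    • Among the L tweets, those labeled with a sentiment (positive/negative) hashtag were treated as Lsent
    • S vs. Lsent is said to be the harder distinction (González-Ibáñez et al., 2011)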
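
The hashtag-based labeling on the Data Collection slide can be sketched roughly as follows. This is a minimal illustration, not the authors' code: `label_tweet` and `SENTIMENT_TAGS` are hypothetical names, the sentiment hashtags listed are placeholders (the slide does not say which tags were used), and returning Lsent as a separate label rather than as a subset of L is an assumption.

```python
import re

SARCASM_TAGS = {"#sarcasm", "#sarcastic"}
# Placeholder sentiment hashtags; the slide does not list the actual ones.
SENTIMENT_TAGS = {"#happy", "#sad", "#love", "#angry"}


def label_tweet(text):
    """Return 'S', 'Lsent', or 'L' for a tweet that contains a target word."""
    tags = {tag.lower() for tag in re.findall(r"#\w+", text)}
    if tags & SARCASM_TAGS:
        return "S"        # explicitly marked as sarcastic by its author
    if tags & SENTIMENT_TAGS:
        return "Lsent"    # literal use, with a sentiment hashtag
    return "L"            # literal use, no sentiment hashtag
```

Checking the sarcasm tags first means a tweet carrying both kinds of hashtag is treated as S, which is also an assumption of this sketch.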
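  8. 3.1 Data Collection

    • 2,542,249 tweets were collected in total and split into train : dev : test = 8 : 1 : 1
    • Of the 70 t, those with fewer than 400 training examples were removed, leaving 37 for the experiments
    [Table: target words and number of training examples (in parentheses)]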
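
As a rough sketch of the 8 : 1 : 1 split and the 400-example cutoff described on this slide: the function names are hypothetical, and shuffling with a fixed seed before splitting is an assumption, since the slide does not say how tweets were assigned to each part.

```python
import random


def split_8_1_1(items, seed=0):
    """Shuffle and split items into train/dev/test at a ratio of 8 : 1 : 1."""
    items = list(items)
    random.Random(seed).shuffle(items)  # assumption: random assignment
    n_train = len(items) * 8 // 10
    n_dev = len(items) // 10
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]
    return train, dev, test


def keep_frequent_targets(train_counts, min_train=400):
    """Keep target words that have at least `min_train` training examples."""
    return {t for t, n in train_counts.items() if n >= min_train}
```

For example, `keep_frequent_targets({"love": 500, "joy": 399})` keeps only `"love"`, mirroring the reduction from 70 to 37 target words.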
  9. 3.2 Learning Approaches

    • Distributional approach: uses distributional representations of t
    • Classification approach: treats the task as S/L classification for t
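  10. 3.2.1 Distributional Approaches

    1. For each t, build one distributional word vector for the S sense and one for the L sense (v_s, v_l)
    2. From the input text u, build the word vector of t as it is used in u (v_u)
    3. Decide whether v_u is closer to v_s or to v_l, using some vector similarity measure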
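
The three-step decision above can be sketched as below. This is an illustration under stated assumptions: cosine similarity is used as the default measure, the `similarity` argument is a hypothetical hook for swapping in another measure, and breaking ties toward L is a choice made here, not something the slide specifies.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_sense(v_u, v_s, v_l, similarity=cosine):
    """Return 'S' if v_u is closer to v_s than to v_l, otherwise 'L'."""
    return "S" if similarity(v_u, v_s) > similarity(v_u, v_l) else "L"
```

Passing the similarity function as a parameter keeps the decision rule separate from the choice of measure, which differs between the PPMI baseline and the embedding-based variants.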
  11. 3.2.1 Distributional Approaches

    • PPMI baseline (Church and Hanks, 1990)
    • Word Embeddings:
      • WTMF (Guo and Diab, 2012)
      • word2vec (Mikolov et al., 2013)
      • GloVe (Pennington et al., 2014)
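  12. 3.2.1 Distributional Approaches

    PPMI: Positive Pointwise Mutual Information model
    • For each t and for each of the S and L datasets, extract up to the top 1000 context words by PPMI and build a vector whose values are their TF-IDF weights
    • Vector similarity is measured with cosine similarity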
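
A minimal sketch of ranking context words by PPMI, using PPMI(t, c) = max(0, log2(P(t, c) / (P(t) P(c)))). Counting co-occurrence at the tweet level and the function name `top_ppmi_context_words` are assumptions made for illustration; the TF-IDF weighting of the selected words is left out.

```python
import math
from collections import Counter


def top_ppmi_context_words(tweets, target, k=1000):
    """Rank context words of `target` by PPMI.

    tweets: list of token lists. Co-occurrence is counted per tweet,
    which is an assumption of this sketch.
    """
    n = len(tweets)
    word_count = Counter()  # number of tweets containing each word
    co_count = Counter()    # number of tweets containing both target and word
    for tokens in tweets:
        types = set(tokens)
        word_count.update(types)
        if target in types:
            co_count.update(types - {target})

    scores = {}
    for word, n_tc in co_count.items():
        pmi = math.log2(n_tc * n / (word_count[target] * word_count[word]))
        scores[word] = max(0.0, pmi)  # PPMI clips negative PMI to zero

    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, score) for word, score in ranked[:k] if score > 0]
```

Running this once on the S tweets and once on the L tweets of a target word gives the two context-word lists from which the per-sense vectors would be built.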
  13. [Table: target words and their context words (for each split of the dataset)]

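  14. 3.2.1 Distributional Approaches

    Word Embeddings
    • For all three algorithms, the vector dimensionality is fixed at 100
    • Dimensionality is often increased for larger datasets, but 100 is usual for data on the order of one million
    • For word2vec, both skip-gram and CBOW are applied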
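  15. 3.2.1 Distributional Approaches

    Word Embeddings
    • Vector similarity is computed with MVMEwe, an adaptation to word embeddings of the maximum-valued matrix-element (MVME) algorithm (Islam and Inkpen, 2008)
    • MVME is a method for aligning the words of two sentences based on the similarity of every word pair
    • To compute the similarity between the two distributional representations, it uses a matrix holding the similarities of all pairs of embedding vectors of their respective context words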
  16. [Figure: worked MVMEwe example]

    Suppose u = "I love going to the dentist" (t = "love"). Matrix M holds the similarities between the context words w_j of "love" in u (I, going, to, the, dentist) and the context words c_k of "love" in S (ignored, being, waking, work, sick, ...). Sim starts at 0; the values 0.8 and then 0.3 are taken from M. Repeat until max = 0 or size(M) = (0, 0).
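
The greedy loop in the MVME figure can be sketched as follows. This is a hedged illustration: it assumes the selected maxima are accumulated into Sim and that the row and column of each selected element are dropped before the next round, it stops when the maximum is zero or negative, and it omits any final normalization because the slides do not describe one. `mvme_similarity` is a hypothetical name, and the matrix in the usage note is illustrative, not the one on the slide.

```python
def mvme_similarity(sim_matrix):
    """Greedy MVME-style score for a list-of-lists similarity matrix.

    Returns (total, picked), where picked lists (row, col, value) in
    selection order.
    """
    rows = list(range(len(sim_matrix)))
    cols = list(range(len(sim_matrix[0]))) if sim_matrix else []
    total = 0.0
    picked = []
    while rows and cols:  # stop when the remaining matrix is empty
        best, r, c = max((sim_matrix[i][j], i, j) for i in rows for j in cols)
        if best <= 0:     # stop when no positive similarity remains
            break
        total += best
        picked.append((r, c, best))
        rows.remove(r)    # each word is aligned at most once
        cols.remove(c)
    return total, picked
```

With `[[0.8, 0.5, 0.3], [0.5, 0.3, -0.1], [0.1, -0.1, -0.2]]`, the loop first selects 0.8, then 0.3 from the remaining submatrix, and stops because the only value left is negative.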
  17. 3.2.2 Classification Approaches

    • For each of the 37 target words, run both S/L and S/Lsent classification (37 × 2 classifiers)
    • The libSVM toolkit (Chang and Lin, 2011) is used as the classifier
    • The development data is used for hyperparameter tuning
  18. 3.2.2 Classification Approaches

    • SVM Baseline: a typical implementation of conventional methods
    • SVM with MVMEwe Kernel: the kernel is replaced with MVMEwe
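  19. 3.2.2 Classification Approaches

    SVM Baseline
    • Binary features based on n-grams and dictionaries
    • Bag-of-Words word vectors
    • Categories from the LIWC dictionary (Pennebaker et al., 2001)
    • Presence or absence of interjections, symbols, and emoticons obtained from Wikipedia
    • Based on the methods proposed by González-Ibáñez et al., 2011 and Tchokni et al., 2014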
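  20. 3.2.2 Classification Approaches

    SVM with MVMEwe Kernel
    • The kernel, kernelwe, follows the same procedure as MVMEwe
    • The similarity between two tweets is computed from a matrix holding the similarities of all pairs of embedding vectors of the words each tweet contains
    • The same three kinds of word embeddings as in the distributional approach are each applied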
  21. 4 Results and Discussions

    [Figure: results for the distributional approach]

  22. 4 Results and Discussions

    [Figure: results for the classification approach]

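  23. 4 Results and Discussions

    • The conventional SVM favors target words that had many training examples
    • Methods using word embeddings generally perform well even with little training data
    [Figure: distributional approach (upper) and classification approach (lower); table: target words and number of training examples (in parentheses)]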
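  24. Presenter's Comments

    • Achieves performance gains with a solid, non-end-to-end approach
    • Each individual step is carried out carefully
    • Also instructive as an example of how to structure and write a paper
    • The research appears to have been funded by DARPA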