
8th Advanced NLP Study Group (最先端NLP勉強会): Introduction to 'Sarcastic or Not' (Ghosh+, 2015)

Shuntaro Yada
September 10, 2016


Sarcastic or Not: Word Embeddings to Predict the
Literal or Sarcastic Meaning of Words
Debanjan Ghosh, Weiwei Guo, and Smaranda Muresan
EMNLP 2015


Transcript

  1. Sarcastic or Not: Word Embeddings to Predict the Literal or

    Sarcastic Meaning of Words. Debanjan Ghosh, Weiwei Guo, and Smaranda Muresan. EMNLP 2015. 8th Advanced NLP Study Group (最先端NLP勉強会), 2016/09/10. Presenter: Shuntaro Yada (The University of Tokyo, Kageura Lab | D1)
  2. Abstract • Recast sarcasm detection as a Literal/Sarcastic Sense

    Disambiguation (LSSD) task • Moved the unit of judgment from the whole input text to individual words • Tackled two problems: 1. how to collect words that can be used sarcastically (=: target words); 2. given a text and a target word, how to decide whether the target word is meant literally or sarcastically
  3. 2 Collection of Target Words 1. Via crowdsourcing, generate a corpus in which

    sarcastic tweets are rewritten into literal expressions 2. Apply unsupervised alignment (word alignment) to the resulting parallel corpus to collect sarcastic/literal word pairs
  4. 2 Collection of Target Words Example: "I love going to the

    dentist" vs. "I hate going to the dentist". 1. Turkers rephrase the tweets 2. Unsupervised alignment finds the pairs 3. The word that appeared in the original tweet is added to the target words. (1) Crowdsourcing: 1,000 tweets from searching for #sarcasm/#sarcastic, rephrased by 5 Turkers into 5,000 tweets
  5. 2 Collection of Target Words (2) Unsupervised alignment 1. Co-training

    algorithm for paraphrase detection (Barzilay and McKeown, 2001) 2. Statistical machine translation alignment (IBM Model 4 with HMM alignment in GIZA++; Och and Ney, 2000) → 367 pairs → 70 pairs retained (# of t; threshold φ ≥ 0.8)
  6. 3 Literal/Sarcastic Sense Disambiguation • Given an input text and a target word (t),

    estimate whether the meaning of t is sarcastic (S) or literal (L) • Training requires example sentences in which each t is used in the S sense and in the L sense (a minimal interface sketch follows)
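To make the task's input and output concrete, here is a minimal interface sketch in Python; the function name and types are illustrative assumptions, not the paper's code.

```python
from typing import Literal

Sense = Literal["S", "L"]  # sarcastic / literal

def disambiguate(text: str, target: str) -> Sense:
    """LSSD: given an utterance and a target word that occurs in it,
    decide whether the target is used sarcastically (S) or literally
    (L). Stub only; the distributional and classification approaches
    on the following slides are concrete ways to implement it."""
    raise NotImplementedError
```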
  7. 3.1 Data Collection • Tweets containing the target words were retrieved; those with #sarcasm/

    #sarcastic were labeled S, the rest L • Among the L tweets, those whose sentiment (positive/negative) is labeled with a hashtag form Lsent • Distinguishing S from Lsent is said to be harder (González-Ibáñez et al., 2011). A labeling sketch follows.
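A minimal sketch of this hashtag-based labeling; the sentiment hashtag list is a hypothetical stand-in for whatever lexicon the authors used.

```python
SARCASM_TAGS = {"#sarcasm", "#sarcastic"}
SENTIMENT_TAGS = {"#happy", "#love", "#sad", "#angry"}  # hypothetical list

def label_tweet(tweet: str) -> str:
    """Return 'S' (sarcastic), 'Lsent' (literal with an explicit
    sentiment hashtag), or 'L' (plain literal)."""
    tags = {tok.lower() for tok in tweet.split() if tok.startswith("#")}
    if tags & SARCASM_TAGS:
        return "S"
    if tags & SENTIMENT_TAGS:
        return "Lsent"
    return "L"

print(label_tweet("I love going to the dentist #sarcasm"))  # -> S
```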
  8. 3.1 Data Collection • A total of 2,542,249 tweets were collected and split into

    train : dev : test = 8 : 1 : 1 (see the sketch below) • Of the 70 target words, those with fewer than 400 training examples were dropped, leaving 37 for the experiments • [Figure: target words and their number of training examples (in parentheses)]
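A minimal sketch of the 8:1:1 split and the ≥ 400-examples filter, assuming per-target lists of tweets; the helper names are hypothetical.

```python
import random

def split_8_1_1(tweets, seed=0):
    """Shuffle and split into train/dev/test = 8:1:1."""
    rng = random.Random(seed)
    tweets = list(tweets)
    rng.shuffle(tweets)
    n_train, n_dev = int(0.8 * len(tweets)), int(0.1 * len(tweets))
    return (tweets[:n_train],
            tweets[n_train:n_train + n_dev],
            tweets[n_train + n_dev:])

def filter_targets(train_by_target, min_train=400):
    """Keep only target words with at least min_train training tweets
    (70 targets -> 37 in the paper)."""
    return {t: d for t, d in train_by_target.items() if len(d) >= min_train}
```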
  9. 3.2 Learning Approaches • Distributional approach: use distributed representations of t • Classification approach: treat the task as an S/L

    classification problem per t
  10. 3.2.1 Distributional Approaches 1. For each t, build word vectors from the S and L datasets separately (vs

    , vl ) 2. From the input text u, build a vector for t as used in u (vu ) 3. Decide whether vu is closer to vs or to vl by some vector similarity measure (see the sketch below)
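The decision rule of step 3, sketched with cosine similarity (one of several similarity choices here; the MVME-based variant appears on a later slide):

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict_sense(v_u, v_s, v_l) -> str:
    """Label t in u as sarcastic ('S') or literal ('L') depending on
    whether v_u is closer to v_s or to v_l."""
    return "S" if cosine(v_u, v_s) >= cosine(v_u, v_l) else "L"
```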
  11. 3.2.1 Distributional Approaches • PPMI baseline (Church and Hanks, 1990)

    • Word Embeddings: • WTMF (Guo and Diab, 2012) • word2vec (Mikolov et al., 2013) • GloVe (Pennington et al., 2014)
  12. 3.2.1 Distributional Approaches PPMI: Positive Pointwise Mutual Information model •

    For each t, extract up to the top 1,000 context words by PPMI from the S and L datasets separately, and build a vector whose values are their TF-IDF scores (see the sketch below) • Vector similarity is measured by cosine similarity
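A simplified sketch of the PPMI ranking step, assuming precollected co-occurrence counts (the TF-IDF weighting of the selected context words is omitted):

```python
import math
from collections import Counter

def top_ppmi_contexts(cooc, word_freq, t_freq, total, k=1000):
    """Rank the context words of a target t by PPMI and keep the top k.
    cooc[c]: co-occurrence count of c with t; word_freq[c]: corpus
    frequency of c; t_freq: frequency of t; total: corpus size."""
    scores = {}
    for c, n_tc in cooc.items():
        pmi = math.log((n_tc * total) / (t_freq * word_freq[c]))
        if pmi > 0:  # the "positive" part of PPMI
            scores[c] = pmi
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(top_ppmi_contexts(cooc=Counter({"dentist": 5, "the": 4}),
                        word_freq=Counter({"dentist": 10, "the": 5000}),
                        t_freq=50, total=10000))
# -> ['dentist'] ("the" has negative PMI and is dropped)
```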
  13. [Figure: target words and their context words (per dataset split)]

  14. 3.2.1 Distributional Approaches Word Embeddings • For all three algorithms, the vector dimensionality is fixed at 100

    • Larger datasets often call for higher dimensionality, but 100 is typical for data on the order of a million items • For word2vec, skip-gram and CBOW are each applied (see the sketch below)
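A minimal training sketch with gensim (assuming version >= 4); the toy sentences are illustrative stand-ins for the collected tweet datasets the paper trains on.

```python
from gensim.models import Word2Vec

sentences = [["i", "love", "going", "to", "the", "dentist"],
             ["i", "hate", "going", "to", "the", "dentist"]]

# vector_size fixed at 100 as in the paper; sg=1 -> skip-gram, sg=0 -> CBOW.
skipgram = Word2Vec(sentences, vector_size=100, sg=1, min_count=1)
cbow = Word2Vec(sentences, vector_size=100, sg=0, min_count=1)

print(skipgram.wv["love"].shape)  # (100,)
```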
  15. 3.2.1 Distributional Approaches Word Embeddings • For vector similarity, MVMEwe is applied: the maximum-valued matrix-element (MVME)

    algorithm (Islam and Inkpen, 2008) adapted to word embeddings • MVME is a method that aligns words between two sentences based on the similarities of all word pairs • To compute the similarity between two distributed representations, it uses a matrix of the similarities between all pairs of the two sides' context-word vectors
  16. [Worked example: suppose u = "I love going to the dentist" (t = "love"). Build a matrix M

    of similarities between the context words ck of "love" in S ("ignored", "being", "waking", "work", "sick", ...) and the words wj in u ("I", "going", "to", "the", "dentist"). Repeatedly take the maximum element of M (Sim = 0.8, then 0.3, ...) and delete its row and column; repeat until max = 0 or size(M) = (0, 0). A code sketch follows.]
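A sketch of the greedy MVME loop from the example above; the final aggregation (mean of the picked maxima) is an assumption, as Islam and Inkpen (2008) normalize differently.

```python
import numpy as np

def mvme_similarity(M: np.ndarray) -> float:
    """Repeatedly take the largest entry of M, record it, and delete
    its row and column, until the maximum is <= 0 or M is empty."""
    M = M.astype(float).copy()
    picked = []
    while M.size and M.max() > 0:
        i, j = np.unravel_index(np.argmax(M), M.shape)
        picked.append(M[i, j])
        M = np.delete(np.delete(M, i, axis=0), j, axis=1)
    return float(np.mean(picked)) if picked else 0.0

# Toy matrix loosely following the slide (rows: context words of
# "love" in S; columns: words of u).
M = np.array([[-0.9, 0.8, 0.5, 0.3, 0.5],
              [0.3, -0.9, -0.9, 0.1, -0.1]])
print(mvme_similarity(M))  # picks 0.8, then 0.3 -> 0.55
```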
  17. 3.2.2 Classification Approaches • For each of the 37 target words, both S/L and S/Lsent

    classification is performed (37 × 2 classifiers) • The libSVM toolkit (Chang and Lin, 2011) is used as the classifier • Hyperparameters are tuned on the development data
  18. 3.2.2 Classification Approaches • SVM Baseline: a typical implementation of prior approaches • SVM with

    MVMEwe Kernel: replaces the kernel with MVMEwe
  19. 3.2.2 Classification Approaches SVM Baseline • n-gram and lexicon-based binary features • Bag-of-Words

    word vectors • LIWC lexicon categories (Pennebaker et al., 2001) • Presence of interjections, punctuation marks, and emoticons obtained from Wikipedia • Based on the features proposed by González-Ibáñez et al., 2011 and Tchokni et al., 2014
  20. 3.2.2 Classification Approaches SVM with MVMEwe Kernel • Applies kernelwe, a kernel built the same way as

    MVMEwe • The similarity between two tweets is computed from the matrix of similarities between all pairs of word-embedding vectors of the words in the two tweets (a minimal sketch follows) • The same three kinds of word embeddings as in the distributional approach are each applied
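A minimal sketch of an SVM over such a tweet-pair similarity, using scikit-learn's precomputed-kernel interface as a stand-in for the libSVM binaries used in the paper and reusing mvme_similarity from the slide-16 sketch; emb, the tweet lists, and y_train are hypothetical placeholders.

```python
import numpy as np
from sklearn.svm import SVC

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def kernel_we(tweets_a, tweets_b, emb):
    """Gram matrix whose (i, j) entry is the MVME-style similarity of
    tweet i and tweet j over all pairs of their word embeddings."""
    K = np.zeros((len(tweets_a), len(tweets_b)))
    for i, ta in enumerate(tweets_a):
        for j, tb in enumerate(tweets_b):
            M = np.array([[cosine(emb[w], emb[v]) for v in tb] for w in ta])
            K[i, j] = mvme_similarity(M)
    return K

# Hypothetical usage (emb: word -> vector; y_train: S/L labels):
# clf = SVC(kernel="precomputed").fit(kernel_we(train_tweets, train_tweets, emb), y_train)
# preds = clf.predict(kernel_we(test_tweets, train_tweets, emb))
```

Note that a Gram matrix built from such an ad-hoc similarity is not guaranteed to be positive semidefinite, so this is a heuristic kernel rather than a formally valid one.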
  21. 4 Results and Discussions [Figure: distributional approaches]

  22. 4 Results and Discussions [Figure: classification approaches]

  23. • The conventional SVM favors target words that had plenty of training data • Methods using word

    embeddings generally achieve high performance even with little training data [Figures: ↑ distributional approaches, ↓ classification approaches; target words and their number of training examples (in parentheses)]
  24. Presenter's Comments • Achieves performance gains with a solid, non-end-to-end approach • Every individual step is done carefully • The paper's structure and writing style were also instructive

    • The research appears to have been funded by DARPA