Introduction to Statistical Semantics: From the Distributional Hypothesis to word2vec (@PFI Seminar)

Yuya Unno
February 06, 2014

Transcript

  1. Natural language processing [Bird+10], Chapter 10 "Analyzing the Meaning of Sentences":
     10.1 Natural Language Understanding, 10.2 Propositional Logic, 10.3 First-Order Logic,
     10.4 The Semantics of English Sentences, 10.5 Discourse Semantics, 10.6 Summary,
     10.7 Further Reading, 10.8 Exercises
  2. Distributional Hypothesis
     - Words that occur in the same contexts tend to have the same meaning.
     - Every method covered in this talk assumes this hypothesis to some degree.
     - "The Distributional Hypothesis is that words that occur in the same contexts tend to have similar meanings (Harris, 1954)." (ACL wiki)
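The distributional hypothesis can be made concrete with a toy sketch (the words and co-occurrence counts below are invented for illustration): words seen in similar contexts end up with similar context-count vectors, which cosine similarity then exposes.

```python
# Toy illustration of the distributional hypothesis: each target word
# is represented by how often it co-occurs with a few context words.
# All counts here are hypothetical example data.
import math

# rows of hypothetical counts against the contexts ["drink", "pour", "bark"]
vectors = {
    "coffee": [5, 3, 0],
    "tea":    [4, 4, 0],
    "dog":    [0, 0, 6],
}

def cosine(u, v):
    # cosine similarity between two count vectors
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

print(cosine(vectors["coffee"], vectors["tea"]))  # high: shared contexts
print(cosine(vectors["coffee"], vectors["dog"]))  # zero: disjoint contexts
```

Words sharing contexts ("coffee", "tea") score near 1; words with disjoint contexts score 0.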
  3. What is Statistical Semantics?
     - Deriving the meaning of words from their statistical behavior.
     - "Statistical Semantics is the study of how the statistical patterns of human word usage can be used to figure out what people mean, at least to a level sufficient for information access." (ACL wiki)
  4. Concrete things this could enable
     - Synonyms, antonyms, near-synonyms: relations between word meanings become computable.
     - Polysemous words: which sense a word is used in can be read off its context.
     - Unknown and new words: their meaning can be inferred from the surrounding context.
     - Translation equivalents: correspondences between words of different languages become visible.
  5. Why work on this?
     - Essentially no hand-labeled supervision is needed, so it applies wherever raw language data is available.
     - It may shore up weak spots in PFI's products: synonym and related-term handling for search queries, reducing supervised data for document classification, translation dictionaries needed for cross-lingual search.
     - Remarkable results have been published in this area over the past year.
  6. Roughly, every approach performs three steps
     - Linguistic preprocessing. ex: undoing inflection, case folding, function words, etc.
     - Represent each word by the contexts it occurs in. ex: document co-occurrence, sentence co-occurrence, dependency relations, word-to-word relations, etc.
     - Re-encode the word representation more compactly. ex: matrix factorization, probabilistic factorization, NN factorization, etc. (also described as dimensionality reduction, probabilistic models, NN models)
  7. Latent Semantic Indexing (LSI) / Latent Semantic Analysis (LSA) [Deerwester+90]
     - Born in the information-retrieval community; it turns up in almost any textbook you open, which shows how deep the ties between search and NLP run.
     - Represent each word by the set of documents it occurs in.
     - The resulting word vectors are surely redundant, so their dimensionality can be compressed.
  8. How LSI works
     - Build the matrix X counting how many times each word appears in each document, and factor it with the singular value decomposition (SVD): X = U Σ Vᵀ.
     - Truncating to the top k singular values, row i of U represents word i in k dimensions.
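The LSI step above can be sketched in a few lines, assuming numpy (the 5-word, 4-document count matrix is made up for the example): factor the word-by-document matrix with SVD, keep the top-k directions, and compare words in the reduced space.

```python
# Minimal LSI sketch: word i is represented by the i-th row of U,
# scaled by the singular values and truncated to k dimensions.
# The count matrix below is hypothetical example data.
import numpy as np

# rows: "stock", "market", "game", "team", "report"; columns: 4 documents
X = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
    [1, 1, 1, 1],
], dtype=float)

U, S, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
word_vecs = U[:, :k] * S[:k]  # k-dimensional word representations

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cos(word_vecs[0], word_vecs[1]))  # "stock" vs "market": high
print(cos(word_vecs[0], word_vecs[2]))  # "stock" vs "game": much lower
```

Words that co-occur in the same documents stay close after the rank-k truncation, while words from disjoint document sets stay far apart.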
  9. Probabilistic Latent Semantic Indexing (PLSI) [Hofmann99]
     - A model that yields LSI-like interpretations of meaning.
     - Different words become likely under different topics. ex: sports topics and politics topics favor different vocabulary.
     - Intuition: just as LSI drops words into a low-dimensional space, treat both words and documents as collections of topics, designed so that the whole thing becomes a probabilistic model.
  10. A slightly more careful account of PLSI
     - Each document comes with a fixed distribution over topics. ex: "election and politics words are likely here".
     - For each word: draw a topic according to the document's topic distribution, then draw one word from that topic.
     - Repeating this is assumed to have generated the document.
  11. Strengths of the language-model family
     - Being probabilistic models, they combine well with other probabilistic methods. ex: statistical information retrieval, statistical machine translation, etc.
     - Probability values get a naturally interpretable meaning.
     - Non-negativity is guaranteed.
     - Because the output distribution is normalized, natural sentences should come out with higher probability.
     - Because scores are designed to sum to exactly 1.0, the absolute value of a score carries meaning.
  12. Neural Network Language Model (NNLM) [Bengio+03]
     - An n-gram language model, turned into a neural network.
     - Trains a neural network as a probabilistic model that predicts the next word from the preceding N-1 words.
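The NNLM prediction step can be sketched as a single forward pass, assuming numpy; all dimensions and weights below are made-up toy values, not Bengio et al.'s trained parameters. Each of the N-1 preceding words is looked up in an embedding table, the embeddings are concatenated, and a hidden layer plus softmax scores the next word.

```python
# Minimal forward pass of an NNLM-style predictor over a toy vocabulary.
import numpy as np

rng = np.random.default_rng(0)
V, d, h, n_prev = 10, 4, 8, 2   # vocab size, embed dim, hidden dim, N-1

C = rng.normal(size=(V, d))             # word embedding table
H = rng.normal(size=(h, n_prev * d))    # hidden-layer weights
U = rng.normal(size=(V, h))             # output-layer weights

def next_word_probs(prev_ids):
    # concatenate the embeddings of the preceding N-1 words
    x = np.concatenate([C[i] for i in prev_ids])
    hidden = np.tanh(H @ x)
    scores = U @ hidden
    e = np.exp(scores - scores.max())   # numerically stable softmax
    return e / e.sum()

p = next_word_probs([3, 7])
print(p.shape, float(p.sum()))
```

Training would adjust C, H, and U to maximize the probability of the observed next word; the embedding table C is where the word vectors live.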
  13. Recurrent Neural Network Language Model (RNNLM) [Mikolov+10]
     - Encode the state after reading words 1..t-1 as a vector, and predict word t from that state.
     - NNLM predicted the next word from the vectors of the immediately preceding N words; the difference here is that the state packs in context information from the entire prefix.
     - http://rnnlm.org
     - (figure: word input and hidden layer feeding the prediction of the next word, with the hidden layer copied over to the next time step)
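The recurrence that distinguishes the RNNLM from the fixed-window NNLM can be sketched as follows, assuming numpy (shapes and weights are toy values, not the rnnlm toolkit's parameters): the hidden state is updated word by word, so by the time word t is predicted it summarizes the whole prefix.

```python
# RNNLM-style recurrence sketch: s_t = tanh(embed(w_t) + W_s s_{t-1}),
# then a softmax over the vocabulary scores the next word.
import numpy as np

rng = np.random.default_rng(1)
V, h = 6, 5                     # toy vocab size and hidden size

E = rng.normal(size=(V, h))     # input word embeddings
Ws = rng.normal(size=(h, h))    # recurrent (state-to-state) weights
U = rng.normal(size=(V, h))     # output weights

def step(state, word_id):
    # fold one more word into the running context state
    return np.tanh(E[word_id] + Ws @ state)

state = np.zeros(h)
for w in [2, 4, 1]:             # read the prefix one word at a time
    state = step(state, w)

scores = U @ state              # score every candidate next word
probs = np.exp(scores - scores.max())
probs /= probs.sum()
print(probs.shape, float(probs.sum()))
```

Unlike the NNLM's concatenated window, nothing here limits how far back the state can remember; that is the "copied hidden layer" in the slide's figure.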
  14. The objective function of the skip-gram model [Mikolov+13b]
     - Input corpus: w_1, w_2, …, w_T, where each w_i is a word.
     - Maximize the average log-probability of each word's surrounding words:
       (1/T) Σ_{t=1..T} Σ_{-c ≤ j ≤ c, j ≠ 0} log p(w_{t+j} | w_t)
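The double sum in the skip-gram objective amounts to extracting (center, context) training pairs from a sliding window. A minimal sketch (the sentence and window size are made up for the example):

```python
# Enumerate the (w_t, w_{t+j}) pairs the skip-gram objective sums over:
# every word predicts its neighbors within +/- window positions.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for t, center in enumerate(tokens):
        for j in range(-window, window + 1):
            if j == 0:
                continue                      # skip j = 0 (the word itself)
            if 0 <= t + j < len(tokens):      # stay inside the sentence
                pairs.append((center, tokens[t + j]))
    return pairs

pairs = skipgram_pairs(["the", "cat", "sat", "on", "mat"], window=1)
print(pairs)
```

Each emitted pair contributes one log p(w_{t+j} | w_t) term to the objective; word2vec then trains the word vectors so these conditional probabilities are high.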
  15. Why word2vec felt like a breakthrough
     - Vector representations of words existed long before.
     - They supported similarity computations, but operations like adding and subtracting vectors had never worked before.
     - The language-model family implicitly models within-sentence co-occurrence, so it only captures coarse, topic-level structure.
     - In the matrix-factorization family, attention had been on giving individual scores a clean interpretation.
     - Computation over meanings was assumed to be something more symbolic, expressible only in combination with an artificially constructed linguistic system.
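The "adding and subtracting" that word2vec made work is the analogy arithmetic vec(king) - vec(man) + vec(woman) ≈ vec(queen). A toy illustration with hand-made 2-d vectors (not trained embeddings), assuming numpy:

```python
# Toy analogy arithmetic: the offset king - man encodes "royalty",
# so adding it to woman should land nearest to queen.
# All vectors below are hand-crafted for illustration.
import numpy as np

vecs = {
    "king":  np.array([0.9, 0.8]),
    "man":   np.array([0.9, 0.1]),
    "woman": np.array([0.1, 0.1]),
    "queen": np.array([0.1, 0.8]),
    "apple": np.array([0.5, 0.5]),   # unrelated distractor
}

query = vecs["king"] - vecs["man"] + vecs["woman"]

# nearest neighbor by Euclidean distance, excluding the query's inputs
best = min((w for w in vecs if w not in {"king", "man", "woman"}),
           key=lambda w: np.linalg.norm(vecs[w] - query))
print(best)  # queen
```

With trained word2vec embeddings the same arithmetic works over tens of thousands of words, which is exactly what earlier vector representations could not do.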
  16. What is striking about the NN family
     - Mysterious properties keep being uncovered one after another.
     - Concrete properties of meaning really are packed into the vector space: you can add, subtract, and walk along directions.
     - Almost all of this was published in 2013; Mikolov alone has put out paper after paper in that short span.
     - What is really going on will only become clear from here on.
  17. Summing up the differences
     - What counts as context?
       - Language-model family: the immediately preceding words.
       - Matrix-factorization family: document co-occurrence.
       - NN family: the preceding or surrounding words, dependency relations.
     - How is it normalized?
       - Language-model family: everything is reduced to conditional probabilities.
       - Matrix-factorization family: reduced to a low-rank factorization.
       - NN family: trained so that the objective function comes out best.
     - The methods differ in detail, but their essence is similar.
  18. Where this is heading
     - The theoretically clean form hiding behind the NN-family methods will be dug out: why did nothing work until now, and what is that clean underlying structure?
     - Perhaps a true semantic representation and true composition operations will emerge.
     - If the results can be approximated with matrix-factorization-style operations, that might cut computation dramatically and enable large-scale implementations.
     - These methods will work their way into upcoming applications.
  19. Summary
     - What is Statistical Semantics? The study of the relation between statistical properties (especially the contexts a word occurs in) and that word's meaning.
     - Three broad schools: the matrix-factorization family, the language-model family, and the neural-network family.
     - The NN-family methods deserve close attention: you can add, subtract, and walk along directions.
     - The recent NN language models are only just getting started.
  20. References 1: general
     - [Bird+10] Steven Bird, Ewan Klein, Edward Loper. 入門 自然言語処理 (Japanese translation of "Natural Language Processing with Python"). O'Reilly Japan, 2010.
     - [–Ƃ+96] (Japanese reference; author and title garbled in the transcript), 1996.
     - [Evert10] Stefan Evert. Distributional Semantic Models. NAACL 2010 Tutorial.
     - [Àč13] (Japanese reference; author and title garbled in the transcript), 2013.
     - [Deerwester+90] Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, Richard Harshman. Indexing by Latent Semantic Analysis. JASIS, 1990.
  21. References 2: language-model and matrix-factorization families
     - [Hofmann99] Thomas Hofmann. Probabilistic Latent Semantic Indexing. SIGIR, 1999.
     - [Blei+03] David M. Blei, Andrew Y. Ng, Michael I. Jordan. Latent Dirichlet Allocation. JMLR, 2003.
     - [Lee+99] Daniel D. Lee, H. Sebastian Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401, 1999.
     - [Ding+08] Chris Ding, Tao Li, Wei Peng. On the equivalence between Non-negative Matrix Factorization and Probabilistic Latent Semantic Indexing. Computational Statistics & Data Analysis, 52(8), 2008.
     - [Cruys10] Tim Van de Cruys. A Non-negative Tensor Factorization Model for Selectional Preference Induction. Natural Language Engineering, 16(4), 2010.
  22. References 3: NN family (1)
     - [Bengio+03] Yoshua Bengio, Réjean Ducharme, Pascal Vincent, Christian Jauvin. A Neural Probabilistic Language Model. JMLR, 2003.
     - [Mikolov+10] Tomas Mikolov, Martin Karafiat, Lukas Burget, Jan "Honza" Cernocky, Sanjeev Khudanpur. Recurrent neural network based language model. Interspeech, 2010.
     - [Mikolov+13a] Tomas Mikolov, Wen-tau Yih, Geoffrey Zweig. Linguistic Regularities in Continuous Space Word Representations. HLT-NAACL, 2013.
     - [Mikolov+13b] Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. CoRR, 2013.
  23. References 4: NN family (2)
     - [Mikolov+13c] Tomas Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. NIPS, 2013.
     - [Kim+13] Joo-Kyung Kim, Marie-Catherine de Marneffe. Deriving adjectival scales from continuous space word representations. EMNLP, 2013.
     - [Mikolov+13d] Tomas Mikolov, Quoc V. Le, Ilya Sutskever. Exploiting Similarities among Languages for Machine Translation. CoRR, 2013.