Introduction to CBoW

Kento Nozawa
April 21, 2016

Material for the machine learning study group held on April 22, 2016.
Introductory slides on Continuous Bag of Words.

Transcript

  1. Today's topics
    • Multilayer perceptron (MLP)
    • Continuous Bag of Words: one of the two models in word2vec
    • Speed-up techniques and NG are not covered
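  2. A concrete multilayer perceptron example
    • Consider a world with only four words: [jobs, mac, win8, ms]
    • Input: a document
    • Output: a probability (whether the input document is about "mac" or "windows")
    [Figure: input layer (jobs, mac, win8, ms), hidden layer (h1, h2, h3), output layer (softmax) with p(mac)=0.2, p(win)=0.8]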
  3. Example: input layer
    The frequency of each word in the document is the value fed to the input layer.
    • doc0: [win8, win8, ms, ms, ms, jobs] -> ms
    • doc1: [jobs, mac, mac, mac, mac, mac, mac] -> mac
    [Figure: doc0 inputs jobs=1, mac=0, win8=2, ms=3; doc1 inputs jobs=1, mac=6, win8=0, ms=0]
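The input values for doc0 and doc1 are plain word counts in vocabulary order. A minimal sketch, assuming the vocabulary order [jobs, mac, win8, ms] from the slide; the helper name `count_vector` is made up for this illustration:

```python
from collections import Counter

# Vocabulary order as listed on the slide.
VOCAB = ["jobs", "mac", "win8", "ms"]

def count_vector(doc, vocab=VOCAB):
    """Return the frequency of each vocabulary word in `doc`, in vocabulary order."""
    counts = Counter(doc)
    return [counts[word] for word in vocab]

doc0 = ["win8", "win8", "ms", "ms", "ms", "jobs"]
doc1 = ["jobs"] + ["mac"] * 6

x_doc0 = count_vector(doc0)  # jobs=1, mac=0, win8=2, ms=3
x_doc1 = count_vector(doc1)  # jobs=1, mac=6, win8=0, ms=0
print(x_doc0, x_doc1)
```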
  4. Example: hidden layer
    The input-to-hidden weight matrix W is a 3x4 matrix. The hidden layer receives h, the weighted sum of the input layer's outputs. For doc0:
    $$Wx = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 1 & 2 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 7 \\ 9 \\ 6 \end{bmatrix} = h$$
    (The slide prints 5 for the third component, and its figures accordingly show f(5); with the W and x shown, the value is 6.)
    [Figure: doc0 network with inputs jobs=1, mac=0, win8=2, ms=3 and hidden units f(5)=0.99, f(7)=0.99, f(9)=0.99]
  5. Example: hidden layer
    Same W, x, and h as slide 4.
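The weighted sum for doc0 can be checked with a few lines of plain Python. This is a sketch that uses W and x as they appear in the transcript; `matvec` is a helper name made up here. Computed this way, the third component comes out as 6, where the slide prints 5.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

# Input-to-hidden weights W (3x4) and the doc0 count vector x, as transcribed.
W = [
    [1, 2, 3, 0],
    [1, 2, 1, 2],
    [1, 1, 1, 1],
]
x_doc0 = [1, 0, 2, 3]  # jobs, mac, win8, ms

h = matvec(W, x_doc0)
print(h)
```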
  6. Example: hidden layer
    The hidden layer emits its output through an activation function f(x). Example function: the sigmoid function.
    [Figure: sigmoid curve. By Chrislb - created by Chrislb, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=223990]
    [Figure: doc0 network with hidden outputs f(5)=0.99, f(7)=0.99, f(9)=0.99]
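As a quick check of the hidden outputs, the sigmoid can be evaluated at the three pre-activations shown in the figure. A minimal sketch; all three values lie between 0.99 and 1, which the slide displays as 0.99:

```python
import math

def sigmoid(x):
    """Logistic sigmoid: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Pre-activations shown in the slide's figure.
hidden_out = [sigmoid(v) for v in (5, 7, 9)]
print(hidden_out)
```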
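  7. Example: output layer
    The hidden-to-output weight matrix W' is a 2x3 matrix. The output layer receives the weighted sum of the hidden layer's outputs:
    $$W' f(h) = u_o$$
    The slide multiplies a 2x3 matrix whose rows have entries of magnitude 1, 1, 1.01 by f(h) = (0.99, 0.99, 0.99)^T and prints a result of magnitude (1.0, 1.0)^T; the minus signs did not survive extraction.
    [Figure: doc0 network with hidden outputs f(5)=0.99, f(7)=0.99, f(9)=0.99 and output units receiving -0.1 and 0.1]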
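  8. Activation function of the output layer
    The output layer's activation function is the softmax function, which outputs probabilities:
    $$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_n e^{x_n}}$$
    $$p(\mathrm{win}) = \frac{e^{0.1}}{e^{0.1} + e^{-0.1}} \approx 0.55, \qquad p(\mathrm{mac}) = \frac{e^{-0.1}}{e^{0.1} + e^{-0.1}} \approx 0.45$$
    The slide prints these as 0.54 and 0.46: doc0 (= [win8, win8, ms, ms, ms, jobs]) is a win document with probability 0.54.
    [Figure: doc0 network with output units -0.1 and 0.1, p(mac)=0.46, p(win)=0.54]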
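The softmax step can be reproduced directly from the two output values in the figure. A minimal sketch; subtracting the maximum before exponentiating is a common numerical-stability habit, not something the slide mentions. The exact value of p(win) is about 0.5498, which the slide shows as 0.54.

```python
import math

def softmax(values):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(values)  # subtracting the max does not change the result
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Output-layer inputs from the figure, in the order [mac, win].
p_mac, p_win = softmax([-0.1, 0.1])
print(p_mac, p_win)
```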
  9. Training
    • Use backpropagation to adjust the weights W and W' so that the probability of doc0 being win increases
    • For doc0, the error is computed from the correct label [0, 1]
    [Figure: doc0 network with p(mac)=0.46, p(win)=0.54]
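To make the training step concrete, here is a sketch of a single gradient update on the hidden-to-output weights only. It rests on assumptions the slides do not state: a cross-entropy loss (for which the error at the softmax input is y - t), a learning rate of 0.1, and a hypothetical W' chosen so the output scores are close to the figure's -0.1 and 0.1. Full backpropagation would update W as well.

```python
import math

def softmax(values):
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(w_out, hidden):
    """Output probabilities [p(mac), p(win)] for the given hidden outputs."""
    u_o = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return softmax(u_o)

hidden = [0.99, 0.99, 0.99]  # f(h) as shown on the slides
target = [0.0, 1.0]          # correct label for doc0, order [mac, win]

# Hypothetical W' (NOT the slide's values), giving u_o of about [-0.1, 0.1].
w_out = [[-0.05, -0.05, 0.0],
         [0.05, 0.05, 0.0]]
learning_rate = 0.1          # assumed value

y_before = forward(w_out, hidden)
# With softmax + cross-entropy, the error at the output scores is y - t.
delta = [y - t for y, t in zip(y_before, target)]
# Gradient of the loss w.r.t. W'[k][j] is delta[k] * hidden[j].
w_out = [[w - learning_rate * d * h for w, h in zip(row, hidden)]
         for row, d in zip(w_out, delta)]
y_after = forward(w_out, hidden)
print(y_before[1], y_after[1])  # p(win) before and after the update
```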
  10. Continuous Bag of Words: input layer
    The MLP's input layer corresponds to a single box in the input layer of the CBoW figure.
    [Figure: the doc0 network from the earlier slides, labeled "MLP"]
  11. Continuous Bag of Words: input layer
    • Each box receives a one-hot representation
    • For "I drink coffee everyday" with w(t) = coffee, the red part of the figure takes the value drink = [0, 1, 0, 0]
  12. Continuous Bag of Words: input layer
    • I = [1, 0, 0, 0]
    • drink = [0, 1, 0, 0]
    • everyday = [0, 0, 0, 1]
    [Figure: the three context words feed the network; the word to predict is coffee]
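  13. Continuous Bag of Words: input-to-hidden weights
    • Each arrow carries a weight matrix
    • This weight matrix $W_{N \times V}$ is shared across the arrows
    $$Wx = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 1 & 2 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} = u_{t-1}$$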
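Multiplying W by a one-hot vector simply picks out one column of W, which is why each word ends up with its own N-dimensional vector. A sketch of that lookup, assuming the vocabulary order [I, drink, coffee, everyday] (inferred from the one-hot vectors on the slides) and W as transcribed; the helper names are made up here:

```python
# Assumed vocabulary order, inferred from the one-hot vectors on the slides.
VOCAB = ["I", "drink", "coffee", "everyday"]

# Shared input-to-hidden weights W (N=3 rows, V=4 columns), as transcribed.
W = [
    [1, 2, 3, 0],
    [1, 2, 1, 2],
    [1, 1, 1, 1],
]

def one_hot(word, vocab=VOCAB):
    """Vector with a 1 at the word's vocabulary index and 0 elsewhere."""
    return [1 if w == word else 0 for w in vocab]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def column(matrix, j):
    return [row[j] for row in matrix]

u_prev = matvec(W, one_hot("drink"))          # u_{t-1}
same_as_lookup = u_prev == column(W, VOCAB.index("drink"))
print(u_prev, same_as_lookup)
```

In practice this is why implementations store the word vectors as a table and index into it instead of performing the full multiplication.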
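  14. Continuous Bag of Words: hidden layer
    • The average of the context word vectors is the input to the hidden layer (an N-dimensional vector)
    • No activation function
    $$\frac{u_{t-2} + u_{t-1} + u_{t+1}}{3} = h$$
    $$\frac{1}{3}\left( \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ 1.67 \\ 0.33 \end{bmatrix}$$
    (As extracted, the third components average to 1.00; the slide's 0.33, which the following slides reuse, suggests that a minus sign on one of them was lost in extraction.)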
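The averaging step is short enough to check by hand or in code. A sketch using the three context vectors as they appear in the transcript; `average` is a helper name made up here. With these values the third component comes out as 1.0 where the slide prints 0.33, so the transcript has probably lost a minus sign somewhere in that row.

```python
def average(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]

# u_{t-2}, u_{t-1}, u_{t+1} as transcribed (signs may have been lost).
context_vectors = [
    [1, 1, 1],
    [2, 2, 1],
    [0, 2, 1],
]

h = average(context_vectors)
print([round(v, 2) for v in h])
```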
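  15. Continuous Bag of Words: hidden-to-output
    The output scores are the product of the weight matrix $W'_{V \times N}$ and the hidden layer's output (the average vector):
    $$W'h = \begin{bmatrix} 1 & 2 & -1 \\ -1 & 2 & -1 \\ 1 & 2 & 2 \\ 0 & 2 & 0 \end{bmatrix} \begin{bmatrix} 1.00 \\ 1.67 \\ 0.33 \end{bmatrix} = \begin{bmatrix} 4.01 \\ 2.01 \\ 5.00 \\ 3.34 \end{bmatrix} = u_o$$
    (The minus signs in W' are restored from the arithmetic; the extracted text shows only magnitudes.)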
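  16. Continuous Bag of Words: output layer
    We want to predict a single word.
    • Number of output units = vocabulary size = V
    • Activation function: softmax
    $$\mathrm{softmax}(u_o) = y$$
    $$\mathrm{softmax}\left( \begin{bmatrix} 4.01 \\ 2.01 \\ 5.00 \\ 3.34 \end{bmatrix} \right) = \begin{bmatrix} 0.23 \\ 0.03 \\ 0.62 \\ 0.12 \end{bmatrix}$$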
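The last two steps (output scores, then softmax) can be sketched end to end. Assumptions: the signs in W' are inferred from the slide's arithmetic because the transcript shows only magnitudes, h is taken as printed on the slide, and the vocabulary order [I, drink, coffee, everyday] is inferred from the one-hot vectors.

```python
import math

VOCAB = ["I", "drink", "coffee", "everyday"]  # assumed order

# Hidden-to-output weights W' (V=4 rows, N=3 columns); signs inferred.
W_OUT = [
    [1, 2, -1],
    [-1, 2, -1],
    [1, 2, 2],
    [0, 2, 0],
]
h = [1.00, 1.67, 0.33]  # hidden vector as printed on the slide

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(values):
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

u_o = matvec(W_OUT, h)   # about [4.01, 2.01, 5.00, 3.34]
y = softmax(u_o)         # about [0.23, 0.03, 0.62, 0.12]
predicted = VOCAB[y.index(max(y))]
print([round(v, 2) for v in y], predicted)
```

Under the assumed vocabulary order, the highest probability falls on the third word, coffee, which is the word the example sets out to predict.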
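  17. Nice properties of word vectors
    • Analogy
        • king - man + woman = queen
        • Japan - Tokyo + Paris = France
        • eats - eat + run = runs
    • Features for words
    • Initial values for deep learning
    • Similarity computation
    • nzw's first paper was on this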
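The analogy property is usually evaluated by vector arithmetic followed by a nearest-neighbour search under cosine similarity. The sketch below uses hand-made 2-dimensional toy vectors (hypothetical, chosen so the arithmetic works out exactly; they are not trained embeddings), and the function names are made up for this illustration.

```python
import math

# Hypothetical toy vectors: (royalty, gender). NOT trained word2vec vectors.
TOY_VECTORS = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.0],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c, vectors=TOY_VECTORS):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))

result = analogy("king", "man", "woman")
print(result)
```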
  18. References
    • gensim: https://radimrehurek.com/gensim/ (Python; convenient, with many functions)
    • chainer: https://github.com/pfnet/chainer/tree/master/examples/word2vec (Python; example implementation as a neural network)
    • word2vec: https://code.google.com/archive/p/word2vec/ (C; the original)
    • word2vec Parameter Learning Explained: http://arxiv.org/pdf/1411.2738v3.pdf (English; an easy-to-follow explanation)
    • Efficient Estimation of Word Representations in Vector Space: http://arxiv.org/pdf/1301.3781.pdf (English; the original CBoW paper. The CBoW figure in these slides comes from it)
    • 深層学習 Deep Learning. 人工知能学会. (Japanese; book)