Introduction to CBoW

Kento Nozawa
April 21, 2016

Slides for the machine learning study group held on April 22, 2016.
An introductory slide deck on Continuous Bag of Words.


Transcript

  1. Introduction to Continuous Bag of Words @ Machine Learning Study Group, Friday, April 22, 2016. M1 Nozawa

  2. Today's topics
     • Multilayer perceptron (MLP)
     • Continuous Bag of Words: one of the two models in word2vec
     • Speed-up techniques and NG are not covered
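  3. Multilayer perceptron recap
     • Circle: takes one number, applies a function, and outputs one number (a circle is a unit; the function is the activation function)
     • Arrow: propagates the product of a unit's output and a weight (a number) to the next layer
     Training finds weights that give the correct answer as often as possible.
     [Figure: input layer x1 to x4, hidden layer h1 to h3, softmax output layer with outputs 0.2, 0.5, 0.3]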
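  4. A concrete MLP example
     • Consider a world with only 4 words: [jobs, mac, win8, ms]
     • Input: a document
     • Output: probabilities (whether the input document is "mac" or "windows")
     [Figure: input layer jobs, mac, win8, ms; hidden layer h1 to h3; softmax output layer with p(mac)=0.2, p(win)=0.8]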
  5. Example: input layer
     The frequency of each word is the value fed to the input layer.
     • doc0: [win8, win8, ms, ms, ms, jobs] -> ms
     • doc1: [jobs, mac, mac, mac, mac, mac, mac] -> mac
     [Figure: doc0 inputs jobs=1, mac=0, win8=2, ms=3; doc1 inputs jobs=1, mac=6, win8=0, ms=0]
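  6. Example: hidden layer
     The input-to-hidden weight matrix W is a 3x4 matrix.
     The hidden layer receives h, the sum of (input-layer output) x (weight).
     For doc0, with values as extracted from the slide (minus signs may have been lost in extraction; the third row as printed gives 6 rather than 5):
     $$Wx = h:\quad \begin{bmatrix} 1 & 2 & 3 & 0 \\ 1 & 2 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 7 \\ 9 \\ 5 \end{bmatrix}$$
     [Figure: doc0 inputs jobs=1, mac=0, win8=2, ms=3; hidden outputs f(5)=0.99, f(7)=0.99, f(9)=0.99]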
  8. Example: hidden layer
     The hidden layer outputs its input passed through the activation function f(x).
     Example function: the sigmoid function.
     [Figure: sigmoid curve. By Chrislb - created by Chrislb, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=223990]
     [Figure: doc0 inputs jobs=1, mac=0, win8=2, ms=3; hidden outputs f(5)=0.99, f(7)=0.99, f(9)=0.99]
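  9. Example: output layer
     The hidden-to-output weight matrix W' is a 2x3 matrix.
     The output layer receives the sum of (hidden-layer output) x (weight).
     For doc0, only the magnitudes of the entries survive extraction (minus signs were lost):
     $$W' f(h) = u_o:\quad \begin{bmatrix} 1 & 1 & 1.01 \\ 1 & 1 & 1.01 \end{bmatrix} \begin{bmatrix} 0.99 \\ 0.99 \\ 0.99 \end{bmatrix} = \begin{bmatrix} 1.0 \\ 1.0 \end{bmatrix}$$
     [Figure: doc0 inputs jobs=1, mac=0, win8=2, ms=3; hidden outputs f(5)=0.99, f(7)=0.99, f(9)=0.99; the two output units are labelled -0.1 and 0.1]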
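  10. Output-layer activation function
      Output-layer activation: the softmax function, which outputs probabilities.
      $$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_n e^{x_n}}$$
      $$p(\text{win}) = \frac{e^{0.1}}{e^{0.1} + e^{-0.1}} \approx 0.54, \qquad p(\text{mac}) = \frac{e^{-0.1}}{e^{0.1} + e^{-0.1}} \approx 0.46$$
      doc0 (= [win8, win8, ms, ms, ms, jobs]) is classified as a win document with probability 0.54.
      [Figure: doc0 network with output units -0.1 and 0.1, giving p(mac)=0.46, p(win)=0.54]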
  11. Training
      • Use backpropagation to adjust the weights W and W' so that the probability of doc0 being win increases
      • For doc0, the error is computed against the correct label [0, 1]
      [Figure: doc0 network with output units -0.1 and 0.1, giving p(mac)=0.46, p(win)=0.54]
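
The pieces of the MLP example on slides 5 to 10 can be sketched in plain Python. This is a minimal illustration, not the author's code: the helper names (`bag_of_words`, `sigmoid`, `softmax`) are made up for this sketch, and the hidden and output pre-activations are taken as printed on the slides rather than recomputed from W and W', whose signs did not survive extraction. The exact softmax of (-0.1, 0.1) comes out at roughly 0.45 and 0.55, within 0.01 of the 0.46 and 0.54 shown on slide 10.

```python
import math
from collections import Counter

VOCAB = ["jobs", "mac", "win8", "ms"]  # vocabulary order from slide 4


def bag_of_words(doc, vocab=VOCAB):
    """Count how often each vocabulary word occurs in the document."""
    counts = Counter(doc)
    return [counts[w] for w in vocab]


def sigmoid(x):
    """Logistic sigmoid, the example activation function from slide 8."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    """Softmax with the max subtracted first for numerical stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


doc0 = ["win8", "win8", "ms", "ms", "ms", "jobs"]
x = bag_of_words(doc0)  # input-layer values for doc0

# ASSUMPTION: pre-activations are copied from the slides, not computed from W.
h = [7, 9, 5]                         # hidden pre-activations (slide 6)
hidden_out = [sigmoid(v) for v in h]  # each is slightly above 0.99

u_o = [-0.1, 0.1]                     # output pre-activations (slide 9 figure)
p_mac, p_win = softmax(u_o)

print(x, [round(v, 4) for v in hidden_out], round(p_mac, 4), round(p_win, 4))
```

Training (slide 11) would then push `p_win` toward 1 for doc0 by adjusting the weights; that step is not shown here.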
  12. The CBoW algorithm: it should be easy once you understand the MLP...

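  13. One-hot representation
      • Represent a word as a vector whose dimension is the vocabulary size V
      • Only the corresponding dimension is 1; the rest are 0
      Example: for the vocabulary {I, drink, coffee, everyday}
      I = [1, 0, 0, 0]
      drink = [0, 1, 0, 0]
      coffee = [0, 0, 1, 0]
      everyday = [0, 0, 0, 1]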
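
A one-hot encoder for the four-word vocabulary above might look like the following sketch. The function name `one_hot` is an illustrative choice, not something from the slides.

```python
def one_hot(word, vocab):
    """Return a list of len(vocab) zeros with a 1 at the word's index."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec


vocab = ["I", "drink", "coffee", "everyday"]
encoded = {w: one_hot(w, vocab) for w in vocab}

for word, vec in encoded.items():
    print(word, "=", vec)
```

`vocab.index` raises `ValueError` for a word outside the vocabulary, which is usually what you want in a toy setting like this one.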
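  14. Context window size
      For one word of interest in a sentence, we work with the n words around it; n is called the context window size.
      Q. In "I drink coffee everyday", what is the Bag of Words that appears within a context window of 2 around coffee?
      A. [I, drink, everyday]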
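
The question above can be checked with a few lines of Python. This is a sketch under the assumption that the window counts positions on both sides of the word of interest and excludes the word itself; `context_words` is a hypothetical helper name.

```python
def context_words(tokens, target_index, window):
    """Words within `window` positions of the target, excluding the target."""
    start = max(0, target_index - window)
    end = min(len(tokens), target_index + window + 1)
    return [tokens[i] for i in range(start, end) if i != target_index]


sentence = "I drink coffee everyday".split()
context = context_words(sentence, sentence.index("coffee"), window=2)
print(context)
```

Only one word follows coffee in this sentence, so the right-hand side of the window is simply cut off at the sentence boundary.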
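  15. Continuous Bag of Words: overview
      • A 3-layer neural network
      • Input: the words that co-occur within the context window
      • Output: a probability distribution over the vocabulary for a single word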

  16. Continuous Bag of Words: input layer
      The MLP's input layer corresponds to one box of the input layer in the CBoW figure.
      [Figure: the MLP from the doc0 example, labelled MLP, shown next to the CBoW diagram]
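  17. Continuous Bag of Words: input layer
      • Each box receives a one-hot representation
      • In "I drink coffee everyday" with w(t) = coffee, the part highlighted in red takes the value drink = [0, 1, 0, 0]
      [Figure: CBoW diagram with coffee as the output word]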
  18. Continuous Bag of Words: input layer
      I = [1, 0, 0, 0]
      drink = [0, 1, 0, 0]
      everyday = [0, 0, 0, 1]
      [Figure: CBoW diagram with the three context words as inputs and coffee as the output word]
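  19. Continuous Bag of Words: input-to-hidden weights
      • Each arrow carries a weight matrix
      • This weight matrix is shared
      $$W_{N \times V}\, x = u_{t-1}:\quad \begin{bmatrix} 1 & 2 & 3 & 0 \\ 1 & 2 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix}$$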
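  20. Continuous Bag of Words: input-to-hidden weights
      • Each arrow carries a weight matrix
      • This weight matrix is shared
      • Because the input is one-hot, the word vector (one column of W) is propagated to the hidden layer
      $$W_{N \times V}\, x = u_{t-1}:\quad \begin{bmatrix} 1 & 2 & 3 & 0 \\ 1 & 2 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix}$$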
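
Why a one-hot input simply picks out a word vector can be seen in a short sketch. The matrix entries are the ones that appear in this transcript (any minus signs lost in extraction are not restored), and `matvec` is an illustrative helper, not part of any library.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# W has N = 3 rows (hidden units) and V = 4 columns (vocabulary words).
W = [
    [1, 2, 3, 0],
    [1, 2, 1, 2],
    [1, 1, 1, 1],
]

drink = [0, 1, 0, 0]                 # one-hot vector for "drink"
u_drink = matvec(W, drink)           # full matrix-vector product
column_1 = [row[1] for row in W]     # second column of W, read off directly

print(u_drink, column_1)
```

Since the product equals a column lookup, implementations typically skip the multiplication and index into the matrix instead.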
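  21. Continuous Bag of Words: hidden layer
      • The average of the word vectors is the input to the hidden layer (an N-dimensional vector)
      • No activation function
      $$h = \frac{u_{t-2} + u_{t-1} + u_{t+1}}{3} = \begin{bmatrix} 1 \\ 1.67 \\ 0.33 \end{bmatrix}$$
      The extracted addends read (1, 1, 1), (2, 2, 1) and (0, 2, 1); for the printed average to hold, one of the three third components must be -1, but its minus sign did not survive extraction.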
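  22. Continuous Bag of Words: hidden-to-output layer
      The product of the weight matrix W' (V x N) and the hidden-layer output (the average vector):
      $$W'_{V \times N}\, h = u_o:\quad \begin{bmatrix} 1 & 2 & -1 \\ -1 & 2 & -1 \\ 1 & 2 & 2 \\ 0 & 2 & 0 \end{bmatrix} \begin{bmatrix} 1.00 \\ 1.67 \\ 0.33 \end{bmatrix} = \begin{bmatrix} 4.01 \\ 2.01 \\ 5.00 \\ 3.34 \end{bmatrix}$$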
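  23. Continuous Bag of Words: output layer
      We want to predict a single word.
      • Number of output units = vocabulary size = V
      • Activation function: the softmax function
      $$\mathrm{softmax}(u_o) = y:\quad \mathrm{softmax}\left( \begin{bmatrix} 4.01 \\ 2.01 \\ 5.00 \\ 3.34 \end{bmatrix} \right) = \begin{bmatrix} 0.23 \\ 0.03 \\ 0.62 \\ 0.12 \end{bmatrix}$$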
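  24. Continuous Bag of Words: output layer
      The word probability distribution obtained by feeding in I, drink, everyday:
      $$y = \begin{bmatrix} 0.23 \\ 0.03 \\ 0.62 \\ 0.12 \end{bmatrix}$$
      In the vocabulary order {I, drink, coffee, everyday}, the third entry, 0.62, is the probability of coffee.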
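
The CBoW forward pass on slides 21 to 24 can be reproduced with the sketch below. Two things here are assumptions rather than facts from the slides: the minus sign needed for the printed average is placed on the last entry of the everyday vector (one of three placements that give the same average), and the signs in `W_out` are the ones that make the printed product on slide 22 come out. The helper names are illustrative.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(xs):
    """Softmax with the max subtracted first for numerical stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


vocab = ["I", "drink", "coffee", "everyday"]

# Word vectors of the context words (columns of W).
# ASSUMPTION: the -1 below is a restored sign; its true position is unknown.
context_vectors = [
    [1, 1, 1],    # u_{t-2}: I
    [2, 2, 1],    # u_{t-1}: drink
    [0, 2, -1],   # u_{t+1}: everyday
]

# Hidden layer: plain average of the context vectors, no activation function.
n = len(context_vectors)
h = [sum(component) / n for component in zip(*context_vectors)]

# W' (V x N). ASSUMPTION: signs chosen to match the printed u_o on slide 22.
W_out = [
    [1, 2, -1],
    [-1, 2, -1],
    [1, 2, 2],
    [0, 2, 0],
]

u_o = matvec(W_out, h)   # scores, one per vocabulary word
y = softmax(u_o)         # probability distribution over the vocabulary
predicted = vocab[y.index(max(y))]

print([round(v, 2) for v in h], [round(p, 2) for p in y], predicted)
```

Because this sketch keeps h unrounded, the scores are 4.0, 2.0, 5.0 and about 3.33 rather than the slide's 4.01, 2.01, 5.00 and 3.34; the softmax output still rounds to the same four probabilities.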
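  25. Word vectors as the result of training
      • The weight matrix between the input layer and the hidden layer is the set of word vectors
      • Each word is a dense vector of, say, 100 or 200 dimensions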

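  26. Nice properties of word vectors
      • analogy
      • king - man + woman = queen
      • Japan - Tokyo + Paris = France
      • eats - eat + run = runs
      • Word features
      • Initial values for deep learning
      • Similarity computation
      • nzw's first paper was on this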
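
The analogy arithmetic can be illustrated with a toy sketch. The 2-D vectors below are hand-made for this example and are not learned embeddings; real word2vec vectors have 100 or more dimensions and only approximately satisfy such relations. The `analogy` helper and the practice of excluding the three query words from the candidates are choices made for this illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy vectors (NOT trained): axes are roughly (royalty, gender).
toy = {
    "king": [1.0, 1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "queen": [1.0, -1.0],
    "apple": [-1.0, 0.0],
}


def analogy(a, b, c, vectors):
    """Return the word closest to a - b + c, skipping the three query words."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))


result = analogy("king", "man", "woman", toy)
print(result)
```

The same cosine function covers the similarity-computation use case listed on the slide.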
  27. References
      • gensim: https://radimrehurek.com/gensim/
        Python; convenient, with many functions
      • chainer: https://github.com/pfnet/chainer/tree/master/examples/word2vec
        Python; an example implementation as a neural network
      • word2vec: https://code.google.com/archive/p/word2vec/
        C; the original implementation
      • word2vec Parameter Learning Explained: http://arxiv.org/pdf/1411.2738v3.pdf
        English; an easy-to-follow explanation
      • Efficient Estimation of Word Representations in Vector Space: http://arxiv.org/pdf/1301.3781.pdf
        English; the original CBoW paper. The CBoW figure in these slides is taken from it
      • 深層学習 Deep Learning. The Japanese Society for Artificial Intelligence.
        Japanese; book