Kento Nozawa
April 21, 2016

# Introduction to CBoW

Slides for the machine learning study group on April 22, 2016.
An introductory deck on Continuous Bag of Words.


## Transcript

2. ### Today's topics
• Multilayer perceptron (MLP)
• Continuous Bag of Words (one of the two models in word2vec)
• Speed-up techniques and NG are not covered
3. ### Multilayer perceptron recap
• Circle: takes one number, applies a function, and outputs one number (each circle is a unit; the function is its activation function)
• Arrow: propagates the product of a unit's output and a weight to the next layer
We look for weights that get as many answers right as possible.
[Figure: input layer x1-x4, hidden layer h1-h3, output layer (softmax) with outputs 0.2, 0.5, 0.3]
4. ### MLP: a concrete example
• Consider a world with only 4 words: [jobs, mac, win8, ms]
• Input: a document
• Output: probabilities (whether the input document is "mac" or "windows")
[Figure: input units jobs, mac, win8, ms; hidden units h1-h3; output layer (softmax) p(mac)=0.2, p(win)=0.8]
5. ### Example: input layer
Each word's frequency is the input value at the input layer.
• doc0: [win8, win8, ms, ms, ms, jobs] -> ms
• doc1: [jobs, mac, mac, mac, mac, mac, mac] -> mac
[Figure: doc0 inputs jobs=1, mac=0, win8=2, ms=3; doc1 inputs jobs=1, mac=6, win8=0, ms=0]
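The frequency counting above can be sketched in Python (a minimal sketch; the 4-word vocabulary and documents are the ones from the slides):

```python
from collections import Counter

# The 4-word vocabulary from the slides
vocab = ["jobs", "mac", "win8", "ms"]

def doc_to_input(doc):
    """Turn a document (list of words) into a frequency vector over the vocabulary."""
    counts = Counter(doc)
    return [counts[w] for w in vocab]

doc0 = ["win8", "win8", "ms", "ms", "ms", "jobs"]
doc1 = ["jobs", "mac", "mac", "mac", "mac", "mac", "mac"]

print(doc_to_input(doc0))  # [1, 0, 2, 3]
print(doc_to_input(doc1))  # [1, 6, 0, 0]
```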
6. ### Example: hidden layer
The input-to-hidden weight matrix W is a 3x4 matrix.
The hidden layer receives h, the weighted sum (input layer output) x (weight):

W = [[1, 2, 3, 0], [1, 2, 1, 2], [1, 1, 1, 1]],  x = [1, 0, 2, 3]ᵀ,  Wx = h = [7, 9, 5]ᵀ

[Figure: doc0 inputs jobs=1, mac=0, win8=2, ms=3; hidden outputs f(5)=0.99, f(7)=0.99, f(9)=0.99]
8. ### Example: hidden layer
The hidden layer's output passes through the activation function f(x).
Example activation: the sigmoid function
[Figure: sigmoid curve. Image by Chrislb, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=223990]
9. ### Example: output layer
The hidden-to-output weight matrix W' is a 2x3 matrix.
The output layer receives the weighted sum (hidden layer output) x (weight):

W'f(h) = u_o, with f(h) = [0.99, 0.99, 0.99]ᵀ and u_o = [-0.1, 0.1]ᵀ

[Figure: hidden outputs f(5)=f(7)=f(9)=0.99; output units -0.1 (mac) and 0.1 (win)]
10. ### Output-layer activation function
The output layer's activation function is softmax, which outputs probability values:

softmax(x)_i = e^{x_i} / Σ_n e^{x_n}

e^{0.1} / (e^{0.1} + e^{-0.1}) = 0.54,  e^{-0.1} / (e^{0.1} + e^{-0.1}) = 0.46

So doc0 (= [win8, win8, ms, ms, ms, jobs]) is a "win" document with probability 0.54.
[Figure: output units p(mac)=0.46, p(win)=0.54]
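A minimal sketch of the two-class softmax above; note the unrounded values are about 0.45 and 0.55, which the slide displays as 0.46 and 0.54:

```python
import math

def softmax(xs):
    """Exponentiate each score, then normalize into a probability distribution."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Output-layer scores from the slide: u_o = [-0.1, 0.1] for (mac, win)
p_mac, p_win = softmax([-0.1, 0.1])
print(round(p_mac, 2), round(p_win, 2))  # 0.45 0.55
```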
11. ### Training
• Using backpropagation, adjust the weights W and W' so that the probability of doc0 being "win" increases
• For doc0, the error is computed against the correct label [0, 1]
[Figure: doc0 network with outputs p(mac)=0.46, p(win)=0.54]
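For a softmax output trained with a cross-entropy loss, the error signal at the output scores is simply (prediction minus label); a minimal sketch with the slide's numbers:

```python
# Softmax + cross-entropy: the gradient w.r.t. the output scores is y - t
y = [0.46, 0.54]  # predicted probabilities (mac, win) from the slide
t = [0, 1]        # correct label for doc0: "win"

grad = [round(yi - ti, 2) for yi, ti in zip(y, t)]
print(grad)  # [0.46, -0.46]
```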

13. ### One-hot representation
• Represent a word as a vector whose dimensionality V is the vocabulary size
• The dimension corresponding to the word is 1; the rest are 0
Example: for the vocabulary {I, drink, coffee, everyday}:
I = [1, 0, 0, 0]
drink = [0, 1, 0, 0]
coffee = [0, 0, 1, 0]
everyday = [0, 0, 0, 1]
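A minimal sketch of building one-hot vectors for this 4-word vocabulary:

```python
vocab = ["I", "drink", "coffee", "everyday"]

def one_hot(word):
    """V-dimensional vector: 1 in the word's own dimension, 0 everywhere else."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec

print(one_hot("coffee"))  # [0, 0, 1, 0]
```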
14. ### Context window size
Around a focus word in a sentence, we consider the surrounding n words; this n is called the context window size.
Q. In "I drink coffee everyday" with coffee as the focus word, what is the Bag of Words appearing within a context window of size 2?
A. [I, drink, everyday]
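Extracting a context window can be sketched as follows (a minimal helper; `context_window` is a hypothetical name, not from the slides):

```python
def context_window(words, focus_index, n):
    """Return the words within n positions of the focus word (the focus itself excluded)."""
    left = words[max(0, focus_index - n):focus_index]
    right = words[focus_index + 1:focus_index + 1 + n]
    return left + right

sentence = ["I", "drink", "coffee", "everyday"]
print(context_window(sentence, sentence.index("coffee"), 2))  # ['I', 'drink', 'everyday']
```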

16. ### Continuous Bag of Words: input layer
The MLP's input layer corresponds to one input-layer box in the figure.
[Figure: CBoW architecture; each input box is one MLP-style input layer feeding a shared hidden layer and softmax output layer]
17. ### Continuous Bag of Words: input layer
• Each box receives a one-hot representation
• For "I drink coffee everyday" with w(t) = coffee, drink = [0, 1, 0, 0] is the value taken by the red part of the figure
18. ### Continuous Bag of Words: input layer
I = [1, 0, 0, 0]
drink = [0, 1, 0, 0]
everyday = [0, 0, 0, 1]
[Figure: the three context one-hot vectors are the inputs; coffee is the word being predicted]
19. ### Continuous Bag of Words: input-to-hidden weights
• Each arrow has a weight matrix W (N x V)
• This weight matrix is shared

W = [[1, 2, 3, 0], [1, 2, 1, 2], [1, 1, 1, 1]],  x = [0, 1, 0, 0]ᵀ,  Wx = u_{t-1} = [2, 2, 1]ᵀ
20. ### Continuous Bag of Words: input-to-hidden weights
• Each arrow has a weight matrix W (N x V)
• This weight matrix is shared
• Since the input is one-hot, the word's vector (a column of W) propagates to the hidden layer

W = [[1, 2, 3, 0], [1, 2, 1, 2], [1, 1, 1, 1]],  x = [0, 1, 0, 0]ᵀ,  Wx = u_{t-1} = [2, 2, 1]ᵀ
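The column-selection effect of a one-hot input can be checked with the slide's numbers:

```python
import numpy as np

# Weight matrix W (N=3, V=4) from the slide
W = np.array([[1, 2, 3, 0],
              [1, 2, 1, 2],
              [1, 1, 1, 1]])

x = np.array([0, 1, 0, 0])  # one-hot vector for "drink"

# Multiplying by a one-hot vector just selects the corresponding column of W,
# i.e. the word vector of "drink"
u = W @ x
print(u)                           # [2 2 1]
print(np.array_equal(u, W[:, 1]))  # True
```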
21. ### Continuous Bag of Words: hidden layer
• The average of the context word vectors is the hidden layer's value (an N-dimensional vector)
• No activation function

(u_{t-2} + u_{t-1} + u_{t+1}) / 3 = h
(1/3) ([1, 1, 1]ᵀ + [2, 2, 1]ᵀ + [0, 2, 1]ᵀ) = [1, 1.67, 0.33]ᵀ
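A minimal sketch of the averaging step, using hypothetical context word vectors (not the slide's values):

```python
import numpy as np

# Hypothetical context word vectors (N = 3); in CBoW these are columns of W
u_prev2 = np.array([1.0, 0.0, 2.0])
u_prev1 = np.array([2.0, 2.0, 1.0])
u_next1 = np.array([0.0, 1.0, 0.0])

# Hidden layer: plain average of the context word vectors, no activation function
h = (u_prev2 + u_prev1 + u_next1) / 3
print(h)  # [1. 1. 1.]
```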
22. ### Continuous Bag of Words: hidden-to-output weights
The product of the weight matrix W' (V x N) and the hidden layer's output (the average vector):

W'h = u_o, with h = [1.00, 1.67, 0.33]ᵀ and u_o = [4.01, 2.01, 5.00, 3.34]ᵀ
23. ### Continuous Bag of Words: output layer
We want to predict a single word
• Number of output-layer units = vocabulary size = V
• Activation function: softmax

softmax(u_o) = y
softmax([4.01, 2.01, 5.00, 3.34]ᵀ) = [0.23, 0.03, 0.62, 0.12]ᵀ
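The softmax over the output scores can be verified with the slide's numbers:

```python
import numpy as np

def softmax(x):
    """Exponentiate and normalize into a probability distribution."""
    e = np.exp(x)
    return e / e.sum()

u_o = np.array([4.01, 2.01, 5.00, 3.34])  # output scores from the slide
y = softmax(u_o)
print(np.round(y, 2))  # [0.23 0.03 0.62 0.12]
```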
24. ### Continuous Bag of Words: output layer
The word probability distribution obtained by feeding in I, drink, everyday:

[0.23, 0.03, 0.62, 0.12]ᵀ

0.62 is the probability of coffee.

26. ### Nice properties of word vectors
• Analogies:
  • king - man + woman = queen
  • Japan - Tokyo + Paris = France
  • eats - eat + run = runs
• Word features
• Initial values for deep learning
• Similarity computation
• nzw's first paper was on this
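A toy sketch of the analogy property, with hypothetical 2-dimensional vectors chosen so that the arithmetic works out (real word2vec vectors learn such structure from data):

```python
import numpy as np

# Hypothetical toy word vectors; NOT learned, just chosen to illustrate the idea
vecs = {
    "king":   np.array([1.0, 1.0]),
    "queen":  np.array([2.0, 1.0]),
    "man":    np.array([1.0, 0.0]),
    "woman":  np.array([2.0, 0.0]),
    "coffee": np.array([0.0, 1.0]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# king - man + woman ≈ ?  (nearest word by cosine, excluding the query words)
query = vecs["king"] - vecs["man"] + vecs["woman"]
best = max((w for w in vecs if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(query, vecs[w]))
print(best)  # queen
```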
27. ### References etc.
• gensim: https://radimrehurek.com/gensim/ (Python; convenient, with many utility functions)
• chainer: https://github.com/pfnet/chainer/tree/master/examples/word2vec (Python; an example neural-network implementation)
• word2vec: https://code.google.com/archive/p/word2vec/ (C; the original implementation)
• word2vec Parameter Learning Explained: http://arxiv.org/pdf/1411.2738v3.pdf (English; an easy-to-follow explanation)
• Efficient Estimation of Word Representations in Vector Space: http://arxiv.org/pdf/1301.3781.pdf (English; the original CBoW paper; the CBoW figures in these slides are from it)
• 深層学習 Deep Learning. 人工知能学会 (Japanese Society for Artificial Intelligence). (Japanese; a book)