Kento Nozawa
November 20, 2016

# Introduction to MLP, CBoW, and Skip Gram

For klis14

## Transcript

3. ### Contents
• Review of last time
• Multilayer perceptrons and Keras
• What is word2vec?
• Continuous Bag of Words
• Skip Gram
• How to use word2vec
• We use a library called gensim
• Use fasttext when you want to take subwords into account

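5. ### What we did last time
• Implemented a simple perceptron from scratch
• Weather forecast
• Input: 4-dimensional vector x
• Output: probability z that it is sunny
• Iris data
• Input: 4-dimensional vector
• Output: probability of each species

[Figure: simple perceptron with inputs x0 to x3, weights w0 to w3, bias b, activation f, and output z]

$$z = f(\mathbf{w}^{\top}\mathbf{x} + b) = \frac{1}{1 + e^{-(-0.2)}} = 0.45$$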
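The perceptron output above can be sketched in a few lines of plain Python. This is only an illustration: it assumes f is the sigmoid (which is what the value 0.45 suggests), and the input `x`, weights `w`, and bias `b` are hypothetical numbers chosen so that the pre-activation equals -0.2.

```python
import math


def sigmoid(a):
    """Logistic sigmoid: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))


def perceptron(x, w, b):
    """Weighted sum of the inputs plus a bias, passed through the sigmoid."""
    a = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(a)


# Hypothetical values (not from the slides), chosen so that w^T x + b = -0.2.
x = [1.0, 0.0, 1.0, 0.0]
w = [0.3, 0.5, -0.4, 0.1]
b = -0.1

z = perceptron(x, w, b)
print(round(z, 2))
```

With these values the pre-activation is 0.3 - 0.4 - 0.1 = -0.2, so the result rounds to the 0.45 shown on the slide.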
7. ### Keras and the simple perceptron
Define an empty neural network.
8. ### Keras and the simple perceptron
The input layer takes a 4-dimensional vector.
9. ### Keras and the simple perceptron
The intermediate layer is a fully connected layer, Dense. Its dimension is set here (the diagram shows a single output unit z).
10. ### Keras and the simple perceptron
The activation function of the intermediate layer is the sigmoid function.
11. ### Keras, the simple perceptron, and MNIST
• Classify MNIST with a simple perceptron
• Binary classification: is the digit 0 or 1?
• Changes from the Keras code written last time:
• The preprocessing part
• Input layer dimension: 4 → 784 (28x28)

15. ### Multiclass classification
• Output: a probability for each class
• For MNIST, the probabilities of the digits 0 to 9
• They sum to 1.0
• Predicted class: the class with the highest probability

[Figure: network with inputs x0 to x14 and outputs z0 to z9, each output labeled with a probability]
16. ### Multiclass classification: the MNIST case
[Figure: the same network, with callouts reading individual outputs as "the probability that the image is digit k"; the specific digits and values in the callouts are missing from the transcript]
17. ### Activation function for multiclass classification
• Softmax function
• Defined jointly over the multiple units of the output layer

Definition for the network in the figure (outputs z0 to z9):

$$\mathrm{softmax}(\mathbf{z})_k = \frac{e^{z_k}}{\sum_{l=0}^{9} e^{z_l}}$$
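As a small illustration of the softmax definition, the sketch below computes it for ten output units in plain Python. The `scores` are hypothetical values, and subtracting the maximum before exponentiating is a common numerical-stability trick that does not change the result; it is not something the slides mention.

```python
import math


def softmax(z):
    """Softmax over a list of scores; the outputs are positive and sum to 1."""
    m = max(z)  # shift by the maximum so exp() cannot overflow
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for the ten MNIST classes 0..9.
scores = [0.5, 2.0, -1.0, 0.0, 1.5, 0.2, -0.3, 3.0, 0.1, -2.0]
probs = softmax(scores)

# The predicted class is the index with the highest probability.
predicted = max(range(len(probs)), key=lambda k: probs[k])
print(predicted, round(sum(probs), 6))
```

Because softmax is monotonic in each score, the predicted class is always the index of the largest score.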
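18. ### Multiclass classification of MNIST with a perceptron
• Almost the same as the previous code
• Implementation notes
• Set the activation function to softmax
• Set the output layer to 10 dimensions
• Set loss to categorical_crossentropy
• In the preprocessing part, convert Y with to_categorical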
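To show what the last two notes amount to, here is a plain-Python sketch of one-hot labels and the categorical cross-entropy loss. The functions `to_one_hot` and `categorical_crossentropy` are illustrative stand-ins written for this example, not the Keras implementations, and the small `eps` is an assumption added to avoid taking log(0).

```python
import math


def to_one_hot(label, num_classes):
    """Turn an integer class label into a one-hot list."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def categorical_crossentropy(target, probs, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


# Hypothetical example: the true digit is 7, the model outputs a uniform guess.
target = to_one_hot(7, 10)
uniform = [0.1] * 10
loss = categorical_crossentropy(target, uniform)
print(target)
print(round(loss, 4))
```

A uniform guess over ten classes gives a loss of log(10), about 2.30; a confident correct prediction drives the loss toward 0.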

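22. ### Just do deep, Just stack more layers!
• Adding intermediate layers increases the number of weights (parameters)
• The network can then respond to finer patterns
• A CNN is also one kind of intermediate layer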
23. ### Just do deep, Just stack more layers!
• You can add as many intermediate layers as you like
• Input layer
• Intermediate layers (hidden layers)
• Output layer

[Figure: multilayer network with inputs x0 to x14, hidden units h0 to h19, and outputs u0 to u9 labeled with probabilities]

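25. ### Model definition: MNIST_MLP.py
• Dense accepts the activation function as a string
• Network structure
• Input layer: 784 dimensions
• Intermediate layer: 100 dimensions; activation function relu, tanh, sigmoid, etc.
• Output layer: 10 dimensions; activation function softmax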
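To get a feel for the size of the 784-100-10 network, the sketch below counts its trainable parameters. It assumes each fully connected layer has a weight matrix plus one bias per output unit; `dense_params` is a helper written for this example, not a Keras function.

```python
def dense_params(n_in, n_out):
    """Parameters of a fully connected layer: weights plus one bias per unit."""
    return n_in * n_out + n_out


# Layer sizes from the slide: input 784, intermediate 100, output 10.
layer_sizes = [784, 100, 10]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)
```

Under these assumptions the first layer contributes 78,500 parameters and the second 1,010, for 79,510 in total. Each extra intermediate layer adds another such term, which is the sense in which stacking layers increases the number of weights.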
30. ### Just do deep, Just stack more layers!
"I want to improve classification performance."
"Stack layers!!! Then it will go well."
"Like this?"
"But make sure the gradient does not vanish."
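31. ### The vanishing gradient problem
• Last time, we computed the derivatives with respect to the weights for gradient descent
• The closer a layer is to the input layer, the more its gradient tends to diverge or go to 0
• The weights are then not updated (training tends to fail)
• Gradient information from near the output layer enters the update rule for weights near the input layer
• Deep learning is what made this less likely to happen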
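One way to see why gradients shrink toward the input layer is that backpropagation multiplies one activation derivative per layer. The sketch below is a simplification that ignores the weights entirely: it uses only the fact that the sigmoid derivative is at most 0.25. With weights included, the same product can instead grow, which is the diverging case mentioned above.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def sigmoid_grad(a):
    """Derivative of the sigmoid; its maximum value, 0.25, is reached at a = 0."""
    s = sigmoid(a)
    return s * (1.0 - s)


def activation_factor_bound(depth):
    """Largest possible product of `depth` sigmoid derivatives (weights ignored)."""
    return sigmoid_grad(0.0) ** depth


for depth in (1, 5, 10):
    print(depth, activation_factor_bound(depth))
```

Even in this best case the factor falls below one millionth after ten sigmoid layers, so the weights closest to the input barely move.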

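33. ### What is word2vec?
• The name of an implementation of the methods proposed in [Tomas Mikolov+, 2013]
• Note that it is not the name of an algorithm
• There are actually two algorithms
• Popular in natural language processing and machine learning
• A way of representing a single word as one point, a low-dimensional dense vector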

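35. ### Bag-of-Words representation (BoW)
• A representation that groups multiple words together
• Word order is ignored
• A set that allows duplicate elements (a multiset)

BoW: {|情報, 学群, 知識, 情報, ・, 図書館, 学類|}
Set: {情報, 学群, 知識, ・, 図書館, 学類}

In these slides, a set is written {elem} and a BoW is written {|elem|}.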
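The difference between a BoW and a set can be checked directly in Python. This sketch uses `collections.Counter` as one possible multiset representation; the token list is the example from the slide.

```python
from collections import Counter

# Tokens from the slide's example; "情報" appears twice.
words = ["情報", "学群", "知識", "情報", "・", "図書館", "学類"]

bow = Counter(words)    # multiset: keeps how many times each word occurs
word_set = set(words)   # set: duplicates collapse into one element

print(bow["情報"], len(word_set), sum(bow.values()))
```

The BoW keeps all seven tokens with their counts, while the set keeps only the six distinct words. Neither keeps the word order.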
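37. ### Context words
• Context words: the words within n positions on either side of a single target word in a sentence
• Context window size: n

Q. In "I drink coffee everyday", what are the context words of coffee with context window size 1?
A. {|drink, everyday|}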
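The question above can be answered mechanically. The helper `context_words` below is a hypothetical function written for this example; it assumes the sentence is already split into tokens and that the window is simply cut off at the sentence boundaries.

```python
def context_words(tokens, index, n):
    """Words within n positions to the left and right of tokens[index]."""
    left = tokens[max(0, index - n):index]
    right = tokens[index + 1:index + 1 + n]
    return left + right


tokens = "I drink coffee everyday".split()
ctx = context_words(tokens, tokens.index("coffee"), 1)
print(ctx)
```

For coffee with window size 1 this yields drink and everyday, matching the answer on the slide. A word at the edge of the sentence, such as I, gets fewer than 2n context words.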
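38. ### one-hot representation (one-of-K representation)
• Represent a single word as a vector whose dimension is the vocabulary size V
• Only the corresponding dimension is 1; the rest are 0

Example: if the vocabulary size is 4 (= {I, drink, coffee, everyday}), then
I = [1, 0, 0, 0]
drink = [0, 1, 0, 0]
coffee = [0, 0, 1, 0]
everyday = [0, 0, 0, 1]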
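A minimal sketch of the one-hot encoding, using the four-word vocabulary from the example. The function name `one_hot` and the use of a Python list to fix the word order are choices made for this illustration.

```python
# The order of this list fixes which dimension belongs to which word.
vocab = ["I", "drink", "coffee", "everyday"]


def one_hot(word, vocab):
    """Vector of length V with a 1 at the word's position and 0 elsewhere."""
    return [1 if w == word else 0 for w in vocab]


for word in vocab:
    print(word, one_hot(word, vocab))
```

Each vector has length V = 4 and exactly one nonzero entry.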
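39. ### What about feeding words into a neural network?
• Example task: predict the next word in a sentence from the word just before it
• Vocabulary: {I, drink, coffee, everyday}

[Figure: a neural network whose input and output layers have one unit per vocabulary word (I, drink, coffee, everyday) and whose hidden layer has units h0 and h1; the outputs are labeled 0.19, 0.32, 0.45, 0.05]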
40. ### When the input is drink
• We want to predict the word that appears after the word "drink"
• Input: the one-hot representation of "drink"

drink = [0, 1, 0, 0]
41. ### When the input is drink
• In a one-hot representation, every unit except one is 0
• Only the weights on the red lines are used
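42. ### Continuous Bag of Words
• A 3-layer neural network
• Input: the one-hot representations of the 2n context words
• Output: word probabilities (V-dimensional)
• Label: the one-hot representation of a single word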
43. ### Continuous Bag of Words: input layer
The input layer of a multilayer perceptron corresponds to one box of the input layer in the figure.
There are 2n such boxes.
44. ### Continuous Bag of Words: input layer
• Each box receives a one-hot representation
• In "I drink coffee everyday" with w(t) = coffee, the red part takes the value drink = [0, 1, 0, 0]
45. ### Continuous Bag of Words: input layer
Inputs for the target word coffee:
I = [1, 0, 0, 0]
drink = [0, 1, 0, 0]
everyday = [0, 0, 0, 1]
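46. ### Continuous Bag of Words: input-to-hidden weights
• Each arrow has a weight matrix $W_{N \times V}$ (V: vocabulary size, N: dimension of the word vectors)
• This weight matrix is shared

$$W\mathbf{x} = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 1 & 2 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} = \mathbf{u}_{t-1}$$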
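47. ### Continuous Bag of Words: input-to-hidden weights
• Each arrow has a weight matrix $W_{N \times V}$
• This weight matrix is shared
• Because the input is one-hot, the word vector itself is passed on to the hidden layer

$$W\mathbf{x} = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 1 & 2 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} = \mathbf{u}_{t-1} \quad \text{(the word vector of drink)}$$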
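The point that a one-hot input simply picks out one column of W can be checked with a few lines of Python. The matrix entries below are the numbers as they appear in this transcript (any minus signs on the original slide may have been lost), and `matvec` is a helper written for this example.

```python
# W is N x V (3 x 4): one column per vocabulary word.
W = [
    [1, 2, 3, 0],
    [1, 2, 1, 2],
    [1, 1, 1, 1],
]
x_drink = [0, 1, 0, 0]  # one-hot representation of "drink"


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


u = matvec(W, x_drink)
second_column = [row[1] for row in W]
print(u, second_column)
```

The product equals the second column of W, so in practice the multiplication can be replaced by a table lookup of the word's vector.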
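48. ### Continuous Bag of Words: hidden layer
• The average of the word vectors is the input to the hidden layer (an N-dimensional vector)
• No activation function

$$\mathbf{h} = \frac{\mathbf{u}_{t-2} + \mathbf{u}_{t-1} + \mathbf{u}_{t+1}}{3} = \frac{1}{3}\left( \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ 1.67 \\ 0.33 \end{bmatrix}$$

(The third component, 0.33, requires one of the three third entries to be −1; which one carried the minus sign is not recoverable from this transcript.)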
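49. ### Continuous Bag of Words: hidden-to-output
The product of the weight matrix $W'_{V \times N}$ and the output of the hidden layer (the average vector):

$$W'\mathbf{h} = \begin{bmatrix} 1 & 2 & -1 \\ -1 & 2 & -1 \\ 1 & 2 & 2 \\ 0 & 2 & 0 \end{bmatrix} \begin{bmatrix} 1.00 \\ 1.67 \\ 0.33 \end{bmatrix} = \begin{bmatrix} 4.01 \\ 2.01 \\ 5.00 \\ 3.34 \end{bmatrix} = \mathbf{u}_o$$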
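50. ### Continuous Bag of Words: output layer
Predicting a single word
• Number of units in the output layer = vocabulary size = V
• Activation function: the softmax function

$$\mathrm{softmax}(\mathbf{u}_o) = \mathbf{y}$$

$$\mathrm{softmax}\left( \begin{bmatrix} 4.01 \\ 2.01 \\ 5.00 \\ 3.34 \end{bmatrix} \right) = \begin{bmatrix} 0.23 \\ 0.03 \\ 0.62 \\ 0.12 \end{bmatrix}$$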
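51. ### Continuous Bag of Words: output layer
The word probability distribution obtained by feeding in I, drink, and everyday:

$$\mathbf{y} = \begin{bmatrix} 0.23 \\ 0.03 \\ 0.62 \\ 0.12 \end{bmatrix}$$

• 0.23: the probability of I
• 0.62: the probability of coffee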

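53. ### Skip gram
• A 3-layer neural network
• Input: the one-hot representation of a single word
• Output: 2n sets of word probabilities (each V-dimensional)
• Labels: 2n one-hot representations, one word each
• Intuition: CBoW flipped symmetrically about the intermediate layer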
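One way to picture the Skip gram inputs and labels is as (input word, label word) pairs, with up to 2n labels per input. The function `skipgram_pairs` below is a hypothetical helper for this illustration; it assumes the window is cut off at sentence boundaries and does not model sampling or any other detail of the actual word2vec implementation.

```python
def skipgram_pairs(tokens, n):
    """List of (center word, context word) pairs for context window size n."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - n)
        stop = min(len(tokens), i + n + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


tokens = "I drink coffee everyday".split()
pairs = skipgram_pairs(tokens, 1)
for pair in pairs:
    print(pair)
```

For the input coffee with n = 1, the labels are drink and everyday: the same words that were the inputs in the CBoW example, with the roles reversed.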
54. ### Skip gram: input layer
The input layer of a multilayer perceptron corresponds to the single box of the input layer in the figure.

56. ### Skip gram: output layer
The output layer of a multilayer perceptron corresponds to one box of the output layer in the figure.
There are 2n softmax outputs.

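59. ### Collect your own data
• Build a dataset with one document per line and words separated by spaces
• English: usable almost as is
• Japanese: requires morphological analysis
• On the university-wide computer system, use mecab
• It may not even need to be natural language
• Sequences of tags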
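The data format described above (one document per line, words separated by spaces) can be turned into token lists with a few lines of Python. `tokenize_corpus` is a hypothetical helper and the sample lines are made up; for Japanese, the lines are assumed to have already been segmented by a morphological analyzer such as mecab.

```python
def tokenize_corpus(lines):
    """One document per line, tokens separated by whitespace; blank lines are skipped."""
    return [line.split() for line in lines if line.strip()]


# Hypothetical sample input; in practice these lines would be read from a file.
raw_lines = [
    "I drink coffee everyday",
    "",
    "I drink tea",
]
sentences = tokenize_corpus(raw_lines)
print(sentences)
```

A list of token lists like `sentences` is the kind of input that gensim's `Word2Vec` accepts as its training corpus.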

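61. ### Things you can do that were not covered
• Compute the similarity between documents (the wmdistance method)
• Load models trained with other distributed representations into gensim
• Initial values for Keras
• https://blog.keras.io/using-pre-trained-word-embeddings-in-a-keras-model.html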
62. ### References
• gensim: https://radimrehurek.com/gensim/ (Python implementation)
• word2vec: https://code.google.com/archive/p/word2vec/ (C, the original implementation)
• word2vec Parameter Learning Explained: http://arxiv.org/pdf/1411.2738v3.pdf (English; an easy-to-follow explanation)
• Efficient Estimation of Word Representations in Vector Space: http://arxiv.org/pdf/1301.3781.pdf (English; the original CBoW paper. The figures in these slides are taken from it)
• 深層学習 Deep Learning. 人工知能学会. (Japanese, book)
• ウェブデータの機械学習. 講談社. (Japanese, book)