Slide 1

Slide 1 text

Introduction to Continuous Bag of Words @ Machine Learning Study Group, Friday, April 22, 2016, M1 Nozawa

Slide 2

Slide 2 text

Today's topics
• Multilayer perceptron (MLP)
• Continuous Bag of Words
• One of the two models in word2vec
• Speed-up techniques and NG are not covered

Slide 3

Slide 3 text

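Review of the multilayer perceptron
• Circle: takes one number, applies a function, and outputs one number (each circle is a unit; the function is its activation function)
• Arrow: propagates the product of a unit's output and a weight (a number) to the next layer
The goal is to find weights that give the correct answer as often as possible
[Figure: MLP with input layer (x1, x2, x3, x4), hidden layer (h1, h2, h3), and output layer (softmax) with outputs 0.2, 0.5, 0.3]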

Slide 4

Slide 4 text

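A concrete MLP example
• Consider a world with only four words
• [jobs, mac, win8, ms]
• Input: a document
• Output: probabilities (whether the input document is about "mac" or "windows")
[Figure: MLP with input layer (jobs, mac, win8, ms), hidden layer (h1, h2, h3), and output layer (softmax) giving p(mac)=0.2, p(win)=0.8]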

Slide 5

Slide 5 text

Concrete example: input layer
The frequency of each word is the value fed to the corresponding input unit
• doc0: [win8, win8, ms, ms, ms, jobs] -> win
• doc1: [jobs, mac, mac, mac, mac, mac, mac] -> mac
[Figure: the MLP shown twice. doc0 inputs: jobs=1, mac=0, win8=2, ms=3. doc1 inputs: jobs=1, mac=6, win8=0, ms=0]
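As a minimal sketch of how the input values on this slide are obtained, the snippet below counts word frequencies in vocabulary order. The helper name `count_vector` is hypothetical; only the vocabulary and the two documents come from the slides.

```python
from collections import Counter

VOCAB = ["jobs", "mac", "win8", "ms"]  # vocabulary order used in the slides


def count_vector(doc, vocab=VOCAB):
    """Return the word-frequency vector of `doc` in vocabulary order."""
    counts = Counter(doc)
    return [counts[word] for word in vocab]


doc0 = ["win8", "win8", "ms", "ms", "ms", "jobs"]
doc1 = ["jobs", "mac", "mac", "mac", "mac", "mac", "mac"]

x0 = count_vector(doc0)
x1 = count_vector(doc1)
print(x0)  # [1, 0, 2, 3]
print(x1)  # [1, 6, 0, 0]
```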

Slide 6

Slide 6 text

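Concrete example: hidden layer
The input-to-hidden weight matrix W is a 3x4 matrix
The hidden layer receives h, the sum of (input-layer output) x (weight)
doc0:
$$
Wx = h, \qquad
\begin{bmatrix} 1 & 2 & 3 & 0 \\ 1 & 2 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ 0 \\ 2 \\ 3 \end{bmatrix}
=
\begin{bmatrix} 7 \\ 9 \\ 5 \end{bmatrix}
$$
(The third row as printed evaluates to 6; the slide and its figure show 5.)
[Figure: MLP for doc0 with inputs jobs=1, mac=0, win8=2, ms=3 and hidden outputs f(5)=0.99, f(7)=0.99, f(9)=0.99]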
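The product Wx can be checked with a few lines of plain Python. This is only a sketch: the helper `matvec` is hypothetical, and W and x are taken at face value from the slide. For these entries the last component comes out as 6.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Input-to-hidden weights W (3x4), entries as printed on the slide.
W = [
    [1, 2, 3, 0],
    [1, 2, 1, 2],
    [1, 1, 1, 1],
]
x = [1, 0, 2, 3]  # doc0 counts for [jobs, mac, win8, ms]

h = matvec(W, x)
print(h)  # [7, 9, 6]
```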

Slide 8

Slide 8 text

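Concrete example: hidden layer
The hidden layer emits its output through an activation function f(x)
Example function: the sigmoid function
[Figure: MLP for doc0 with inputs jobs=1, mac=0, win8=2, ms=3 and hidden outputs f(5)=0.99, f(7)=0.99, f(9)=0.99]
[Figure: sigmoid curve. By Chrislb - created by Chrislb, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=223990]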
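A small sketch of the sigmoid activation, applied to the three hidden-layer inputs shown in the figure. The standard logistic form 1 / (1 + e^(-x)) is assumed, since the slide only names the function. All three outputs land between 0.99 and 1, in line with the 0.99 labels in the figure.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


hidden_inputs = [5, 7, 9]  # values shown in the figure
hidden_outputs = [sigmoid(v) for v in hidden_inputs]

for v, out in zip(hidden_inputs, hidden_outputs):
    print(f"f({v}) = {out:.4f}")
```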

Slide 9

Slide 9 text

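Concrete example: output layer
The hidden-to-output weight matrix W' is a 2x3 matrix
The output layer receives the sum of (hidden-layer output) x (weight)
doc0:
$$
W' f(h) = u_o, \qquad f(h) = \begin{bmatrix} 0.99 \\ 0.99 \\ 0.99 \end{bmatrix}
$$
Each row of W' is printed as 1, 1, 1.01 and the product as 1.0, 1.0; the minus signs are not legible. The figure labels the two output units -0.1 and 0.1.
[Figure: MLP for doc0 with inputs jobs=1, mac=0, win8=2, ms=3, hidden outputs f(5)=0.99, f(7)=0.99, f(9)=0.99, and output-layer inputs -0.1, 0.1]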

Slide 10

Slide 10 text

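Activation function of the output layer
The output layer uses the softmax function, which outputs probabilities:
$$
\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_n e^{x_n}}
$$
$$
p(\mathrm{win}) = \frac{e^{0.1}}{e^{0.1} + e^{-0.1}} \approx 0.54, \qquad
p(\mathrm{mac}) = \frac{e^{-0.1}}{e^{0.1} + e^{-0.1}} \approx 0.46
$$
doc0 (= [win8, win8, ms, ms, ms, jobs]) is a win document with probability 0.54
[Figure: MLP for doc0 with output-layer inputs -0.1, 0.1 and outputs p(mac)=0.46, p(win)=0.54]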
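The softmax step can be sketched as follows, using the two output-layer inputs shown in the figure (-0.1 for mac, 0.1 for win). Evaluated exactly, the probabilities come out at roughly 0.45 and 0.55, close to the 0.46 and 0.54 quoted on the slide.

```python
import math


def softmax(scores):
    """Return exp(s_i) / sum_n exp(s_n) for each score s_i."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


u_o = [-0.1, 0.1]  # output-layer inputs for [mac, win], from the figure
p_mac, p_win = softmax(u_o)
print(f"p(mac) = {p_mac:.4f}, p(win) = {p_win:.4f}")
```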

Slide 11

Slide 11 text

Training
• Use backpropagation to adjust the weights W and W' so that the probability of doc0 being win increases
• For doc0, the error is measured against the correct label [0, 1]
[Figure: MLP for doc0 with output-layer inputs -0.1, 0.1 and outputs p(mac)=0.46, p(win)=0.54]
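To make the role of the label [0, 1] concrete, here is a rough sketch of the error signal that backpropagation starts from. It assumes a cross-entropy loss, which the slide does not name; with softmax plus cross-entropy, the gradient with respect to the output layer's inputs is simply p - y.

```python
import math

p = [0.46, 0.54]  # softmax output for [mac, win], as given on the slide
y = [0, 1]        # correct label for doc0 (win)

# Cross-entropy loss (an assumption; the slide does not specify the loss).
loss = -sum(t * math.log(q) for t, q in zip(y, p))

# With softmax + cross-entropy, the error at the output layer's inputs is p - y.
delta = [q - t for q, t in zip(p, y)]

print(f"loss = {loss:.3f}")
print("delta =", [round(d, 2) for d in delta])
```

A positive entry in `delta` pushes that class's score down and a negative entry pushes it up, so repeated updates raise p(win) for doc0.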

Slide 12

Slide 12 text

The CBoW algorithm
Should be easy once you understand the MLP...

Slide 13

Slide 13 text

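One-hot representation
• Represent a word as a vector whose dimension is the vocabulary size V
• Only the dimension corresponding to the word is 1; the rest are 0
Example: if the vocabulary is {I, drink, coffee, everyday}, then
I = [1, 0, 0, 0]
drink = [0, 1, 0, 0]
coffee = [0, 0, 1, 0]
everyday = [0, 0, 0, 1]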
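The encoding above can be written as a short function. This is a minimal sketch; the name `one_hot` and the choice to raise on unknown words are illustrative assumptions.

```python
VOCAB = ["I", "drink", "coffee", "everyday"]


def one_hot(word, vocab=VOCAB):
    """Return a V-dimensional vector with 1 at the word's index and 0 elsewhere."""
    index = vocab.index(word)  # raises ValueError for out-of-vocabulary words
    return [1 if i == index else 0 for i in range(len(vocab))]


for word in VOCAB:
    print(word, one_hot(word))
```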

Slide 14

Slide 14 text

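Context window size
For one target word in a sentence, we work with the n words surrounding it
This n is called the context window size
Q. In "I drink coffee everyday" with coffee as the target word, what is the Bag of Words appearing within a context window of size 2?
A. [I, drink, everyday]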
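The Q and A above can be reproduced with a small helper. This is a sketch under the assumption that the window extends n positions to each side of the target and excludes the target itself; the function name `context_words` is hypothetical.

```python
def context_words(tokens, target_index, window):
    """Return the words within `window` positions of the target, excluding it."""
    start = max(0, target_index - window)
    end = min(len(tokens), target_index + window + 1)
    return [tokens[i] for i in range(start, end) if i != target_index]


tokens = "I drink coffee everyday".split()
target_index = tokens.index("coffee")
context = context_words(tokens, target_index, window=2)
print(context)  # ['I', 'drink', 'everyday']
```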

Slide 15

Slide 15 text

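Continuous Bag of Words: overview
• A three-layer neural network
• Input: the words that co-occur within the context window
• Output: a probability distribution over a single word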

Slide 16

Slide 16 text

Continuous Bag of Words: input layer
The input layer of the MLP corresponds to one box of the input layer in the CBoW figure
[Figure: the MLP from the concrete example (inputs jobs=1, mac=0, win8=2, ms=3), labeled MLP]

Slide 17

Slide 17 text

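Continuous Bag of Words: input layer
• Each box receives a one-hot representation
• For "I drink coffee everyday" with w(t) = coffee, drink = [0, 1, 0, 0] is the value taken by the part highlighted in red
[Figure: CBoW diagram with output coffee]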

Slide 18

Slide 18 text

Continuous Bag of Words: input layer
I = [1, 0, 0, 0]
drink = [0, 1, 0, 0]
everyday = [0, 0, 0, 1]
[Figure: CBoW diagram with the three context words as inputs and output coffee]

Slide 20

Slide 20 text

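Continuous Bag of Words: input-to-hidden weights
• Each arrow carries a weight matrix $W_{N \times V}$
• This weight matrix is shared across all context words
• Because the input is one-hot, what propagates to the hidden layer is the word vector itself
$$
Wx = u_{t-1}, \qquad
\begin{bmatrix} 1 & 2 & 3 & 0 \\ 1 & 2 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
=
\begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix}
$$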
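The point that a one-hot input simply picks out a word vector can be sketched as below: multiplying W by the one-hot vector for drink gives the same result as reading off the second column of W. The helpers `matvec` and `one_hot` are hypothetical, and the entries of W are taken at face value from the slide.

```python
VOCAB = ["I", "drink", "coffee", "everyday"]

# Shared input-to-hidden weights W (N=3 rows, V=4 columns), as printed on the slide.
W = [
    [1, 2, 3, 0],
    [1, 2, 1, 2],
    [1, 1, 1, 1],
]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(word, vocab=VOCAB):
    """Return the one-hot vector of `word`."""
    return [1 if w == word else 0 for w in vocab]


u_prev = matvec(W, one_hot("drink"))               # full matrix-vector product
column = [row[VOCAB.index("drink")] for row in W]  # direct column lookup

print(u_prev)  # [2, 2, 1]
print(column)  # [2, 2, 1]
```

Because the two results agree, implementations typically skip the multiplication and index into the weight matrix directly.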

Slide 21

Slide 21 text

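Continuous Bag of Words: hidden layer
• The average of the word vectors is the input to the hidden layer (an N-dimensional vector)
• No activation function
$$
\frac{u_{t-2} + u_{t-1} + u_{t+1}}{3} = h
$$
$$
\frac{1}{3}\left(
\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} +
\begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} +
\begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix}
\right)
=
\begin{bmatrix} 1 \\ 1.67 \\ 0.33 \end{bmatrix}
$$
(The third components as printed sum to 3 rather than 1, so one of them presumably carries a minus sign that is not legible.)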

Slide 22

Slide 22 text

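Continuous Bag of Words: hidden-to-output layer
Take the product of the weight matrix $W'_{V \times N}$ and the hidden layer's output (the average vector)
$$
W'h = u_o, \qquad
\begin{bmatrix} 1 & 2 & -1 \\ -1 & 2 & -1 \\ 1 & 2 & 2 \\ 0 & 2 & 0 \end{bmatrix}
\begin{bmatrix} 1.00 \\ 1.67 \\ 0.33 \end{bmatrix}
=
\begin{bmatrix} 4.01 \\ 2.01 \\ 5.00 \\ 3.34 \end{bmatrix}
$$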
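A quick numeric check of this product, as a sketch. The slide gives only the magnitudes of the entries of W', so the signs used here are an assumption, chosen so that the product reproduces the slide's result (4.01, 2.01, 5.00, 3.34). The helper `matvec` is hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Hidden-to-output weights W' (V=4 rows, N=3 columns).
# Signs are an assumption chosen to match the slide's result.
W_out = [
    [1, 2, -1],
    [-1, 2, -1],
    [1, 2, 2],
    [0, 2, 0],
]
h = [1.00, 1.67, 0.33]  # average vector from the hidden layer, as given on the slide

u_o = matvec(W_out, h)
print([round(v, 2) for v in u_o])
```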

Slide 23

Slide 23 text

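Continuous Bag of Words: output layer
We want to predict a single word
• Number of output units = vocabulary size = V
• Activation function: softmax
$$
\mathrm{softmax}(u_o) = y, \qquad
\mathrm{softmax}\left(
\begin{bmatrix} 4.01 \\ 2.01 \\ 5.00 \\ 3.34 \end{bmatrix}
\right)
=
\begin{bmatrix} 0.23 \\ 0.03 \\ 0.62 \\ 0.12 \end{bmatrix}
$$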
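The numbers on this slide can be reproduced with the sketch below, which also reads off the most probable word. Subtracting the maximum score before exponentiating is a common numerical-stability measure and is an addition here, not something the slides discuss. The vocabulary order [I, drink, coffee, everyday] is taken from the one-hot slide.

```python
import math

VOCAB = ["I", "drink", "coffee", "everyday"]


def softmax(scores):
    """Softmax with the max score subtracted first for numerical stability."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


u_o = [4.01, 2.01, 5.00, 3.34]  # output-layer inputs from the slide
y = softmax(u_o)
predicted = VOCAB[y.index(max(y))]

print([round(p, 2) for p in y])  # [0.23, 0.03, 0.62, 0.12]
print(predicted)                 # coffee
```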

Slide 24

Slide 24 text

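Continuous Bag of Words: output layer
The probability distribution over words obtained by feeding in I, drink, everyday:
$$
y = \begin{bmatrix} 0.23 \\ 0.03 \\ 0.62 \\ 0.12 \end{bmatrix}
$$
The third entry, 0.62, is the probability of coffee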

Slide 25

Slide 25 text

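Word vectors obtained by training
• The weight matrix between the input layer and the hidden layer is the set of word vectors
• Each word is a dense vector of around 100 or 200 dimensions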

Slide 26

Slide 26 text

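Nice properties of word vectors
• Analogy
  • king - man + woman = queen
  • Japan - Tokyo + Paris = France
  • eats - eat + run = runs
• Features for words
• Initial values for deep learning
• Similarity computation
• nzw's first paper was on this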
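The analogy arithmetic can be illustrated with a toy example. The 2-d vectors below are hand-made and hypothetical, not vectors learned by word2vec; they are arranged so that one axis loosely encodes "royalty" and the other "gender". Nearest-neighbour search by cosine similarity, excluding the query words, is a common way to evaluate analogies and is assumed here.

```python
import math

# Hand-made 2-d toy vectors (hypothetical; real word vectors are learned).
VECTORS = {
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "coffee": [0.0, 2.0],
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors=VECTORS):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))


answer = analogy("king", "man", "woman")
print(answer)  # queen
```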

Slide 27

Slide 27 text

References
• gensim: https://radimrehurek.com/gensim/
  • Python; convenient, with many functions
• chainer: https://github.com/pfnet/chainer/tree/master/examples/word2vec
  • Python; an example implementation as a neural network
• word2vec: https://code.google.com/archive/p/word2vec/
  • C; the original implementation
• word2vec Parameter Learning Explained: http://arxiv.org/pdf/1411.2738v3.pdf
  • English; an easy-to-follow explanation
• Efficient Estimation of Word Representations in Vector Space: http://arxiv.org/pdf/1301.3781.pdf
  • English; the original CBoW paper. The CBoW figure in these slides is taken from it
• 深層学習 Deep Learning. 人工知能学会 (Japanese Society for Artificial Intelligence).
  • Japanese; book