Slide 1

Slide 1 text

"ベイズ推論による機械学習入門" (Introduction to Machine Learning by Bayesian Inference) reading group #4: Sections 5.4 - 5.7

Slide 2

Slide 2 text

Self-introduction
• 杉山 阿聖 (Asei Sugiyama)
• Software Engineer (assorted Machine Learning work)
• Co-author of 機械学習図鑑

Slide 3

Slide 3 text

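What we learn here
• Revisit algorithms commonly used in machine learning from the viewpoint of Bayesian inference (most of the algorithms touched on here are implemented in, for example, scikit-learn)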

Slide 4

Slide 4 text

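Approach
• Review the problem setting
• Review the graphical model
• Do not chase down the equations
1. The book writes them out fairly carefully
2. Chasing what the book does not describe would keep us from focusing on the points above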

Slide 5

Slide 5 text

Contents
1. Topic model (5.4)
2. Tensor factorization (5.5)
3. Logistic regression (5.6)
4. Neural network (5.7)

Slide 6

Slide 6 text

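Point
Content (section): Point
• Topic model (5.4): a graphical model for natural language processing
• Tensor factorization (5.5): a graphical model of collaborative filtering for time-series data
• Logistic regression (5.6): reparameterization
• Neural network (5.7): backpropagation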

Slide 7

Slide 7 text

Topic model (5.4)

Slide 8

Slide 8 text

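Topic model (5.4): summary
1. We deal with LDA (Latent Dirichlet Allocation)
2. LDA does the following two things at the same time
• Extracting topics from documents
• Learning a generative model of the words that appear in each topic
3. Learning is possible with variational inference or collapsed Gibbs sampling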

Slide 9

Slide 9 text

Topic model: outline
1. Background and problem setting
2. Model definition
3. Learning

Slide 10

Slide 10 text

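Data handled by the topic model
• Analysis of documents written in natural language
• A document is treated as a collection of words; the order in which words appear is ignored
• As with news articles, documents have topics (politics, entertainment, etc.), and we assume that the words that appear differ from topic to topic
• Example: "extraordinary Diet session" appears readily under the "politics" topic and rarely under the "entertainment" topic
• We assume we do not know concretely what the topics are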
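
The "order is ignored" assumption can be illustrated with a minimal sketch. The function name `bag_of_words` and the two example sentences are made up for illustration and do not come from the slides or the book.

```python
from collections import Counter


def bag_of_words(text):
    """Turn a whitespace-separated document into word counts; word order is discarded."""
    return Counter(text.lower().split())


# Two hypothetical documents with the same words in a different order.
doc_a = "the diet opened an extraordinary session"
doc_b = "session extraordinary an opened diet the"

# Under a bag-of-words representation the two documents are indistinguishable.
same_representation = bag_of_words(doc_a) == bag_of_words(doc_b)
```

A topic model of this kind only ever sees such counts, which is why it cannot tell the two sentences apart.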

Slide 11

Slide 11 text

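What we want to achieve with the topic model
1. Extract the topics a document belongs to from the data itself, without the analyst specifying them explicitly
• The analyst decides what each extracted topic means
• A document belongs to several topics at once, for example "politics 0.9, entertainment 0.1"
2. Take context into account when deciding which topic a word belongs to
• "Drive" can come from a car topic or from a technology topic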

Slide 12

Slide 12 text

Plug
This is covered in 機械学習図鑑, so check the details there

Slide 13

Slide 13 text

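Aside
• Paper [1] describes an application in population genetics
1. Identify the populations of interest and study the relationships among them
2. Study common ancestors from DNA sampled from each population
• The motivation is to extract "hidden" structure from data without relying on the analyst's subjective judgment

[1] Pritchard, J. K.; Stephens, M.; Donnelly, P. (June 2000). "Inference of population structure using multilocus genotype data". Genetics. 155 (2): pp. 945–959. ISSN 0016-6731.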

Slide 14

Slide 14 text

Topic model: outline
1. Background and problem setting
2. Model definition
3. Learning

Slide 15

Slide 15 text

Setting up notation

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

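The case of 1 topic
• Start with a trivial example to check the notation
• With 1 topic, every document belongs to that single topic
• In other words, we are handling a generative model of Japanese text in general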

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

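The case of 2 topics
• We handle something like spam-mail classification
• More precisely:
1. We somehow know that the mails can probably be split into 2 kinds
2. We do not know how the result will split
3. We want to cluster under these conditions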

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

Topic model: outline
1. Background and problem setting
2. Model definition
3. Learning

Slide 22

Slide 22 text

Learning the topic model
• Variational inference is skipped
• The book describes it fairly carefully
• We cover collapsed Gibbs sampling

Slide 23

Slide 23 text

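Gibbs sampling (4.2)
To obtain samples from a probability distribution over several variables, sample each variable in turn from its conditional distribution given the others. This yields a sequence of samples that approximately follows the original distribution.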
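
The formula on this slide did not survive extraction, so here is a minimal sketch of the idea on a different, simple target: a bivariate standard normal with correlation `rho`. This target is an assumption chosen for illustration because its conditionals are known in closed form (`x | y ~ N(rho * y, 1 - rho**2)`); it is not the example used in the book.

```python
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Each sweep resamples x given y, then y given the new x.
    """
    rng = random.Random(seed)
    cond_std = (1.0 - rho * rho) ** 0.5  # std of either conditional
    x, y = 0.0, 0.0
    samples = []
    for i in range(burn_in + n_samples):
        x = rng.gauss(rho * y, cond_std)  # x ~ p(x | y)
        y = rng.gauss(rho * x, cond_std)  # y ~ p(y | x)
        if i >= burn_in:
            samples.append((x, y))
    return samples


samples = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
n = len(samples)
mean_x = sum(x for x, _ in samples) / n
mean_y = sum(y for _, y in samples) / n
var_x = sum((x - mean_x) ** 2 for x, _ in samples) / n
var_y = sum((y - mean_y) ** 2 for _, y in samples) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples) / n
corr_estimate = cov_xy / (var_x * var_y) ** 0.5
```

The empirical correlation of the chain should land close to `rho`, which is one way to see that sampling only from conditionals recovers the joint distribution.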

Slide 24

Slide 24 text

Collapsed Gibbs sampling (4.2)
To obtain samples from a probability distribution, first marginalize out some of the variables, then sample from what remains.

Slide 25

Slide 25 text

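Application to the topic model
1. Use collapsed Gibbs sampling to sample the latent topic assignments and obtain realized values
2. Use the realized values to split the words by topic, and obtain the set of words that came from each topic
3. From these word sets, update the word distribution of each topic
4. From the assignments, compute the topic proportions of each document and update them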
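
The four steps above can be sketched as follows. This is the widely used count-based collapsed Gibbs sampler for LDA, written under my own assumptions: symmetric Dirichlet hyperparameters `alpha` and `beta`, documents given as lists of integer word ids, and all function and variable names are hypothetical. It is not claimed to match the book's derivation line by line.

```python
import random


def collapsed_gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.1,
                        n_iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA on docs given as lists of word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # words per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # words per (topic, word)
    topic_total = [0] * n_topics                              # words per topic
    z = []  # topic assignment of every word occurrence

    # Random initial assignments.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        z.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional of this assignment with the parameters marginalized out.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + vocab_size * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Point estimates of topic proportions and per-topic word distributions.
    theta = [[(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
             for counts, doc in zip(doc_topic, docs)]
    phi = [[(c + beta) / (topic_total[k] + vocab_size * beta)
            for c in topic_word[k]] for k in range(n_topics)]
    return z, theta, phi


# Toy corpus: word ids 0-2 and 3-5 never co-occur in a document.
docs = [[0, 1, 2, 0, 1], [2, 1, 0, 0], [3, 4, 5, 3], [5, 4, 3, 4, 5]]
z, theta, phi = collapsed_gibbs_lda(docs, n_topics=2, vocab_size=6)
```

Here `theta[d]` plays the role of the topic proportions of document `d` and `phi[k]` the word distribution of topic `k`; both are computed from the sampled assignments, mirroring steps 3 and 4.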

Slide 26

Slide 26 text

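Graphical model after marginalization
• Marginalizing out variables produces a complete graph (figure on the right)
• In the topic model as well, marginalizing out produces a complete graph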

Slide 27

Slide 27 text

Markov blanket (1.5)
• The nodes inside the blanket depend on one another
• Every node beyond them is independent of the node of interest

Slide 28

Slide 28 text

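Checking conditional independence
• The first pair of variable groups is conditionally independent
• Same argument as Figure 4.8
• The second pair is conditionally independent
• Because they are not co-parents
• Using the above, the target distribution can be factorized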

Slide 29

Slide 29 text

The rest is omitted

Slide 30

Slide 30 text

Tensor factorization (5.5)

Slide 31

Slide 31 text

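Tensor factorization: summary
1. We deal with collaborative filtering that takes trends into account
2. Collaborative filtering produces ratings that take both users and items into account
3. Learning is possible with variational inference (Gibbs sampling is not mentioned)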

Slide 32

Slide 32 text

Tensor factorization: outline
1. Background and problem setting
2. Model definition
3. Learning

Slide 33

Slide 33 text

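Problem setting: demand forecasting for a shoe company [2] (1/2)
• We want to forecast this season's demand from the orders at the start of the season and the past order history
• Autoregressive moving-average models (ARMA) cannot be used, because there is no past data for new items
• Regression models cannot be used, because the domain knowledge is too complex and it goes against policy (?)

[2] L. Xiong, X. Chen, T. K. Huang, J. Schneider, and J. G. Carbonell. Temporal collaborative filtering with Bayesian probabilistic tensor factorization. In Proceedings of the 2010 SIAM International Conference on Data Mining, pages 211-222, 2010.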

Slide 34

Slide 34 text

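Problem setting: demand forecasting for a shoe company (2/2)
• Typical collaborative filtering is not enough, because it cannot account for trends in design and consumer taste
• The authors therefore propose BPTF (Bayesian Probabilistic Tensor Factorization), which takes the time series into account
• Evaluated on Netflix viewing-history data [3], it clearly improved on models that ignore the time series

[3] Netflix Prize data https://www.kaggle.com/netflix-inc/netflix-prize-data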

Slide 35

Slide 35 text

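Aside
• Collaborative filtering can be read as embedding items and users into low-dimensional vectors
• Deep learning people might say: if you want an embedding, why not just use an auto encoder or word2vec? (maybe)
• In fact, the book's author has written a blog post [4] on the relation to VAE

[4] 線形回帰を1つ1つ改造して変分オートエンコーダ(VAE)を作る (Building a variational autoencoder (VAE) by modifying linear regression one step at a time) - 作って遊ぶ機械学習。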

Slide 36

Slide 36 text

Tensor factorization: outline
1. Background and problem setting
2. Model definition
3. Learning

Slide 37

Slide 37 text

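Collaborative filtering (1/2)
• We work with a matrix that lists items horizontally and users vertically
• Each entry is the rating a given user assigns to a given item
• In machine learning, people often think of these objects as follows (citation needed)
• Vector: a 1-dimensional array
• Matrix: a table
• Tensor: several tables (a mini-batch, for example)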

Slide 38

Slide 38 text

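Collaborative filtering (2/2)
• Factor the rating matrix into a product of a user factor matrix and an item factor matrix
• Written element by element, each rating is an inner product of a user vector and an item vector
• This is the same as missing-value imputation by dimensionality reduction (5.1)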
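
The factorization can be illustrated with a small, non-Bayesian sketch. It swaps in plain stochastic gradient descent with L2 regularization instead of the book's Bayesian treatment, and every name and number here (`factorize`, `ratings`, the learning rate, the toy ratings) is an assumption made for illustration. Only observed entries are used for fitting, so unobserved entries can afterwards be filled in by the inner product.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def factorize(ratings, n_users, n_items, dim=2, lr=0.02, reg=0.01,
              n_epochs=1000, seed=0):
    """Fit user and item vectors so that dot(U[n], V[m]) approximates ratings[(n, m)]."""
    rng = random.Random(seed)
    U = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    V = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_items)]
    for _ in range(n_epochs):
        for (n, m), r in ratings.items():
            err = r - dot(U[n], V[m])
            for d in range(dim):
                u, v = U[n][d], V[m][d]  # keep old values for a simultaneous update
                U[n][d] += lr * (err * v - reg * u)
                V[m][d] += lr * (err * u - reg * v)
    return U, V


def mse(U, V, ratings):
    """Mean squared error over the observed ratings only."""
    return sum((r - dot(U[n], V[m])) ** 2
               for (n, m), r in ratings.items()) / len(ratings)


# Hypothetical observed ratings, keyed by (user, item); other entries are missing.
ratings = {(0, 0): 5, (0, 1): 3, (1, 0): 4, (1, 2): 1, (2, 1): 1, (2, 2): 5}
U, V = factorize(ratings, n_users=3, n_items=3)
final_mse = mse(U, V, ratings)
filled_in = dot(U[0], V[2])  # prediction for an unobserved (user, item) pair
```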

Slide 39

Slide 39 text

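Extension to time series
• Introduce something like the "popularity" (sic) of each feature at a given time
• The inner product of user and item is computed with weights given by this popularity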
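
As a minimal sketch of the popularity-weighted inner product: the symbols `u`, `v`, and `s` for the user vector, item vector, and per-time popularity vector are my own labels, since the slide's symbols were lost in extraction.

```python
def predict_rating(u, v, s):
    """Popularity-weighted inner product: sum over d of u[d] * v[d] * s[d]."""
    return sum(ud * vd * sd for ud, vd, sd in zip(u, v, s))


u = [1.0, 2.0]  # hypothetical user vector
v = [3.0, 4.0]  # hypothetical item vector

plain = predict_rating(u, v, [1.0, 1.0])          # all features equally "in fashion"
first_only = predict_rating(u, v, [1.0, 0.0])     # only the first feature matters now
half_weight = predict_rating(u, v, [0.5, 0.5])    # both features at half weight
```

With every weight equal to 1 the expression reduces to the ordinary user-item inner product, so the time-independent model is a special case.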

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

Tensor factorization: outline
1. Background and problem setting
2. Model definition
3. Learning

Slide 42

Slide 42 text

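Learning (omitted)
• Possible with variational inference (details omitted)
• Note the following results
• Some parameters are learned as a separate value for each user
• Others are learned as a value shared across users
• The same holds for the remaining factors
• The precision matrices become "large"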

Slide 43

Slide 43 text

Logistic regression (5.6)

Slide 44

Slide 44 text

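Logistic regression: summary
1. Logistic regression uses the Softmax function, so an analytic solution cannot be obtained as in the earlier sections
2. To find the optimal parameters, gradient methods are newly introduced
3. To compute the gradient approximately, the reparameterization trick is used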

Slide 45

Slide 45 text

Logistic regression: outline
1. Background and problem setting
2. Model definition
3. Learning

Slide 46

Slide 46 text

Logistic regression (non-Bayesian)
This is covered in 機械学習図鑑, so check there

Slide 47

Slide 47 text

What do we gain by going Bayesian?
• Figure 5.15
• Building many logistic regression models and ensembling them gives a well-behaved output

Slide 48

Slide 48 text

Logistic regression: outline
1. Background and problem setting
2. Model definition
3. Learning

Slide 49

Slide 49 text

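Logistic regression
Assume that the multidimensional output vector is generated from a categorical distribution whose parameter is a nonlinear function of the input. The nonlinear function used here is the Softmax function.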

Slide 50

Slide 50 text

Aside
Derivative of the softmax function
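
The slide's derivation was an image and is not reproduced here. As a standard result, the partial derivative of the k-th softmax output with respect to the j-th input is `s[k] * (delta_kj - s[j])`, where `s` is the softmax output. The sketch below checks that formula numerically; the function names and the test input are assumptions for illustration.

```python
import math


def softmax(a):
    """Numerically stable softmax of a list of floats."""
    m = max(a)
    exps = [math.exp(x - m) for x in a]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_jacobian(a):
    """Analytic Jacobian: J[k][j] = s[k] * (1 if k == j else 0) - s[k] * s[j]."""
    s = softmax(a)
    size = len(a)
    return [[s[k] * ((1.0 if k == j else 0.0) - s[j]) for j in range(size)]
            for k in range(size)]


def numeric_jacobian(a, h=1e-6):
    """Central finite-difference approximation of the same Jacobian."""
    size = len(a)
    jac = [[0.0] * size for _ in range(size)]
    for j in range(size):
        up, down = list(a), list(a)
        up[j] += h
        down[j] -= h
        s_up, s_down = softmax(up), softmax(down)
        for k in range(size):
            jac[k][j] = (s_up[k] - s_down[k]) / (2.0 * h)
    return jac


a = [0.5, -1.0, 2.0]
analytic = softmax_jacobian(a)
numeric = numeric_jacobian(a)
max_abs_diff = max(abs(analytic[k][j] - numeric[k][j])
                   for k in range(3) for j in range(3))
```

Each column of the Jacobian sums to zero because the softmax outputs always sum to 1.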

Slide 51

Slide 51 text

No content

Slide 52

Slide 52 text

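Logistic regression: learning
• Compute the posterior distribution of the parameters from a dataset of input and output values

Logistic regression: inference
• Compute the output value when new input data is given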

Slide 53

Slide 53 text

Logistic regression: outline
1. Background and problem setting
2. Model definition
3. Learning

Slide 54

Slide 54 text

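What I explain next
• We want to do variational inference, but a simple mean-field approximation is not feasible
• A plain Monte Carlo estimate is possible, but it is known not to work well
• A gradient method with the reparameterization trick sidesteps that problem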

Slide 55

Slide 55 text

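Variational inference
• Choose an approximate posterior distribution with adjustable parameters
• Minimize the KL divergence between it and the true posterior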

Slide 56

Slide 56 text

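Computing the term with no analytic solution
• The last term, which takes the expectation of the likelihood, is hard to compute
• It can be computed with the Monte Carlo method
• However, once we sample, the parameters we want to optimize no longer appear in the expression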

Slide 57

Slide 57 text

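Reparameterization trick
• Suppose we draw a sample and obtain a concrete value
• If we regard that value as sampled directly from the approximate posterior, the parameters we want to optimize vanish from the expression and the gradient method cannot be run
• Instead, regard it as the realized value of a function of the parameters and a noise variable (that is, think of the noise as the thing that was sampled)
• A single sample is then enough to compute the gradient and update the weights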
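
A minimal sketch of the trick for a one-dimensional Gaussian follows. It assumes the usual form `w = mu + sigma * eps` with `eps ~ N(0, 1)`, and uses a toy objective `f(w) = w**2` chosen by me because the exact answer is known: `E[w**2] = mu**2 + sigma**2`, so the gradients are `2 * mu` and `2 * sigma`. None of this is the book's logistic regression objective.

```python
import random


def reparam_gradients(mu, sigma, n_samples=1, seed=0):
    """Estimate d/dmu and d/dsigma of E[f(w)] for w ~ N(mu, sigma**2), f(w) = w**2.

    The sample is written as w = mu + sigma * eps with eps ~ N(0, 1), so w stays
    a differentiable function of mu and sigma even after eps has been drawn.
    """
    rng = random.Random(seed)
    grad_mu, grad_sigma = 0.0, 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)     # noise that does not depend on mu or sigma
        w = mu + sigma * eps          # reparameterized sample
        df_dw = 2.0 * w               # derivative of f(w) = w**2
        grad_mu += df_dw * 1.0        # chain rule: dw/dmu = 1
        grad_sigma += df_dw * eps     # chain rule: dw/dsigma = eps
    return grad_mu / n_samples, grad_sigma / n_samples


# Many samples are averaged here only to compare against the closed form;
# a single sample already gives an unbiased (but noisy) gradient.
grad_mu, grad_sigma = reparam_gradients(mu=1.5, sigma=0.5, n_samples=100000)
single_mu, single_sigma = reparam_gradients(mu=1.5, sigma=0.5, n_samples=1)
```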

Slide 58

Slide 58 text

No content

Slide 59

Slide 59 text

On the text from Eq. 5.236 onward
• Now that the gradient can be computed, the parameters can be updated
• After that, keep repeating sampling and parameter updates

Slide 60

Slide 60 text

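Inference
• With the optimal parameters found, the predictive distribution can be approximated
• The expected value of the label can be approximated in the same way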
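
One common way to approximate the expected label is to average the softmax output over weights sampled from the fitted approximate posterior. The sketch below assumes an independent Gaussian per weight with mean `mu[k][d]` and standard deviation `sigma[k][d]`; these names, shapes, and numbers are illustrative assumptions, not the book's notation.

```python
import math
import random


def softmax(a):
    """Numerically stable softmax of a list of floats."""
    m = max(a)
    exps = [math.exp(x - m) for x in a]
    total = sum(exps)
    return [e / total for e in exps]


def predictive_probs(x, mu, sigma, n_samples=1000, seed=0):
    """Average class probabilities over weights drawn from a Gaussian approximation.

    mu and sigma are K x D nested lists: one row of weights per class.
    """
    rng = random.Random(seed)
    n_classes, n_features = len(mu), len(x)
    avg = [0.0] * n_classes
    for _ in range(n_samples):
        logits = [
            sum(rng.gauss(mu[k][d], sigma[k][d]) * x[d] for d in range(n_features))
            for k in range(n_classes)
        ]
        probs = softmax(logits)
        avg = [a + p for a, p in zip(avg, probs)]
    return [a / n_samples for a in avg]


x = [1.0, -0.5]                                   # hypothetical new input
mu = [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.5]]        # hypothetical posterior means
sigma = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]      # hypothetical posterior stds
zero_sigma = [[0.0, 0.0] for _ in range(3)]

bayes_probs = predictive_probs(x, mu, sigma)
point_probs = predictive_probs(x, mu, zero_sigma, n_samples=10)
plug_in = softmax([sum(m_kd * x_d for m_kd, x_d in zip(row, x)) for row in mu])
```

With zero posterior spread the average collapses to the ordinary plug-in softmax, which makes the connection to the non-Bayesian model explicit; the averaging is also what the earlier "ensemble" remark refers to.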

Slide 61

Slide 61 text

Neural network (5.7)

Slide 62

Slide 62 text

Omitted

Slide 63

Slide 63 text

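Why this is omitted
1. The problem setting is almost the same as for logistic regression
2. The model just stacks logistic regressions
3. Nobody would understand the backpropagation introduced in the learning part from this much explanation alone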

Slide 64

Slide 64 text

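Aside
• Initializing neural network weights with small random numbers and training them by gradient descent is common practice
• Having come this far, "multiply by many weights and add up" starts to look like taking an expectation
• Using the weight values updated by the gradient method starts to look like sampling from the posterior
• In fact, deep neural networks are known to be viewable as Gaussian processes (in the infinite-width limit)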

Slide 65

Slide 65 text

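Recap
Content (section): Point
• Topic model (5.4): a graphical model for natural language processing
• Tensor factorization (5.5): a graphical model of collaborative filtering for time-series data
• Logistic regression (5.6): reparameterization
• Neural network (5.7): backpropagation (omitted)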