Deep Learning from Scratch 2 (ゼロから作るDeep Learning 2), Chapter 3: word2vec
3.1-3.2

Slides for session 5 of the reading group on Deep Learning from Scratch 2: Natural Language Processing (ゼロから作るDeep Learning 2 自然言語編).
https://retrieva.connpass.com/event/131746/

ota42y

May 29, 2019

Transcript

  1. Deep Learning from Scratch 2
    Chapter 3: word2vec

    3.1-3.2
    ota42y
    Deep Learning from Scratch 2: Natural Language Processing, reading group, session 5

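  2. What this chapter covers
    • Implement word2vec
    • An inference-based way to represent words as vectors
    • A simple implementation with a lot of wasted computation
    • Speed is addressed in the next chapter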

  3. 3.1 Inference-based methods and neural networks

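  4. Inference-based vectorization
    • Two approaches to turning words into vectors
    • Count-based (Chapter 2)
    • Inference-based (Chapter 3)
    • Both rest on the distributional hypothesis, but their approaches are completely different
    • Distributional hypothesis: the meaning of a word is formed by the words around it (p.67)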

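  5. 3.1.1 Problems with count-based methods
    • Count-based methods compute the frequencies of surrounding words
    • A vocabulary of n words requires a huge n × n co-occurrence matrix
    • The SVD used for dimensionality reduction costs O(n^3), which is prohibitive at that scale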
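
To get a feel for these numbers, here is a rough back-of-the-envelope sketch. The vocabulary size of one million and the 4 bytes per matrix entry are assumptions chosen for illustration, and `cooccurrence_bytes` and `svd_cost` are hypothetical helper names, not functions from the book.

```python
def cooccurrence_bytes(vocab_size, bytes_per_entry=4):
    """Memory for a dense vocab_size x vocab_size matrix (4 bytes per entry is an assumption)."""
    return vocab_size * vocab_size * bytes_per_entry


def svd_cost(vocab_size):
    """Order-of-magnitude operation count for a full SVD, taken simply as n^3."""
    return vocab_size ** 3


n = 1_000_000  # hypothetical vocabulary size
print(cooccurrence_bytes(n) / 1e12, "TB")   # 4.0 TB
print(f"{svd_cost(n):.0e} operations")      # 1e+18 operations
```

Even before any SVD is run, a dense matrix at this scale would not fit in the memory of a typical machine.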

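  6. Advantages of inference-based methods
    • Count-based methods use the statistics of the whole corpus in a single pass
    • Inference-based methods (neural networks) learn from a portion of the corpus at a time
    • Parallel computation on GPUs also helps
    • The data can be split into small batches and processed quickly in parallel, so even huge datasets are manageable
    • There are other attractive properties as well (apparently; see 3.5.3 for details)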
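
Learning from a portion of the corpus at a time usually means mini-batch training. The following is a minimal sketch of splitting samples into shuffled mini-batches; `iterate_minibatches` is a hypothetical helper written for this note, and the integer "samples" stand in for real training examples.

```python
import random


def iterate_minibatches(samples, batch_size, seed=0):
    """Yield shuffled mini-batches that together cover every sample exactly once."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [samples[i] for i in indices[start:start + batch_size]]


corpus_samples = list(range(10))  # placeholder for (context, target) examples
batches = list(iterate_minibatches(corpus_samples, batch_size=4))
print([len(b) for b in batches])  # [4, 4, 2]
```

Each batch can be processed independently of the corpus size, which is what makes the approach workable for large data.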

  7. 3.1.2 Overview of inference-based methods

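  8. "Inferring" a word from the surrounding words
    • Infer what goes in `?` from the words before and after it
    • Infer the target from the context
    • Context: the surrounding words (you, goodbye)
    • Target: the word in question (`?`)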
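
The context/target split can be sketched as follows. The example sentence "you say goodbye and i say hello ." and the window size of 1 are assumptions made for illustration, and `context_target_pairs` is a hypothetical helper, not a function quoted from the book.

```python
def context_target_pairs(words, window_size=1):
    """Pair each word (the target) with the words within window_size of it (the context)."""
    pairs = []
    for i in range(window_size, len(words) - window_size):
        context = words[i - window_size:i] + words[i + 1:i + 1 + window_size]
        pairs.append((context, words[i]))
    return pairs


words = "you say goodbye and i say hello .".split()
pairs = context_target_pairs(words)
print(pairs[0])  # (['you', 'goodbye'], 'say')
```

The first pair is the situation on the slide: given the context (you, goodbye), the model should infer the word that fills `?`.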

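  9. Inference result
    • Outputs the probability of each word appearing at that position
    • Feeding a context to the model yields a probability distribution over words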

  10. 3.1.3 How neural networks process words
    • The input to a neural network (NN) is a fixed-length vector
    • Feeding in a word as-is is difficult
    • Convert each word to a one-hot representation (one-hot vector)

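  11. One-hot representation
    • A vector as long as the vocabulary size, with 1 at the index matching the word ID and 0 everywhere else
    • Every word is represented as a vector of the same length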
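
A minimal sketch of the conversion, assuming a 7-word vocabulary with the word IDs shown below (the `word_to_id` mapping and the `to_one_hot` helper are illustrative, not taken from the slides):

```python
import numpy as np


def to_one_hot(word_id, vocab_size):
    """Return a vector of length vocab_size that is 1 at word_id and 0 elsewhere."""
    vec = np.zeros(vocab_size, dtype=np.int32)
    vec[word_id] = 1
    return vec


# Assumed vocabulary and IDs, for illustration only.
word_to_id = {"you": 0, "say": 1, "goodbye": 2, "and": 3, "i": 4, "hello": 5, ".": 6}

print(to_one_hot(word_to_id["you"], len(word_to_id)))      # [1 0 0 0 0 0 0]
print(to_one_hot(word_to_id["goodbye"], len(word_to_id)))  # [0 0 1 0 0 0 0]
```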

  12. One-hot representation
    • Transforming it with a fully connected layer is straightforward (the example uses a hidden layer of size 3)

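  13. Sample code (p.99)
    • np.dot(c, W) merely extracts the weights that correspond to the word
    • Here it just extracts the row W[0]
    • It looks wasteful, but this is apparently fixed in the next chapter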
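
The point can be checked directly. This is a small sketch in the spirit of the sample code, not a copy of it; the random weights and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, for reproducibility only
c = np.array([[1, 0, 0, 0, 0, 0, 0]])   # one-hot vector for word ID 0, shape (1, 7)
W = rng.standard_normal((7, 3))         # weights, shape (7, 3)

h = np.dot(c, W)                        # shape (1, 3)
print(np.allclose(h[0], W[0]))          # True
```

The full matrix product does 7 × 3 multiplications to obtain something that a single row lookup would give, which is the waste the slide refers to.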

  14. Layer representation
    • The MatMul layer (p.30) can do the same thing
    • It is a layer that does nothing but np.dot
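
As a rough illustration, a forward-only version of such a layer could look like this. It is a simplified sketch that omits the backward pass and gradient bookkeeping, so it should not be read as the book's MatMul implementation.

```python
import numpy as np


class MatMul:
    """Forward-only sketch of a layer that multiplies its input by a weight matrix."""

    def __init__(self, W):
        self.params = [W]

    def forward(self, x):
        W, = self.params
        return np.dot(x, W)


W = np.arange(21, dtype=float).reshape(7, 3)  # easy-to-read weights, for illustration
layer = MatMul(W)
c = np.array([[0, 0, 1, 0, 0, 0, 0]])         # one-hot vector for word ID 2
h = layer.forward(c)
print(h)  # [[6. 7. 8.]]
```

The output is row 2 of `W`, the same row-extraction behaviour as calling `np.dot(c, W)` directly.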

  15. 3.2 A simple word2vec

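  16. Implementing word2vec
    • The models used in word2vec are the CBOW model and the skip-gram model
    • "word2vec" is sometimes used to refer to these models themselves
    • That usage drifts from the original meaning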

  17. 3.2.1 Inference with the CBOW model
    • An NN that predicts the target from the context
    • The context is the surrounding words
    • The target is the word in question

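  18. Distributed representations of words
    • Training the CBOW model yields distributed representations of words
    • The model's parameters correspond to the distributed representations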

  19. Overview of the CBOW model
    • The case with 2 context words and 3 hidden-layer units

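  20. Overview of the CBOW model
    • The input is several words in one-hot representation
    • The output is a score for each word
    • Applying softmax turns the scores into probabilities
    • The hidden layer is the average of the values coming from the input layers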

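  21. What the distributed representation actually is
    • W_in is a 7×3 weight matrix
    • This is the distributed representation of the words
    • Training gradually turns it into a good distributed representation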

  22. Layer representation of the CBOW model

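  23. Layer representation of the CBOW model
    • Two MatMul layers
    • These extract the weights corresponding to each word (p.99)
    • Take the average of the two (= add them and multiply by 0.5)
    • A fully connected layer to the scores
    • There is no activation function, so it is fairly simple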
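
The forward pass described above can be sketched with plain `np.dot` calls standing in for the MatMul layers. The word IDs (0 for "you", 2 for "goodbye"), the random weights, and the seed are assumptions for illustration; see ch03/cbow_predict.py in the book's repository for the actual code.

```python
import numpy as np

rng = np.random.default_rng(0)                          # arbitrary seed
vocab_size, hidden_size = 7, 3
W_in = rng.standard_normal((vocab_size, hidden_size))   # shared by both context words
W_out = rng.standard_normal((hidden_size, vocab_size))

c0 = np.array([[1, 0, 0, 0, 0, 0, 0]])  # assumed one-hot for "you"
c1 = np.array([[0, 0, 1, 0, 0, 0, 0]])  # assumed one-hot for "goodbye"

h0 = np.dot(c0, W_in)        # first input-side MatMul
h1 = np.dot(c1, W_in)        # second input-side MatMul, same weights
h = 0.5 * (h0 + h1)          # average of the two hidden vectors
scores = np.dot(h, W_out)    # one score per vocabulary word

print(scores.shape)  # (1, 7)
```

Both context words go through the same `W_in`, and no activation function is applied between the layers.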

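  24. 3.2.2 Training the CBOW model
    • Follows standard neural network practice
    • CBOW is an NN that performs multi-class classification
    • The classes are words represented as one-hot vectors
    • Compute probabilities from the scores and learn from the difference from the correct answer
    • Apply the softmax function to get probabilities
    • Compute the cross-entropy error against the teacher labels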
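
A minimal sketch of the two steps, scores to probabilities and probabilities to loss. The score values and the target word ID are made up for illustration, and the small `eps` added inside the logarithm is a common numerical safeguard assumed here, not something stated on the slide.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy_error(probs, one_hot_labels, eps=1e-7):
    """Mean cross-entropy between predicted probabilities and one-hot labels."""
    batch_size = probs.shape[0]
    return -np.sum(one_hot_labels * np.log(probs + eps)) / batch_size


scores = np.array([[1.0, 2.0, 0.5, 0.0, 0.0, 0.0, 0.0]])  # made-up scores for 7 words
target = np.array([[0, 1, 0, 0, 0, 0, 0]])                 # correct word has ID 1

probs = softmax(scores)
loss = cross_entropy_error(probs, target)
print(probs.sum(), loss)
```

The loss shrinks as the probability assigned to the correct word approaches 1, which is the signal used to update the weights.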

  25. Layer representation
    • Append a Softmax with Loss layer

  26. Code reading
    • ch03/cbow_predict.py
    • https://github.com/oreilly-japan/deep-learning-from-scratch-2/blob/master/ch03/cbow_predict.py

  27. 3.2.3 word2vec weights and distributed representations
    • The difference between W_in and W_out
    • Both encode the meaning of words
    • Their shapes differ
    • W_in is 7x3
    • W_out is 3x7
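
The shape difference determines where each word's vector sits: in a row of W_in, or in a column of W_out. A small sketch, with zero-filled placeholder weights and an arbitrarily chosen word ID:

```python
import numpy as np

vocab_size, hidden_size = 7, 3
W_in = np.zeros((vocab_size, hidden_size))    # placeholder weights, shape (7, 3)
W_out = np.zeros((hidden_size, vocab_size))   # placeholder weights, shape (3, 7)

word_id = 2                        # arbitrary word ID for illustration
vec_from_in = W_in[word_id]        # row of W_in
vec_from_out = W_out[:, word_id]   # column of W_out

print(W_in.shape, W_out.shape)                # (7, 3) (3, 7)
print(vec_from_in.shape, vec_from_out.shape)  # (3,) (3,)
```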

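  28. Use W_in as the distributed representation
    • W_out is not used at all
    • An experiment on the usefulness of W_in with skip-gram
    • https://arxiv.org/abs/1611.01462
    • There are also reports that using W_out as well gives good results
    • https://nlp.stanford.edu/projects/glove/
    • A method similar to word2vec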
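
The two options can be written down in a couple of lines. This is only a sketch: the weights are random stand-ins for trained ones, and the sum `W_in + W_out.T` illustrates the general idea of combining both matrices rather than reproducing GloVe's training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)               # arbitrary seed
W_in = rng.standard_normal((7, 3))           # stand-in for trained input-side weights
W_out = rng.standard_normal((3, 7))          # stand-in for trained output-side weights

word_vecs = W_in                             # option 1: input-side weights only
word_vecs_combined = W_in + W_out.T          # option 2: combine both (transpose aligns the shapes)

print(word_vecs.shape, word_vecs_combined.shape)  # (7, 3) (7, 3)
```

Either way the result has one 3-dimensional vector per vocabulary word, so downstream code does not need to change.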
