Improving "People You May Know" on Directed Social Graph

Slides from my talk at machine learning graph pitch #1 (https://machine-learning-pitch.connpass.com/event/130083/).
I talked about Graph Embeddings on a directed graph and how we use them to recommend connections.

Agata Naomichi

May 13, 2019

Transcript

  1. ©2019 Wantedly, Inc.
    Improving “People You May Know” on Directed Social Graph
    Machine Learning Graph Pitch #1
    May 13, 2019 - Naomichi Agata
    Predicting bidirectional connections with Graph Embeddings

  2. Self-introduction
    Naomichi Agata
    - 2018/04~ Wantedly, Inc.
    - Machine learning engineer on Wantedly People
    - Organizing member of Data Driven Developer Meetup
    - https://d3m.connpass.com
    @agatan_

  3. Agenda
    • What are we trying to solve?
    • Graph Embeddings
    • Making use of connection types
    • Summary

  4. What are we trying to solve?
    The problem and its premises

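  5. The problem to solve
    We want to increase the number of "connections"!
    Wantedly People is an app for managing "connections".
    (Scan a business card or send a connection request) → you are connected!
    You can see information related to the people you are connected with.
    For example, once you connect with a Wantedly employee, you can see news related to "Wantedly, Inc."
    The more users connect with each other, the better the experience we should be able to provide.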

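  6. Strictly speaking... we want to increase the number of bidirectional "connections"!
    The connection model of Wantedly People is a directed graph, so one-way connections also exist.
    “Connection request”
    For example, when A scans B's business card, a request is sent from A to B.
    What we want to increase is bidirectional connections.
    We are not trying to increase one-sided connection requests.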

  7. How do we solve it?
    - We want to show "Maybe you know this person?"
    - “People You May Know”
    - Link Prediction: we want to predict the probability that users A and B will connect

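  8. Early days
    We started with a rule-based approach.
    Rules such as "same company", "N or more mutual connections", and "people who scanned your business card".
    Rule complexity, accuracy, and the number of recommendations became problems, so we moved into an improvement phase.
    Some users had several connections but matched none of the rules, so we could not produce any recommendations for them.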

  9. Graph Embeddings

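  10. Graph Embeddings?
    Here it means obtaining a vector representation for each Node in the Graph.
    We train so that connected Nodes end up close together and unconnected Nodes end up far apart.
    "Similarity" is defined by cosine similarity or the inner product.
    [Figure: nodes A, B, and C]
    $\mathrm{similarity}(A, B) = \vec{a} \cdot \vec{b}$
    $\mathrm{similarity}(A, C) = \vec{a} \cdot \vec{c}$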
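
As a minimal sketch of the two similarity definitions named on this slide, the snippet below computes the inner product and the cosine similarity in plain Python. The vectors for A, B, and C are made-up values chosen only for illustration; they are not from the talk.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def cosine(u, v):
    """Cosine similarity: the inner product of the length-normalized vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


# Hypothetical 2-dimensional embeddings (illustrative values only).
a = [1.0, 0.5]
b = [0.9, 0.6]   # placed near A, as a connected node would be after training
c = [-0.8, 0.3]  # placed far from A, as an unconnected node would be

sim_ab = dot(a, b)
sim_ac = dot(a, c)
print(sim_ab > sim_ac)  # A is more similar to B than to C
```

Both functions are symmetric in their arguments, which becomes relevant once the direction of an edge matters.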

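  11. Why use Graph Embeddings?
    They can represent complex relationships (or so we hope).
    It would be great to handle the information in a complex data structure like a graph as fixed-dimensional vectors!
    A graph with hundreds of millions of edges (counting one-way edges) can be reduced to a simple representation.
    Recommendation candidates can be retrieved quickly.
    Even an exhaustive search only requires taking inner products.
    (Application to other tasks)
    The resulting embeddings can be used as input to downstream tasks.
    Link Prediction is also one of the downstream tasks.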

  12. Overall recommendation pipeline
    Embeddings → Nearest-neighbor search → Regression model → Recommendations

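  13. Overall recommendation pipeline
    Embeddings → Nearest-neighbor search → Regression model → Recommendations
    Enumerating and scoring every pair is too expensive computationally.
    We use the embeddings to narrow down the recommendation candidates.
    The regression model plays the role of a re-ranker.
    Approximate Nearest Neighbors search can be used, so this scales.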
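
The candidate-narrowing step can be sketched as a top-k search by inner product. The version below is an exhaustive search over a small hypothetical dictionary of embeddings, not the approximate search mentioned on the slide; in a real system an Approximate Nearest Neighbors index would stand in for the loop. The function name `top_k_candidates` and the user names are assumptions made for this example.

```python
import heapq


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def top_k_candidates(user, embeddings, k, exclude=()):
    """Return the k users whose embeddings have the largest inner product with `user`.

    `exclude` can hold users that are already connected and should not be suggested.
    """
    query = embeddings[user]
    scored = (
        (dot(query, vec), other)
        for other, vec in embeddings.items()
        if other != user and other not in exclude
    )
    return [other for _, other in heapq.nlargest(k, scored)]


# Hypothetical embeddings (illustrative values only).
embeddings = {
    "alice": [1.0, 0.0],
    "bob": [0.9, 0.1],
    "carol": [0.0, 1.0],
    "dave": [0.7, 0.7],
}
candidates = top_k_candidates("alice", embeddings, k=2)
print(candidates)
```

The returned candidates are what the regression model would then re-rank.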

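  14. Overall recommendation pipeline
    Embeddings → Nearest-neighbor search → Regression model → Recommendations
    A model that predicts the "probability of connecting" from a pair of users.
    It uses each user's vector representation as features.
    Other information (degree, number of mutual connections, and so on) can also be used as features.
    Even without this model, we could sort by the inner product of the embeddings and recommend the top results.
    Accuracy was better with the model.
    Its output can be interpreted as a probability, which makes it easier to work with.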
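
To make the feature construction concrete, here is a rough sketch that builds a feature vector for a user pair from the two embeddings, the degrees, and the number of mutual connections, and then turns it into a probability. The logistic function with hand-picked weights is only a stand-in for a trained model; the function names, weights, and data are all hypothetical.

```python
import math


def pair_features(src, dst, embeddings, neighbors):
    """Feature vector for the ordered pair (src, dst).

    Concatenates both embeddings, then appends each user's degree and
    the number of mutual connections.
    """
    common = len(neighbors[src] & neighbors[dst])
    return (
        list(embeddings[src])
        + list(embeddings[dst])
        + [float(len(neighbors[src])), float(len(neighbors[dst])), float(common)]
    )


def predict_probability(features, weights, bias):
    """Logistic model used here as a placeholder for the trained re-ranker."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical data (illustrative values only).
embeddings = {"alice": [1.0, 0.0], "bob": [0.9, 0.1]}
neighbors = {"alice": {"carol", "dave"}, "bob": {"dave"}}

features = pair_features("alice", "bob", embeddings, neighbors)
weights = [0.1] * len(features)  # made-up weights, not learned
prob = predict_probability(features, weights, bias=-0.5)
print(features, prob)
```

Because the output lies strictly between 0 and 1, it can be read as a connection probability, which is the property the slide highlights.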

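  15. Overall recommendation pipeline
    Embeddings → Nearest-neighbor search → Regression model → Recommendations
    This part is crucial!
    If the users someone is likely to connect with are not located nearby, they drop out of the candidates.
    If the graph information is not embedded well, the regression model cannot be accurate.
    Users for whom no Embedding can be defined get no recommendations at all.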

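  16. We tried it
    First we trained using only the bidirectional connections.
    A simple DeepWalk:
    node sequences obtained by random walks on the Graph are used for training in much the same way as Word2Vec.
    Because more users could be given recommendations, we created about twice as many connections as before!
    DeepWalk: Online Learning of Social Representations [B. Perozzi et al., KDD]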
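
The random-walk half of DeepWalk can be sketched in a few lines. The snippet below generates uniform random walks over a small hypothetical adjacency list; the Word2Vec-style training on the resulting node sequences is not shown. The function name, parameters, and graph are assumptions for illustration and do not reflect the settings used in the talk.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks starting from every node.

    Each walk is a list of node ids, playing the role of a "sentence"
    for a Word2Vec-style trainer.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a random order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                nxt = adjacency[walk[-1]]
                if not nxt:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(nxt))
            walks.append(walk)
    return walks


# Hypothetical undirected graph of bidirectional connections.
adjacency = {
    "alice": ["bob", "carol"],
    "bob": ["alice"],
    "carol": ["alice", "dave"],
    "dave": ["carol"],
}
walks = random_walks(adjacency, walk_length=5, walks_per_node=2)
print(len(walks), walks[0])
```

Nodes that co-occur within a window of the same walk are treated like words in the same context, so connected nodes tend to end up close in the embedding space.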

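  17. Hypothesis
    On the other hand, because we discard edges that are not bidirectional, we are not making use of:
    - the fact that a connection request was not approved
    - the information that users are weakly connected via "one-way connections"
    If we could use these, could we do even better?
    [Figure label: request]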

  18. Making use of connection types

  19. How to represent connection types
    When the inner product or cosine similarity is used as the score:
    $\mathrm{similarity}(A, B) = \vec{a} \cdot \vec{b} = \vec{b} \cdot \vec{a} = \mathrm{similarity}(B, A)$
    So we cannot express "there is an edge A→B but no edge B→A".

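  20. How to represent connection types
    So, when there is an edge of type r from A to B, we represent it as follows:
    $\mathrm{score}(A, r, B) = \vec{a} \cdot f_r(\vec{b})$
    $\mathrm{score}(B, r, A) = \vec{b} \cdot f_r(\vec{a})$
    $f_r$ is a function, defined per edge type, that transforms a vector.
    [Figure: an edge of type r from A to B]
    We train so that $\vec{a} \cdot f_r(\vec{b})$ becomes large and, in addition, $\vec{b} \cdot f_r(\vec{a})$ becomes small.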

  21. How to represent connection types
    This time we applied the following transformation, based on Complex Embeddings.
    Complex Embeddings for Simple Link Prediction [T. Trouillon et al., ICML]
    # first half of the embedding is the real part, second half the imaginary part
    real, imag = embedding[:dim // 2], embedding[dim // 2:]
    concat(real * W_real - imag * W_imag, real * W_imag + imag * W_real)
    W_real and W_imag are learned parameters.
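
The transformation on this slide is an elementwise complex multiplication of the embedding by (W_real + i * W_imag). The plain-Python sketch below spells that out and shows that the resulting score depends on direction. The names `transform` and `score` and the parameter values are hypothetical, and the values are hand-picked rather than learned.

```python
def transform(embedding, w_real, w_imag):
    """Elementwise complex multiplication of the embedding by (w_real + i * w_imag).

    The first half of `embedding` is read as the real part, the second half
    as the imaginary part; the result uses the same layout.
    """
    half = len(embedding) // 2
    real, imag = embedding[:half], embedding[half:]
    out_real = [r * wr - i * wi for r, i, wr, wi in zip(real, imag, w_real, w_imag)]
    out_imag = [r * wi + i * wr for r, i, wr, wi in zip(real, imag, w_real, w_imag)]
    return out_real + out_imag


def score(src, dst, w_real, w_imag):
    """score(src, r, dst) = src . f_r(dst), where f_r is `transform`."""
    return sum(x * y for x, y in zip(src, transform(dst, w_real, w_imag)))


# Hypothetical 2-dimensional embeddings: one real and one imaginary component.
a = [1.0, 0.0]
b = [0.0, 1.0]
w_real, w_imag = [0.0], [-1.0]  # hand-picked, corresponds to multiplying by -i

score_ab = score(a, b, w_real, w_imag)
score_ba = score(b, a, w_real, w_imag)
print(score_ab, score_ba)  # the two directions receive different scores
```

With `w_real = [1.0]` and `w_imag = [0.0]` the transform is the identity and the score falls back to the symmetric inner product.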

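  22. Experiment
    Represent both "bidirectional connections" and "one-way connections" on a single graph and learn embeddings.
    Train a LightGBM model that uses the learned vector representations as features, and evaluate its accuracy.
    Compare against the performance when only "bidirectional connections" are used.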

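  23. Representation in this experiment: bidirectional
    [Figure: A and B, labeled $\vec{a} \cdot \vec{b}$ and $\vec{b} \cdot \vec{a}$]
    - No transformation on either src or dst
    So that the result is the same whichever direction you look from:
    $\mathrm{score}(A, \mathrm{bidirected}, B) = \vec{a} \cdot \vec{b}$
    $\mathrm{score}(B, \mathrm{bidirected}, A) = \vec{b} \cdot \vec{a}$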

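  24. Representation in this experiment: one-way
    [Figure: A and B, labeled $\vec{a} \cdot f(\vec{b})$ and $\vec{b} \cdot f(\vec{a})$]
    - Transform the dst side
    Make it possible to express that A → B is positive and B → A is negative:
    make $\vec{a} \cdot f(\vec{b})$ large (A → B is a one-way connection)
    make $\vec{b} \cdot f(\vec{a})$ small (B → A does not exist)
    make $\vec{a} \cdot \vec{b}$ small (A and B are not bidirectionally connected)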
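
The three training targets for a one-way edge can be written as one loss. The slides do not say which loss function was used, so the sketch below assumes a logistic (binary cross-entropy) loss on each score; the function names, the example transform `f`, and the vectors are all hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def log_loss(score, label):
    """Binary cross-entropy on sigmoid(score).

    label 1 rewards a large score, label 0 rewards a small score.
    """
    p = 1.0 / (1.0 + math.exp(-score))
    return -math.log(p) if label == 1 else -math.log(1.0 - p)


def one_way_edge_loss(a, b, f):
    """Loss for a one-way edge A -> B under the assumed logistic loss."""
    return (
        log_loss(dot(a, f(b)), 1)    # A -> B exists as a one-way edge
        + log_loss(dot(b, f(a)), 0)  # B -> A does not exist
        + log_loss(dot(a, b), 0)     # A and B are not bidirectionally connected
    )


def f(v):
    """Hand-picked example transform on a 2-d vector (multiplication by -i)."""
    return [v[1], -v[0]]


good = one_way_edge_loss([1.0, 0.0], [0.0, 1.0], f)  # all three targets satisfied
bad = one_way_edge_loss([1.0, 0.0], [1.0, 0.0], f)   # looks like a symmetric pair
print(good < bad)
```

Minimizing this loss over the embeddings and the parameters of `f` pushes the three scores in the directions listed on the slide.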

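  25. Results
    The number of users for whom we "can recommend, within the Top0, a user they go on to connect with" increased by 10.3%.
    For users who were already receiving recommendations, recommendation quality was roughly the same
    (AUC: 0.819 → 0.823).
    We did not obtain the result that "using one-way edges also improves quality".
    It seems fair to say that accuracy did not drop even though we widened the coverage.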

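  26. Other experiments / What's next
    Experiments we tried that did not give good results...
    - Treating information such as "via business card" and "via request" as edge labels
    - Treating rejected requests as edges as well
    We have not yet been able to run an evaluation restricted to users who newly became eligible for recommendations.
    Because one-way edges are now used too, the graph contains many low-degree users.
    Application to other tasks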

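  27. Summary
    By treating bidirectional and one-way connections separately, we were able to embed richer information (?)
    Transform the vector representation on the dst side.
    The number of users who received an effective recommendation increased by 10.3%!
    Future work
    Edges carry much more information, and we want to make use of it.
    Applying the embeddings to tasks other than Link Prediction.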