Slide 1

Slide 1 text

©2019 Wantedly, Inc.
Improving “People You May Know” on Directed Social Graph
Predicting bidirectional connections with Graph Embedding
Machine Learning Graph Pitch #1, May 13, 2019 - Naomichi Agata

Slide 2

Slide 2 text

Self-introduction
Naomichi Agata (@agatan_)
- 2018/04~ Wantedly, Inc.
- Machine learning engineer for Wantedly People
- Organizing member of Data Driven Developer Meetup
- https://d3m.connpass.com

Slide 3

Slide 3 text

Agenda
• What are we trying to solve?
• Graph Embeddings
• Leveraging connection types
• Summary

Slide 4

Slide 4 text

What are we trying to solve?
The problem we want to solve and its premises

Slide 5

Slide 5 text

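The problem: we want to increase the number of “connections”!
- Wantedly People is an app for managing “connections”
  - (Photograph a business card or send a connection request) → connected!
- You can see information related to the people you are connected with
  - For example, connecting with a Wantedly employee lets you see news related to “Wantedly, Inc.”
- The more users connect with each other, the better the experience we should be able to provide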

Slide 6

Slide 6 text

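Strictly speaking... we want to increase the number of bidirectional “connections”!
- The Wantedly People connection model is a directed graph, so one-way connections also exist
  - “Connection requests”
  - For example, when A photographs B's business card, a request is sent from A → B
- What we want to increase is “bidirectional connections”
  - We are not trying to increase one-sided connection requests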

Slide 7

Slide 7 text

How do we solve it?
- We want to show “Could this be someone you know?”
- “People You May Know”
- Link Prediction: we want to predict the likelihood that users A and B will connect

Slide 8

Slide 8 text

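Early days
- We started with a rule-based approach
  - Rules such as “same company”, “at least N connections in common”, and “people who scanned your business card”
- Rule complexity, accuracy, and the number of recommendations became problems, so we moved into an improvement phase
  - Some users had several connections but matched none of the rules, so we could not produce recommendations for them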

Slide 9

Slide 9 text

Graph Embeddings

Slide 10

Slide 10 text

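Graph Embeddings?
- Here, this means obtaining a “vector representation” for each Node in the Graph
- We train so that connected Nodes end up close together and unconnected Nodes end up far apart
- “Similarity” is defined by cosine similarity or the inner product

For nodes A, B, C:
$\mathrm{similarity}(A, B) = \vec{a} \cdot \vec{b}$
$\mathrm{similarity}(A, C) = \vec{a} \cdot \vec{c}$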
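As a minimal sketch of the two similarity definitions on this slide, the following uses made-up 2-dimensional vectors. The embeddings `a`, `b`, `c` and the helper names are assumptions for illustration only, not values or code from the talk.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def cosine(u, v):
    """Cosine similarity: the inner product of the length-normalized vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


# Hypothetical embeddings: A and B are connected, C is not connected to A.
a = [1.0, 0.5]
b = [0.9, 0.6]
c = [-0.8, 0.3]

sim_ab = dot(a, b)  # similarity(A, B)
sim_ac = dot(a, c)  # similarity(A, C)
```

With embeddings trained as described, one would expect `sim_ab` to come out larger than `sim_ac`.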

Slide 11

Slide 11 text

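Why use Graph Embeddings?
- They can represent complex relationships (or so we expect)
  - It would be great to handle the information in a complex data structure like a graph as fixed-dimension vectors!
  - A graph with several hundred million edges (if one-way edges are included) can be reduced to a simple representation
- Searching for recommendation candidates can be made fast
  - Even an exhaustive search only needs inner products
- (Application to other tasks)
  - The resulting embeddings can be used as input to downstream tasks
  - Link Prediction is itself one of those downstream tasks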

Slide 12

Slide 12 text

Recommendation overview
Embeddings → nearest-neighbor search → regression model → recommendation

Slide 13

Slide 13 text

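Recommendation overview: nearest-neighbor search
- Enumerating and scoring every pair is computationally too expensive
- Use the embedding information to narrow down the recommendation candidates
  - The regression model plays the role of a re-ranker
- Approximate Nearest Neighbors search can be used, so this scales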
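The candidate-narrowing step can be sketched as a top-k search by inner product. This is an exhaustive version written for clarity; the talk says an Approximate Nearest Neighbors index is used in practice, which is not shown here. The function name, the `connected` argument, and the toy embeddings are all hypothetical.

```python
import heapq


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def top_k_candidates(user, embeddings, connected, k):
    """Return the k users with the largest inner product to `user`,
    skipping the user itself and users it is already connected to."""
    query = embeddings[user]
    scored = (
        (dot(query, vec), other)
        for other, vec in embeddings.items()
        if other != user and other not in connected
    )
    return [other for _, other in heapq.nlargest(k, scored)]


# Hypothetical embeddings for four users.
embeddings = {
    "A": [1.0, 0.0],
    "B": [0.9, 0.1],
    "C": [0.0, 1.0],
    "D": [0.8, 0.3],
}
candidates = top_k_candidates("A", embeddings, connected={"B"}, k=2)
```

The returned candidates would then be passed to the re-ranking model rather than shown to the user directly.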

Slide 14

Slide 14 text

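Recommendation overview: regression model
- A model that predicts the “likelihood of connecting” for a pair of users
  - It uses each user's vector representation as features
  - Other information (degree, number of common connections, and so on) can also be used as features
- Even without this model, we could sort candidates by the inner product of their embeddings and recommend the top ones
  - Accuracy was better with the model
  - Its output can be interpreted as a probability, which makes it easier to work with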
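One way to picture the input to such a model is a feature vector built per user pair. The sketch below assumes a simple concatenation of both embeddings plus the degree and common-connection counts mentioned on the slide; the actual feature layout used in the talk is not stated, and `pair_features` and the toy data are hypothetical.

```python
def pair_features(src, dst, embeddings, neighbors):
    """Feature vector for the ordered pair (src, dst):
    both embeddings, both degrees, and the number of common connections."""
    common = len(neighbors[src] & neighbors[dst])
    return (
        list(embeddings[src])
        + list(embeddings[dst])
        + [len(neighbors[src]), len(neighbors[dst]), common]
    )


# Hypothetical embeddings and neighbor sets.
embeddings = {"A": [0.1, 0.2], "B": [0.3, 0.4]}
neighbors = {"A": {"C", "D"}, "B": {"C"}}

features = pair_features("A", "B", embeddings, neighbors)
```

A classifier trained on such vectors with connected/not-connected labels outputs a score that can be read as a probability, which is the property the slide highlights.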

Slide 15

Slide 15 text

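Recommendation overview: Embeddings
- This is the crucial part!
- If “users who are likely to connect” are not actually located nearby, they drop out of the candidate set
- If the graph information is not embedded well, the regression model cannot reach good accuracy
- Users for whom no Embedding can be defined cannot receive recommendations at all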

Slide 16

Slide 16 text

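Trying it out
- First, we trained using only bidirectional connections
- A simple DeepWalk
  - Run random walks on the Graph and train on the resulting Node sequences in the same way as Word2Vec
- Because more users could receive recommendations, this created about twice as many connections as before!

DeepWalk: Online Learning of Social Representations [B. Perozzi et al., KDD 2014]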
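The random-walk half of DeepWalk can be sketched in a few lines. This is a minimal illustration with uniform neighbor sampling on a toy adjacency list; the parameter names and the graph are assumptions, and the Word2Vec-style training on the walks is deliberately left out.

```python
import random


def random_walks(adjacency, walks_per_node, walk_length, seed=0):
    """Generate uniform random walks over an adjacency-list graph.
    Each walk is a list of nodes that plays the role of a 'sentence'."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                nbrs = adjacency[walk[-1]]
                if not nbrs:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks


# Hypothetical graph of bidirectional connections only.
adjacency = {"A": ["B", "C"], "B": ["A"], "C": ["A", "D"], "D": ["C"]}
walks = random_walks(adjacency, walks_per_node=2, walk_length=5)
```

Each walk would then be fed to a skip-gram trainer as if it were a sentence of words, so that nodes that co-occur on walks receive nearby vectors.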

Slide 17

Slide 17 text

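Hypothesis
However, because we discard edges that are not bidirectional, we are not making use of information such as:
- the fact that a connection request was not approved
- the fact that users are weakly connected via “one-way connections”
If we could use these, couldn't we do even better?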

Slide 18

Slide 18 text

Leveraging connection types

Slide 19

Slide 19 text

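How do we represent connection types?
When the inner product or cosine similarity is used as the score,
$\mathrm{similarity}(A, B) = \vec{a} \cdot \vec{b} = \vec{b} \cdot \vec{a} = \mathrm{similarity}(B, A)$
so we cannot express “there is an edge A → B but no edge B → A”.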

Slide 20

Slide 20 text

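How do we represent connection types?
So, when there is an edge of type r from A to B, we represent it as follows:
$\mathrm{score}(A, r, B) = \vec{a} \cdot f_r(\vec{b})$
$\mathrm{score}(B, r, A) = \vec{b} \cdot f_r(\vec{a})$
$f_r$ is a function, defined per edge type, that transforms a vector.
We train so that $\vec{a} \cdot f_r(\vec{b})$ becomes large and, in addition, $\vec{b} \cdot f_r(\vec{a})$ becomes small.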
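To see why a transform on the destination side lets the score depend on direction, here is a small sketch that uses a fixed 90-degree rotation matrix as a stand-in for $f_r$. The matrix and the vectors are chosen purely for illustration; in the talk $f_r$ is learned, and its actual form is described on the next slide.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def apply_linear(matrix, vec):
    """Matrix-vector product, used here as a stand-in for f_r."""
    return [dot(row, vec) for row in matrix]


def score(src, matrix, dst):
    """score(src, r, dst) = src . f_r(dst), with f_r given as a matrix."""
    return dot(src, apply_linear(matrix, dst))


# Hypothetical f_r: rotate the destination vector by 90 degrees.
ROTATE_90 = [[0.0, -1.0], [1.0, 0.0]]
a = [0.0, 1.0]
b = [1.0, 0.0]

forward = score(a, ROTATE_90, b)   # score(A, r, B)
backward = score(b, ROTATE_90, a)  # score(B, r, A)
```

With these values the plain inner product of `a` and `b` is the same in both orders, while `forward` and `backward` differ, which is exactly the asymmetry a one-way edge needs.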

Slide 21

Slide 21 text

How do we represent connection types?
This time, taking Complex Embeddings as a reference, we applied the following transformation:

    real, imag = embedding[:dim // 2], embedding[dim // 2:]
    concat(real * W_real - imag * W_imag, real * W_imag + imag * W_real)

W_real and W_imag are learned parameters.

Complex Embeddings for Simple Link Prediction [T. Trouillon et al., ICML 2016]
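The transformation on this slide is an element-wise complex multiplication, with the first half of the embedding treated as the real part and the second half as the imaginary part. The sketch below spells that out in plain Python; the function name and the example values are assumptions, and in the talk `W_real` and `W_imag` are learned rather than fixed.

```python
def complex_transform(embedding, w_real, w_imag):
    """Multiply (real + i*imag) by (w_real + i*w_imag) element-wise and
    return the result as one vector: real parts first, then imaginary parts."""
    half = len(embedding) // 2
    real, imag = embedding[:half], embedding[half:]
    out_real = [r * wr - i * wi for r, i, wr, wi in zip(real, imag, w_real, w_imag)]
    out_imag = [r * wi + i * wr for r, i, wr, wi in zip(real, imag, w_real, w_imag)]
    return out_real + out_imag


# Hypothetical 4-dimensional embedding and parameters.
embedding = [1.0, 2.0, 3.0, 4.0]
w_real = [0.0, 1.0]
w_imag = [1.0, 0.0]

transformed = complex_transform(embedding, w_real, w_imag)
```

Expanding $\vec{a} \cdot f(\vec{b})$ with this transform shows that the `w_imag` terms change sign when A and B are swapped, so a non-zero `w_imag` is what makes the score direction-dependent.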

Slide 22

Slide 22 text

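Experiment
- Represent both “bidirectional connections” and “one-way connections” on a single graph and train embeddings
- Train a LightGBM model that uses the learned vector representations as features, and evaluate its accuracy
- Compare against the performance obtained when only “bidirectional connections” are used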

Slide 23

Slide 23 text

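Representation in this experiment: bidirectional
- No transformation on either src or dst
  - so that the result is the same whichever direction it is viewed from

$\mathrm{score}(A, \mathrm{bidirected}, B) = \vec{a} \cdot \vec{b}$
$\mathrm{score}(B, \mathrm{bidirected}, A) = \vec{b} \cdot \vec{a}$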

Slide 24

Slide 24 text

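Representation in this experiment: one-way
- Transform the dst side
  - so that we can express that A → B is positive and B → A is negative
- Make $\vec{a} \cdot f(\vec{b})$ large (A → B is a one-way connection)
- Make $\vec{b} \cdot f(\vec{a})$ small (B → A does not exist)
- Make $\vec{a} \cdot \vec{b}$ small (A and B are not bidirectionally connected)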
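The three targets above can be combined into a single training signal. The talk does not say which loss was used, so the following is only one plausible sketch: a logistic loss that treats the forward score as a positive example and the other two scores as negatives. The function names are hypothetical.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def one_way_loss(forward, backward, bidirected):
    """Assumed logistic loss for a one-way edge A -> B.
    forward    = a . f(b), pushed up   (positive example)
    backward   = b . f(a), pushed down (negative example)
    bidirected = a . b,    pushed down (negative example)"""
    return softplus(-forward) + softplus(backward) + softplus(bidirected)


good = one_way_loss(forward=5.0, backward=-5.0, bidirected=-5.0)
bad = one_way_loss(forward=-5.0, backward=5.0, bidirected=5.0)
```

Scores that match the three targets give a small loss (`good`), and scores that contradict them give a large one (`bad`).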

Slide 25

Slide 25 text

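Results
- The number of users for whom we can recommend, within the Top0, a user they go on to connect with increased by 10.3%
- For users who could already receive recommendations, recommendation quality stayed roughly the same (AUC: 0.819 → 0.823)
  - We did not get a result showing that using one-way edges also improves quality
  - It seems fair to say that widening coverage did not reduce accuracy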
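For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen positive pair is scored above a randomly chosen negative pair. The sketch below computes it directly from that definition on made-up labels and scores; it is not the evaluation code from the talk.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = the pair connected) and model scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
value = auc(labels, scores)
```

This pairwise form is quadratic in the number of examples, so it is only meant to show the definition; library implementations use a sort-based computation instead.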

Slide 26

Slide 26 text

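Other experiments / What comes next
- Experiments we tried that did not give good results...
  - Treating information such as “via business card” and “via request” as edge labels
  - Treating a Reject of a request as an edge as well
- We have not yet run an evaluation restricted to the users who newly became able to receive recommendations
  - Because we now also use one-way edges, the graph contains many users with a small degree
- Application to other tasks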

Slide 27

Slide 27 text

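Summary
- By treating bidirectional and one-way connections separately, we were able to embed richer information (?)
  - Transform the dst-side vector representation
  - The number of users who received effective recommendations increased by 10.3%!
- Future work
  - Edges carry much more information, and we want to make use of it
  - Use for tasks other than Link Prediction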