Slide 1

Slide 1 text

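©2020 Wantedly, Inc.
WWW2020 Paper Introduction
The Web Conference 2020 Participation Report Meetup by Wantedly
30 Apr. 2020 - Yuya Matsumura (@yu-ya4)
Analysis of competitiveness between companies based on Talent Flow
Analysis of recommendation diversity on Spotify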

Slide 2

Slide 2 text

Self-Introduction
✓ Yuya Matsumura (@yu-ya4, @yu__ya4)
✓ Wantedly, Inc. Recommendation Team
✓ Data Science, Team Lead
✓ Interested in Information Retrieval, Machine Learning

Slide 3

Slide 3 text

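Papers covered in this talk
An overview of the two WWW2020 papers that caught @yu-ya4's attention.
1. (Zhang 2020) Large-Scale Talent Flow Embedding for Company Competitive Analysis
• Focuses on the flow of talent, represents companies as distributed representations (embeddings), and analyzes "competition" between companies
2. (Anderson 2020) Algorithmic Effects on the Diversity of Consumption on Spotify
• Analyzes the effect of Spotify's recommender system on "diversity", and the relationship between "diversity" and the service's key metrics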

Slide 5

Slide 5 text

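Motivation (Zhang 2020)
Proposes a new framework for research on competition between companies that uses analysis of the flow of talent (Talent Flow).
• Previous studies mostly relied on statistical data about companies or on analysis with heuristic methods
• Students who graduated in 2006-2010 held an average of 2.85 jobs in their first five years
• This is twice the figure for the 1986-1990 period
• Much of this data is publicly available on the Web (e.g. LinkedIn)
Figures: (Zhang 2020) Fig. 1; (Zhang 2020) presentation slides, p. 6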

Slide 6

Slide 6 text

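Company Competitiveness (Zhang 2020)
Competitiveness of company u toward company v
• Expressed using personalized PageRank (PPR) proximity
• PPR computes the relevance from a specific node in a graph to the other target nodes
• Considered high when talent from u moves to v easily, and talent from u accounts for a large share of the talent moving into v
• [symbol missing]: PPR on the original TFN
• [symbol missing]: PPR on the TFN with the edge directions reversed
Talent Flow Network (TFN)
• Node: a company
• Edge: a flow of talent
• Directed, weighted by the number of times talent moved
(Zhang 2020) presentation slides, p. 10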
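To make the two PPR quantities above concrete, here is a minimal sketch that runs personalized PageRank by power iteration on a toy TFN, once on the original graph and once with the edges reversed. The 3-company matrix, the restart probability of 0.15, and the function name are assumptions for illustration. How the paper combines the two values into a single competitiveness score is not shown on the slide, so the sketch stops at the two PPR vectors.

```python
import numpy as np

def personalized_pagerank(weights, source, alpha=0.15, iters=200):
    """PPR from `source` by power iteration (alpha = restart probability).

    weights[i, j] is the number of talent moves from company i to company j.
    """
    weights = np.asarray(weights, dtype=float)
    n = weights.shape[0]
    restart = np.zeros(n)
    restart[source] = 1.0
    out = weights.sum(axis=1, keepdims=True)
    # Row-normalise; a company with no outgoing talent jumps back to the source.
    transition = np.where(out > 0, weights / np.where(out > 0, out, 1.0), restart)
    rank = restart.copy()
    for _ in range(iters):
        rank = alpha * restart + (1 - alpha) * rank @ transition
    return rank

# Toy TFN (hypothetical counts): rows = origin company, columns = destination.
tfn = np.array([[0, 8, 2],
                [1, 0, 1],
                [0, 5, 0]])

forward = personalized_pagerank(tfn, source=0)     # PPR on the original TFN
backward = personalized_pagerank(tfn.T, source=0)  # PPR on the edge-reversed TFN
print(forward, backward)
```

Transposing the weight matrix is all it takes to reverse every edge, which is why the same function serves both cases.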

Slide 7

Slide 7 text

Goal: building the Talent Flow Embedding (Zhang 2020)
Learn two attraction vectors per company from the Talent Flow Network, based on company competitiveness.
• $S_u$: source vector of u
• Represents the attraction from company u's talent toward other companies
• $T_u$: target vector of u
• Represents the attraction from other companies' talent toward company u
(Zhang 2020) Fig. 3

Slide 8

Slide 8 text

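Approach (Zhang 2020)
Fit the distribution in the vector space to the original distribution.
• Distribution in the vector space
• [symbol missing]: distance from u to v
• [symbol missing]: distance from v to u
• Minimize the Kullback-Leibler (KL) divergence
• Original distribution
• Approximated with Monte Carlo + random walks
• There is also a Multi-Task version that uses a different graph for each position
[formula images not recovered]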
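The two ingredients named above, a Monte Carlo random-walk approximation of the original distribution and a KL divergence to minimize, can be sketched as follows. This is an illustration under stated assumptions, not the paper's training procedure: the toy transition matrix, the restart probability, and especially the softmax-over-negative-distances form of the embedding-side distribution are hypothetical, since the slide's formulas did not survive.

```python
import numpy as np

def monte_carlo_ppr(transition, source, alpha=0.15, walks=5000, seed=0):
    """Estimate PPR from `source` as the end-point distribution of random walks."""
    rng = np.random.default_rng(seed)
    n = transition.shape[0]
    counts = np.zeros(n)
    for _ in range(walks):
        node = source
        # At every step the walk stops with probability alpha,
        # otherwise it follows an outgoing edge.
        while rng.random() >= alpha:
            node = rng.choice(n, p=transition[node])
        counts[node] += 1
    return counts / walks

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same nodes."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / (q[mask] + eps))))

# Toy row-stochastic transition matrix (hypothetical).
transition = np.array([[0.0, 0.8, 0.2],
                       [0.5, 0.0, 0.5],
                       [0.0, 1.0, 0.0]])
original = monte_carlo_ppr(transition, source=0)

# Hypothetical embedding-side distribution: softmax over negative distances from u.
distances = np.array([0.5, 0.2, 1.0])
model = np.exp(-distances) / np.exp(-distances).sum()

loss = kl_divergence(original, model)  # the quantity training would push down
print(original, loss)
```

Sampling walks avoids materializing the full PPR matrix, which is presumably why a Monte Carlo approximation is attractive on a large graph.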

Slide 9

Slide 9 text

Experiment 1: Link Prediction (Zhang 2020)
(Zhang 2020) Fig. 4

Slide 10

Slide 10 text

Case Study 1 (Zhang 2020)
• Google and Facebook are competitors of each other.
• Splitting by position (job type) changes which companies are competitors.

Slide 11

Slide 11 text

Case Study 2 (Zhang 2020)
• Using the source vector (attraction from the company side), companies cluster by industry.
• Using the target vector (attraction from the talent side), companies cluster by country.

Slide 12

Slide 12 text

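Summary and impressions (Zhang 2020)
Summary
• Defined and analyzed "competitiveness" between companies by focusing on the flow of talent.
• Proposed TFE, which learns distributed representations of companies based on the defined "competitiveness".
• Ran experiments to evaluate TFE's performance and drew insights from case studies.
Impressions
• As someone who works on a service called Wantedly Visit, I found this very interesting and would like to try it on our own service.
• It looks useful both for recommending job postings to users and for recommending users that a company may want to scout.
• It is interesting that the distribution of companies differs between the company's preference and the user's preference.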

Slide 13

Slide 13 text

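Papers covered in this talk
An overview of the two WWW2020 papers that caught @yu-ya4's attention.
1. (Zhang 2020) Large-Scale Talent Flow Embedding for Company Competitive Analysis
• Focuses on the flow of talent, represents companies as distributed representations (embeddings), and analyzes "competition" between companies
2. (Anderson 2020) Algorithmic Effects on the Diversity of Consumption on Spotify
• Analyzes the effect of Spotify's recommender system on "diversity", and the relationship between "diversity" and the service's key metrics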

Slide 14

Slide 14 text

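Motivation (Anderson 2020)
The authors want to know what relationship exists between algorithmic recommendation and the content users consume.
• If recommendations are based on a user's past behavior and tastes, won't they keep suggesting similar items, so that diversity is lost?
(Anderson 2020) presentation slides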

Slide 15

Slide 15 text

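Proposal of the GS-score, a measure for expressing "diversity" quantitatively (Anderson 2020)
• Song embeddings are built with word2vec (CBOW), on the assumption that songs that often appear in the same playlist are similar.
• The GS-score is the play-count-weighted sum of inner products between the center of mass of the song vectors a user consumed and each of those song vectors.
• Small GS-score -> the consumed songs are far apart -> Generalist
• Large GS-score -> the consumed songs are close together -> Specialist
Center of mass: [formula image not recovered]
User consumption diversity measure: [formula image not recovered]
(Anderson 2020) presentation slides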
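The GS-score description above can be sketched in a few lines. The formulas themselves were images on the slide, so this follows the prose only: a play-count-weighted center of mass, then a weighted average of each song's similarity to it. Normalizing the inner products to cosine similarity and normalizing the weights to sum to 1 are assumptions made here so that the score stays in a fixed range; the 2-dimensional vectors are toy data.

```python
import numpy as np

def gs_score(song_vectors, play_counts):
    """Play-count-weighted average cosine similarity to the center of mass."""
    vectors = np.asarray(song_vectors, dtype=float)
    weights = np.asarray(play_counts, dtype=float)
    weights = weights / weights.sum()
    center = weights @ vectors  # weighted center of mass of the user's songs
    cosines = (vectors @ center) / (
        np.linalg.norm(vectors, axis=1) * np.linalg.norm(center)
    )
    return float(weights @ cosines)

# Toy users (hypothetical embeddings and play counts).
specialist = gs_score([[1.0, 0.0], [0.9, 0.1]], play_counts=[10, 5])
generalist = gs_score([[1.0, 0.0], [0.0, 1.0]], play_counts=[5, 5])
print(specialist, generalist)
```

The user whose songs point in nearly the same direction scores close to 1 (specialist), while the user with orthogonal songs scores lower (generalist), matching the reading of the score given above.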

Slide 16

Slide 16 text

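Distribution of diversity (Anderson 2020)
• Users are split by activity level and the distribution of diversity (GS-score) is analyzed.
• The least active group is skewed toward low diversity, but this is probably because those users consumed few songs in the first place.
• There seems to be no correlation between activity level and diversity.
(Anderson 2020) Fig. 2 (axis annotations: less diverse, more diverse)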

Slide 17

Slide 17 text

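Change in diversity over time (Anderson 2020)
• Visualizes the GS-score of the same users in July 2018 and July 2019.
• Almost no change overall.
• In particular, users with extreme GS-scores (close to 0 or 1) do not change.
• The diversity of the songs a user consumes does not seem to change over time.
(Anderson 2020) Fig. 7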

Slide 18

Slide 18 text

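Relationship between diversity and churn rate (Anderson 2020)
• Overall, users with lower diversity churn more easily.
• In the least active group the gap was as large as 25%.
• Even when a user's activity level changes, it may have no bearing on the churn rate of the high-diversity group.
(Anderson 2020) Fig. 5 (axis annotations: less diverse, more diverse)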

Slide 19

Slide 19 text

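Relationship between diversity and conversion rate (Anderson 2020)
• Overall, users with higher diversity convert more easily.
• In the most active group the gap was as large as 30%.
• When a user's activity level rises, users with higher diversity seem more likely to convert.
(Anderson 2020) Fig. 6 (axis annotations: less diverse, more diverse)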

Slide 20

Slide 20 text

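Relationship between diversity and recommendation algorithms (Anderson 2020)
• organic: songs the user consumed actively, for example by searching for them
• programmed: songs the user consumed passively after an algorithm recommended them based on the user's tendencies
• For each user, one GS-score is computed from organically consumed songs only and another from programmed songs only.
• Songs consumed organically tend to be more diverse than songs consumed through programmed listening.
(Anderson 2020) Fig. 3 (axis annotations: less diverse, more diverse)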

Slide 21

Slide 21 text

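When diversity changes, what changes and how? (Anderson 2020)
• If diversity does not change simply with the passage of time, then what makes it change?
• Log odds ratios are computed between whether diversity increased and whether the share of each consumption source increased.
• When the diversity of the songs a user consumes increases, organic consumption tends to increase and programmed consumption tends to decrease.
(Anderson 2020) Fig. 9 (panels: organic, programmed)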
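For readers unfamiliar with the log odds ratio used in this analysis, a small worked sketch follows. It takes a 2x2 table of users (diversity increased or not, share of a given source increased or not) and returns the log of the odds ratio. The counts are made up for illustration and are not from the paper, and the function assumes every cell is non-zero.

```python
import math

def log_odds_ratio(both, diversity_only, share_only, neither):
    """Log odds ratio of a 2x2 table; all four cells must be non-zero.

    both           : diversity increased and the source's share increased
    diversity_only : diversity increased, share did not
    share_only     : share increased, diversity did not
    neither        : neither increased
    """
    return math.log((both * neither) / (diversity_only * share_only))

# Hypothetical user counts for one source (for example, organic listening).
value = log_odds_ratio(both=60, diversity_only=40, share_only=40, neither=60)
print(value)  # positive: the two increases tend to occur together
```

A value above 0 means a rising share of that source goes together with rising diversity, a value below 0 means the opposite, and 0 means no association.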

Slide 22

Slide 22 text

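Experiment on how recommendation algorithms affect users (A/B testing) (Anderson 2020)
• Personalized recommendation greatly improves play counts, which are a short-term win (an immediate gain). The effect is especially large for specialists (low diversity = easy to recommend to).
• The skip rate often rises along with it, so the two appear to be in a trade-off.
• The skip rate tends to rise more for generalists.
(Anderson 2020) Table 1
Three algorithms were served to randomly assigned users, and play counts and skip rates were measured:
• Popularity: ordered by play count → un-personalized baseline
• Relevance: ordered by closeness between the user vector and the song vectors → simple CF personalized model
• Learned: a NN using user, song, and interaction features → personalized model with more complex learning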
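The difference between the Popularity baseline and the Relevance ranker can be sketched as below. This is only an illustration of the two orderings described on the slide: cosine similarity is assumed as the "closeness" measure, the function names are hypothetical, and the vectors and play counts are toy data. The Learned model is omitted because the slide gives no detail beyond its inputs.

```python
import numpy as np

def rank_by_popularity(play_counts):
    """Un-personalized baseline: song indices ordered by play count, descending."""
    return np.argsort(-np.asarray(play_counts, dtype=float), kind="stable")

def rank_by_relevance(user_vector, song_vectors):
    """Personalized ranking: song indices ordered by cosine similarity to the user."""
    songs = np.asarray(song_vectors, dtype=float)
    user = np.asarray(user_vector, dtype=float)
    scores = songs @ user / (np.linalg.norm(songs, axis=1) * np.linalg.norm(user))
    return np.argsort(-scores, kind="stable")

# Toy catalogue of three songs (hypothetical).
play_counts = [100, 50, 10]
song_vectors = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
user_vector = [0.0, 1.0]

popular_order = rank_by_popularity(play_counts)
relevant_order = rank_by_relevance(user_vector, song_vectors)
print(popular_order, relevant_order)
```

For this user the two rankers return opposite orders, which is the kind of difference an A/B test between them would expose.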

Slide 23

Slide 23 text

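Summary and impressions (Anderson 2020)
Summary
• It is clear that recommendation algorithms need to be designed with diversity in mind.
• Over-optimizing for short-term goals (such as recent play counts) may damage long-term goals (such as the paid conversion rate and churn rate).
Impressions
• The causal relationship between the diversity of the songs a user consumes and the paid conversion rate or churn rate is not clear (I kept thinking this throughout the presentation).
• What this study shows is correlation only.
• As someone who regularly struggles with the trade-off between diversity and short-term gains in a production recommender system, I was very glad to see quantitative evidence that diversity appears strongly related to long-term gains.
• The amount of data is incredible.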

Slide 24

Slide 24 text

Refs
• (Zhang 2020) Le Zhang, Tong Xu, Hengshu Zhu, Chuan Qin, Qingxin Meng, Hui Xiong, and Enhong Chen. Large-Scale Talent Flow Embedding for Company Competitive Analysis. Proceedings of The Web Conference 2020, pp. 2354–2364.
• (Anderson 2020) Ashton Anderson, Lucas Maystre, Ian Anderson, Rishabh Mehrotra, and Mounia Lalmas. Algorithmic Effects on the Diversity of Consumption on Spotify. Proceedings of The Web Conference 2020, pp. 2155–2165.