WWW2020 Paper Introduction / www2020-papers

Slides presented on April 30, 2020 at The Web Conference 2020 Report Meeting by Wantedly (https://connpass.com/event/174856/).

The talk outlined the following two papers.

- Zhang, Le and Xu, Tong and Zhu, Hengshu and Qin, Chuan and Meng, Qingxin and Xiong, Hui and Chen, Enhong. Large-Scale Talent Flow Embedding for Company Competitive Analysis. Proceedings of The Web Conference 2020, pp. 2354–2364.
- https://dl.acm.org/doi/abs/10.1145/3366423.3380299
- Anderson, Ashton and Maystre, Lucas and Anderson, Ian and Mehrotra, Rishabh and Lalmas, Mounia. Algorithmic Effects on the Diversity of Consumption on Spotify. Proceedings of The Web Conference 2020, pp. 2155–2165.
- https://dl.acm.org/doi/10.1145/3366423.3380281

Yuya Matsumura

April 30, 2020

Transcript

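  1. ©2020 Wantedly, Inc. WWW2020 Paper Introduction, The Web Conference 2020 Report Meeting by Wantedly

    30 Apr. 2020 - Yuya Matsumura @yu-ya4
    Analysis of competitiveness between companies based on talent flow
    Analysis of the diversity of recommendations on Spotify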
  2. Self-Introduction

    ✓ Yuya Matsumura
    ✓ Wantedly, Inc. Recommendation Team
    ✓ Data Science, Team Lead
    ✓ Interested in Information Retrieval, Machine Learning
    @yu-ya4 @yu__ya4
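  3. Papers covered in this talk

    An overview of two WWW2020 papers that caught @yu-ya4's attention.
    1. (Zhang 2020) Large-Scale Talent Flow Embedding for Company Competitive Analysis
    • Represents companies as distributed representations by focusing on the flow of talent, and analyzes "competition" between companies
    2. (Anderson 2020) Algorithmic Effects on the Diversity of Consumption on Spotify
    • Analyzes the effect of Spotify's recommender system on "diversity", and the relationship between "diversity" and the service's key metrics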
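  5. Motivation (Zhang 2020)

    Proposal of a new framework for research on competition between companies that uses the analysis of talent flow. (Zhang 2020) Fig. 1; (Zhang 2020) presentation slides, p. 6
    • Previous studies mostly relied on statistical data about companies or on heuristic methods.
    • Students who graduated in 2006-2010 held an average of 2.85 jobs in their first 5 years.
    • This is twice the figure for the 1986-1990 period.
    • A large amount of such data is publicly available on the Web (e.g. LinkedIn).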
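  6. Company Competitiveness (Zhang 2020)

    Competitiveness of company u toward company v
    • Expressed using personalized PageRank (PPR) proximity
    • PPR computes the relevance of other target nodes as seen from a specific node on the graph
    • Considered high when talent from u moves easily to v and talent from u makes up a large share of the talent moving into v ((Zhang 2020) presentation slides, p. 10)
    Talent Flow Network (TFN)
    • Node: company
    • Edge: flow of talent, directed and weighted by the number of talent moves
    • Two PPR values are used: PPR on the original TFN, and PPR on the TFN with edge directions reversed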
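To make the two PPR values on this slide concrete, here is a minimal sketch of personalized PageRank by power iteration on a tiny directed, weighted graph, run once on the original edges and once on the reversed edges. The company names, flow counts, restart probability `alpha`, and function names are illustrative assumptions, not values from the paper. How the paper combines the two values is not captured in this transcript, so the sketch only computes them.

```python
def personalized_pagerank(edges, source, alpha=0.15, iters=200):
    """Power-iteration PPR on a weighted directed graph.

    edges: dict mapping (src, dst) -> weight (number of talent moves).
    alpha: restart probability (an assumed value, not taken from the paper).
    """
    nodes = sorted({n for edge in edges for n in edge})
    out_weight = {n: 0.0 for n in nodes}
    for (src, _dst), w in edges.items():
        out_weight[src] += w

    rank = {n: 1.0 if n == source else 0.0 for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        # Spread each node's mass along its outgoing edges, proportionally to weight.
        for (src, dst), w in edges.items():
            nxt[dst] += (1.0 - alpha) * rank[src] * w / out_weight[src]
        # Restart mass, plus mass stuck on nodes with no outgoing edges, returns to the source.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        nxt[source] += alpha + (1.0 - alpha) * dangling
        rank = nxt
    return rank


def reverse_edges(edges):
    """Flip every edge direction, keeping the weights."""
    return {(dst, src): w for (src, dst), w in edges.items()}


# Hypothetical talent flows between three companies.
flows = {("A", "B"): 30, ("A", "C"): 10, ("B", "A"): 5, ("C", "B"): 20}
forward = personalized_pagerank(flows, "A")                  # where A's talent tends to go
backward = personalized_pagerank(reverse_edges(flows), "B")  # where B's incoming talent comes from
```

Here `forward["B"]` is larger than `forward["C"]` because most of A's outgoing talent goes to B, directly or via C.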
  7. Goal: building the Talent Flow Embedding (Zhang 2020)

    Learn two attraction vectors per company from the Talent Flow Network, based on company competitiveness. (Zhang 2020) Fig. 3
    • S_u: source vector of u, representing the attraction from the talent of company u to other companies
    • T_u: target vector of u, representing the attraction from the talent of other companies to company u
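  8. Method (Zhang 2020)

    Fit the distribution in the vector space to the original distribution.
    • Distribution in the vector space: defined using the distance from u to v and the distance from v to u
    • Minimize the Kullback-Leibler (KL) divergence
    • Original distribution: approximated with Monte Carlo + random walk
    • There is also a Multi-Task version that uses a different graph for each position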
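The fitting idea on this slide can be sketched as follows: estimate the original distribution with random walks, define a distribution from distances between embedding vectors, and measure the gap with KL divergence. The slide's formulas are missing from this transcript, so the softmax over negative squared distances, the restart probability, and all names and numbers below are assumptions made for illustration, not the paper's definitions. The sketch only evaluates the loss and leaves out the optimization step.

```python
import math
import random


def monte_carlo_ppr(out_edges, source, alpha=0.15, walks=20000, seed=0):
    """Estimate PPR from `source` as the distribution of random-walk end points.

    out_edges: dict node -> list of (neighbor, weight).
    Each walk stops with probability alpha before every step (assumed value).
    """
    rng = random.Random(seed)
    counts = {}
    for _ in range(walks):
        node = source
        while rng.random() > alpha:
            nbrs = out_edges.get(node)
            if not nbrs:
                node = source  # dead end: jump back to the source
                continue
            targets, weights = zip(*nbrs)
            node = rng.choices(targets, weights=weights)[0]
        counts[node] = counts.get(node, 0) + 1
    return {n: c / walks for n, c in counts.items()}


def embedding_distribution(source_vec, target_vecs):
    """Softmax over negative squared distances from one source vector (assumed form)."""
    scores = {v: -sum((a - b) ** 2 for a, b in zip(source_vec, t))
              for v, t in target_vecs.items()}
    top = max(scores.values())
    exp = {v: math.exp(s - top) for v, s in scores.items()}
    total = sum(exp.values())
    return {v: e / total for v, e in exp.items()}


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), summed over the support of p."""
    return sum(pv * math.log(pv / max(q.get(k, 0.0), eps))
               for k, pv in p.items() if pv > 0)


# Hypothetical graph and 2-d embeddings.
graph = {"A": [("B", 30), ("C", 10)], "B": [("A", 5)], "C": [("B", 20)]}
p = monte_carlo_ppr(graph, "A")
targets = {"A": (0.0, 0.0), "B": (0.5, 0.0), "C": (2.0, 0.0)}
q = embedding_distribution((0.0, 0.0), targets)
loss = kl_divergence(p, q)  # the quantity a training loop would try to reduce
```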
  9. Case Study 1 (Zhang 2020)

    • Google and Facebook are competitors of each other.
    • When the data is split by position (job type), the competitors also change.
  10. Case Study 2 (Zhang 2020)

    • Using the source vector (attraction from the company side), companies separate by industry.
    • Using the target vector (attraction from the talent side), companies separate by country.
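  11. Summary and thoughts (Zhang 2020)

    Summary
    • Defined and analyzed "competitiveness" between companies by focusing on the flow of talent.
    • Proposed TFE, which learns distributed representations of companies based on the defined "competitiveness".
    • Ran experiments evaluating TFE's performance and found insights through case studies.
    Thoughts
    • As someone who works on the Wantedly Visit service, I found this very interesting and would like to try it on our own service.
    • It looks useful both for recommending job postings to users and for recommending users that companies may want to scout.
    • It is interesting that the distribution of companies differs between company preferences and user preferences.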
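  13. Motivation (Anderson 2020)

    The authors want to know what relationship exists between algorithmic recommendation and the content users consume. ((Anderson 2020) presentation slides)
    • If recommendations are based on a user's past behavior and preferences, won't they keep recommending similar items and reduce diversity?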
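  14. Proposal of GS-score, a measure for expressing "diversity" quantitatively (Anderson 2020)

    • Song embeddings are built with word2vec (CBOW), assuming that songs that often appear in the same playlist are similar.
    • The GS-score is the sum of inner products, weighted by play count, between the center of mass of the song vectors a user consumed and each of those song vectors.
    • Small GS-score -> the consumed songs are far apart -> Generalist
    • Large GS-score -> the consumed songs are close together -> Specialist
    Center of mass and user consumption diversity measure: formulas from the (Anderson 2020) presentation slides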
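A minimal sketch of the GS-score as described on this slide. The slide only says "play-count-weighted sum of inner products with the center of mass". This sketch additionally assumes that song vectors are unit-normalized and that the sum is divided by the total play count, so that the score is a weighted mean cosine similarity. The function names and toy vectors are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def gs_score(song_vectors, play_counts):
    """Play-count-weighted mean cosine similarity between each song and the center of mass.

    Raises ZeroDivisionError if the weighted center of mass is the zero vector.
    """
    total = sum(play_counts)
    unit = [normalize(v) for v in song_vectors]
    dim = len(unit[0])
    # Weighted center of mass of the (unit-length) song vectors.
    center = [sum(w * v[d] for v, w in zip(unit, play_counts)) / total
              for d in range(dim)]
    center_unit = normalize(center)
    return sum(w * sum(a * b for a, b in zip(v, center_unit))
               for v, w in zip(unit, play_counts)) / total


# Toy 2-d embeddings: one listener stays in one region, the other spans two.
specialist = gs_score([[1.0, 0.0], [0.9, 0.1]], [10, 5])
generalist = gs_score([[1.0, 0.0], [0.0, 1.0]], [1, 1])
```

With these toy inputs the specialist's score is close to 1, while the generalist's two orthogonal songs give a lower score.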
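  15. Distribution of diversity (Anderson 2020)

    • Users are split by activity level and the distribution of diversity (GS-score) is analyzed for each group.
    • The least active group skews toward low diversity, but this is probably because they consumed few songs in the first place.
    • Activity level and diversity do not appear to be correlated.
    (Anderson 2020) Fig. 2 (axis: less diverse to more diverse)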
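  16. Change in diversity over time (Anderson 2020)

    • Visualizes the GS-score of the same users in July 2018 and July 2019.
    • Overall there is almost no change.
    • In particular, users with extreme GS-scores (close to 0 or 1) do not change.
    • The diversity of the songs a user consumes does not seem to change over time.
    (Anderson 2020) Fig. 7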
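  17. Relationship between diversity and churn rate (Anderson 2020)

    • Overall, users with lower diversity are more likely to churn.
    • In the least active group the gap was as large as 25%.
    • Even when a user's activity level changes, it may have no bearing on the churn rate of the high-diversity group.
    (Anderson 2020) Fig. 5 (axis: less diverse to more diverse)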
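  18. Relationship between diversity and conversion rate (Anderson 2020)

    • Overall, users with higher diversity are more likely to convert.
    • In the most active group the gap was as large as 30%.
    • When a user's activity level rises, users with higher diversity seem more likely to convert.
    (Anderson 2020) Fig. 6 (axis: less diverse to more diverse)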
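  19. Relationship between diversity and the recommendation algorithm (Anderson 2020)

    • organic: songs the user consumed actively, for example by searching for them
    • programmed: songs the user consumed passively after an algorithm recommended them based on the user's tendencies
    • For each user, one GS-score is computed from organically consumed songs only and another from programmed songs only.
    • Songs consumed organically tend to be more diverse than songs consumed through programmed listening.
    (Anderson 2020) Fig. 3 (axes: less diverse to more diverse)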
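  20. When diversity changes, what changes and how? (Anderson 2020)

    • If diversity does not change over time, then how does it change when it does?
    • Log odds ratios are computed between whether diversity increased and whether the share of each source of consumption increased.
    • When the diversity of a user's consumption increased, organic consumption tended to increase and programmed consumption tended to decrease.
    (Anderson 2020) Fig. 9 (organic, programmed)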
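The log odds ratio mentioned on this slide can be illustrated with a 2x2 table that crosses "diversity increased (yes/no)" with "share of a consumption source increased (yes/no)". The counts below are made-up numbers chosen only to show the sign convention, not data from the paper, and the function name is hypothetical.

```python
import math


def log_odds_ratio(both, only_diversity, only_share, neither):
    """Log odds ratio of a 2x2 table.

    both:           diversity increased and source share increased
    only_diversity: diversity increased, source share did not
    only_share:     source share increased, diversity did not
    neither:        neither increased
    """
    return math.log((both * neither) / (only_diversity * only_share))


# Hypothetical user counts for each consumption source.
organic = log_odds_ratio(60, 40, 40, 60)      # positive: the two increases go together
programmed = log_odds_ratio(40, 60, 60, 40)   # negative: the two increases oppose each other
```

A positive value means that users whose share of that source grew were more likely to also grow in diversity. A negative value means the opposite.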
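  21. Experiment on the effect of recommendation algorithms on users (A/B testing) (Anderson 2020)

    Three algorithms were served to randomly assigned users, and stream counts and skip rates were measured. (Anderson 2020) Table 1
    Popularity: ordered by play count → un-personalized baseline
    Relevance: ordered by closeness between the user vector and the song vector → simple CF personalized model
    Learned: NN using user, song, and interaction features → personalized model with more complex learning
    • Personalized recommendation greatly improves stream counts, which are short-term wins (the immediate benefit). The effect is especially large for specialists (low diversity = easy to recommend to).
    • At the same time, the skip rate often rises as well, so the two appear to be in a trade-off.
    • The skip rate tends to rise more for generalists.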
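The difference between the Popularity and Relevance rankers can be sketched in a few lines. The slide does not say how "closeness" is measured, so cosine similarity is an assumption here. The song names, play counts, and vectors are also hypothetical. The Learned model is left out because the transcript gives no details beyond its inputs.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def rank_by_popularity(play_counts):
    """Un-personalized baseline: the same order for every user."""
    return sorted(play_counts, key=play_counts.get, reverse=True)


def rank_by_relevance(user_vec, song_vecs):
    """Personalized ranking: songs closest to the user's vector come first."""
    return sorted(song_vecs, key=lambda s: cosine(user_vec, song_vecs[s]), reverse=True)


# Hypothetical catalogue and one user whose taste sits near "jazz".
play_counts = {"hit": 1000, "indie": 50, "jazz": 10}
song_vecs = {"hit": (1.0, 0.0), "indie": (0.6, 0.8), "jazz": (0.0, 1.0)}
user_vec = (0.1, 0.9)

popular_order = rank_by_popularity(play_counts)
relevant_order = rank_by_relevance(user_vec, song_vecs)
```

For this user the two rankers return opposite orders, which is the kind of difference the A/B test exposes users to.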
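  22. Summary and thoughts (Anderson 2020)

    Summary
    • It is clear that recommendation algorithms need to take diversity into account.
    • Over-optimizing for short-term goals (such as recent stream counts) may harm long-term goals (such as paid conversion rate and churn rate).
    Thoughts
    • The causal relationship between the diversity of a user's consumption and paid conversion rate or churn rate is not clear (something I kept thinking throughout the talk).
    • What this study shows is strictly a correlation.
    • As someone who regularly struggles with the trade-off between diversity and short-term gains in a production recommender system, I was very glad to see quantitative evidence that diversity and long-term gains are likely to be strongly related.
    • The amount of data is staggering.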
  23. Refs

    • (Zhang 2020) Zhang, Le and Xu, Tong and Zhu, Hengshu and Qin, Chuan and Meng, Qingxin and Xiong, Hui and Chen, Enhong. Large-Scale Talent Flow Embedding for Company Competitive Analysis. Proceedings of The Web Conference 2020, pp. 2354–2364.
    • (Anderson 2020) Anderson, Ashton and Maystre, Lucas and Anderson, Ian and Mehrotra, Rishabh and Lalmas, Mounia. Algorithmic Effects on the Diversity of Consumption on Spotify. Proceedings of The Web Conference 2020, pp. 2155–2165.