
WWW2020 Paper Introduction / www2020-papers

Slides presented on April 30, 2020 at "The Web Conference 2020 Participation Report Meeting by Wantedly" (https://connpass.com/event/174856/).

The talk gave an overview of the following two papers.

- Zhang, Le and Xu, Tong and Zhu, Hengshu and Qin, Chuan and Meng, Qingxin and Xiong, Hui and Chen, Enhong. Large-Scale Talent Flow Embedding for Company Competitive Analysis. Proceedings of The Web Conference 2020 P. 2354–2364.
- https://dl.acm.org/doi/abs/10.1145/3366423.3380299
- Anderson, Ashton and Maystre, Lucas and Anderson, Ian and Mehrotra, Rishabh and Lalmas, Mounia. Algorithmic Effects on the Diversity of Consumption on Spotify. Proceedings of The Web Conference 2020 P. 2155–2165.
- https://dl.acm.org/doi/10.1145/3366423.3380281

Yuya Matsumura

April 30, 2020

Transcript

  1. ©2020 Wantedly, Inc.
    WWW2020 Paper Introduction
    The Web Conference 2020 Participation Report Meeting by Wantedly
    30.Apr.2020 - Yuya Matsumura @yu-ya4
    Analysis of inter-company competitiveness based on Talent Flow
    Analysis of recommendation diversity on Spotify

  2. Self-Introduction
    ✓ Yuya Matsumura
    ✓ Wantedly, Inc. Recommendation Team
    ✓ Data Science, Team Lead
    ✓ Interested in Information Retrieval, Machine Learning
    @yu-ya4
    @yu__ya4

  3. Papers covered in this talk
    An overview of two papers from WWW2020 that caught @yu-ya4's interest
    1. (Zhang 2020) Large-Scale Talent Flow Embedding for Company Competitive Analysis
    • Focuses on talent flow to represent companies as embeddings, and analyzes "competition" between companies.
    2. (Anderson 2020) Algorithmic Effects on the Diversity of Consumption on Spotify
    • Analyzes how Spotify's recommender system affects "diversity", and how "diversity" relates to the service's key metrics.

  4. Papers covered in this talk
    An overview of two papers from WWW2020 that caught @yu-ya4's interest
    1. (Zhang 2020) Large-Scale Talent Flow Embedding for Company Competitive Analysis
    • Focuses on talent flow to represent companies as embeddings, and analyzes "competition" between companies.
    2. (Anderson 2020) Algorithmic Effects on the Diversity of Consumption on Spotify
    • Analyzes how Spotify's recommender system affects "diversity", and how "diversity" relates to the service's key metrics.

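  5. Motivation
    (Zhang 2020)
    Proposes a new framework for studying competition between companies through the analysis of talent flow.
    • Prior studies mostly relied on statistical data about companies or on heuristic methods.
    • Students who graduated in 2006-2010 held an average of 2.85 jobs in their first five years.
    • This is twice the figure for the 1986-1990 period.
    • A great deal of this data is publicly available on the Web (e.g., LinkedIn).
    (Zhang 2020) Fig. 1
    (Zhang 2020) presentation slides, p. 6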

  6. Company Competitiveness
    (Zhang 2020)
    Competitiveness of company u against company v
    • Expressed using personalized PageRank (PPR) proximity.
    • PPR computes the relevance of other target nodes as seen from a specific node on the graph.
    The competitiveness is considered high when u's talent readily moves to v and u's talent
    accounts for a large share of the talent moving into v.
    Talent Flow Network (TFN)
    • Node: company
    • Edge: talent flow
    • Directed, and weighted by the number of times talent moved.
    : PPR on the original TFN
    : PPR on the TFN with edge directions reversed
    (Zhang 2020) presentation slides, p. 10
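
The PPR idea on this slide can be illustrated with a small power-iteration sketch. This is not the paper's implementation: the company names, move counts, restart probability `alpha`, and the dangling-node handling are all assumptions made for illustration, and the way the paper combines the two PPR values into one competitiveness score is not reproduced here.

```python
def personalized_pagerank(weights, source, alpha=0.15, iters=100):
    """Power iteration for personalized PageRank on a weighted directed graph.

    weights[u][v] is the number of talent moves from u to v (hypothetical data).
    alpha is the restart probability (an assumed value, not taken from the paper).
    """
    nodes = sorted(set(weights) | {v for outs in weights.values() for v in outs})
    rank = {n: 1.0 if n == source else 0.0 for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            outs = weights.get(u, {})
            total = sum(outs.values())
            if total == 0:
                # Dangling node: return its mass to the source (an assumption).
                nxt[source] += (1 - alpha) * rank[u]
                continue
            for v, w in outs.items():
                nxt[v] += (1 - alpha) * rank[u] * w / total
        # Restart: with probability alpha the walk jumps back to the source.
        nxt[source] += alpha
        rank = nxt
    return rank


def reverse(weights):
    """Return the same graph with every edge direction flipped."""
    rev = {}
    for u, outs in weights.items():
        for v, w in outs.items():
            rev.setdefault(v, {})[u] = w
    return rev


# Hypothetical talent flow network: flows["A"]["B"] = 8 means 8 moves from A to B.
flows = {"A": {"B": 8, "C": 2}, "B": {"A": 1, "C": 1}, "C": {"A": 1}}
ppr_original = personalized_pagerank(flows, "A")
ppr_reversed = personalized_pagerank(reverse(flows), "A")
```

In this toy graph most of A's talent moves to B, so `ppr_original["B"]` ends up larger than `ppr_original["C"]`.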

  7. Goal: Building the Talent Flow Embedding
    (Zhang 2020)
    Learn two attraction vectors from the Talent Flow Network, based on company competitiveness.
    • S_u: source vector of u
    • Represents the attraction from company u's talent toward other companies.
    • T_u: target vector of u
    • Represents the attraction from other companies' talent toward company u.
    (Zhang 2020) Fig. 3

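  8. Method
    (Zhang 2020)
    Fit the distribution in the vector space to the original distribution.
    • Distribution in the vector space
    • : distance from u to v
    • : distance from v to u

    • Minimize the Kullback-Leibler (KL) divergence.

    • Original distribution
    • Approximated with Monte Carlo + random walk.

    • There is also a multi-task version that uses a different graph for each position.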
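
A rough sketch of what "fitting one distribution to another by minimizing KL divergence" means is shown below. The slide's formulas did not survive extraction, so the softmax over inner products used here for the vector-space distribution is an assumption borrowed from common proximity-preserving embedding methods, not the paper's definition. All vectors and probabilities are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical source vector of company u and target vectors of three other companies.
source_u = [1.0, 0.5]
targets = [[0.9, 0.4], [0.1, -0.3], [-0.5, 0.2]]

# Stand-in for the original (graph-based) distribution from u to each company.
original_dist = [0.6, 0.3, 0.1]

# Assumed form of the distribution in the vector space.
model_dist = softmax([dot(source_u, t) for t in targets])

# Training would adjust the vectors to make this loss smaller.
loss = kl_divergence(original_dist, model_dist)
```

The loss is zero only when the two distributions match, which is what makes it usable as a fitting objective.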

  9. Experiment 1: Link Prediction
    (Zhang 2020)
    (Zhang 2020) Fig. 4

  10. Case Study 1
    (Zhang 2020)
    • Google and Facebook are each other's competitors.
    • When companies are split by position (job type), the set of competitors also changes.

  11. Case Study 2
    (Zhang 2020)
    • With the source vectors (attraction from the company side), companies separate by industry.
    • With the target vectors (attraction from the talent side), companies separate by country.

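  12. Summary and Thoughts
    (Zhang 2020)
    Summary
    • Defined and analyzed "competitiveness" between companies by focusing on talent flow.
    • Proposed TFE, which learns company embeddings based on the defined "competitiveness".
    • Ran experiments evaluating TFE's performance and surfaced insights through case studies.
    Thoughts
    • As someone who works on the Wantedly Visit service, I found this very interesting and would like to try it on our own service.
    • It looks useful both for recommending job postings to users and for recommending users to companies that want to scout them.
    • It is interesting that the distribution of companies differs between company preferences and user preferences.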

  13. Papers covered in this talk
    An overview of two papers from WWW2020 that caught @yu-ya4's interest
    1. (Zhang 2020) Large-Scale Talent Flow Embedding for Company Competitive Analysis
    • Focuses on talent flow to represent companies as embeddings, and analyzes "competition" between companies.
    2. (Anderson 2020) Algorithmic Effects on the Diversity of Consumption on Spotify
    • Analyzes how Spotify's recommender system affects "diversity", and how "diversity" relates to the service's key metrics.

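  14. Motivation
    (Anderson 2020)
    The goal is to understand how algorithmic recommendation relates to the content that users consume.
    • If recommendations are based on a user's past behavior and preferences, they will keep suggesting similar items. Does that cause diversity to be lost?
    (Anderson 2020) presentation slides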

  15. Proposal of GS-score, a measure that quantifies "diversity"
    (Anderson 2020)
    • Song embeddings are built with word2vec (CBOW), on the assumption that songs that often appear in the same playlist are similar.
    • GS-score is the weighted sum (weights = play counts) of the inner products between the center of mass of the song vectors a user consumed and every song vector that user consumed.
    • Small GS-score -> the consumed songs are far apart -> Generalist
    • Large GS-score -> the consumed songs are close together -> Specialist
    Center of mass:
    User consumption diversity measure:
    (Anderson 2020) presentation slides
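
The GS-score description above can be sketched as follows. The slide's formulas were lost in extraction, so two details here are assumptions rather than confirmed definitions: the similarity is taken as cosine similarity (a normalized inner product), and the weighted sum is divided by the total play count so that the score stays in a fixed range. The song vectors and play counts are hypothetical.

```python
import math


def center_of_mass(vectors, weights):
    """Play-count-weighted mean of a user's song vectors."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[d] for v, w in zip(vectors, weights)) / total for d in range(dim)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def gs_score(vectors, weights):
    """Weighted average similarity between each song vector and the center of mass."""
    mu = center_of_mass(vectors, weights)
    total = sum(weights)
    return sum(w * cosine(v, mu) for v, w in zip(vectors, weights)) / total


# Hypothetical users: each consumed two songs, five plays each.
specialist_songs = [[1.0, 0.0], [0.9, 0.1]]   # songs close together
generalist_songs = [[1.0, 0.0], [0.0, 1.0]]   # songs far apart
plays = [5, 5]

specialist_gs = gs_score(specialist_songs, plays)
generalist_gs = gs_score(generalist_songs, plays)
```

With these toy vectors the specialist's score comes out higher than the generalist's, matching the interpretation on the slide.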

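  16. Distribution of diversity
    (Anderson 2020)
    • Users are split by activity level, and the distribution of diversity (GS-score) is analyzed for each group.
    • The least active group skews toward low diversity, but this is probably because those users have consumed few songs to begin with.
    • Activity level and diversity do not appear to be correlated.
    (Anderson 2020) Fig. 2 (axis labels: less diverse, more diverse)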

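  17. Change in diversity over time
    (Anderson 2020)
    • Visualizes the GS-score of the same users in July 2018 and July 2019.
    • Overall, there is almost no change.
    • In particular, users with extreme GS-scores (close to 0 or 1) do not change.
    • The diversity of the songs a user consumes does not seem to change over time.
    (Anderson 2020) Fig. 7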

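  18. Relationship between diversity and churn rate
    (Anderson 2020)
    • Overall, users with lower diversity are more likely to churn.
    • In the least active group, the gap was as large as 25%.
    • Even when a user's activity level changes, the churn rate of the high-diversity group may be unaffected.
    (Anderson 2020) Fig. 5 (axis labels: less diverse, more diverse)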

  19. Relationship between diversity and conversion rate
    (Anderson 2020)
    • Overall, users with higher diversity are more likely to convert.
    • In the most active group, the gap was as large as 30%.
    • When a user's activity level rises, users with high diversity seem more likely to convert.
    (Anderson 2020) Fig. 6 (axis labels: less diverse, more diverse)

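  20. Relationship between diversity and recommendation algorithms
    (Anderson 2020)
    • organic: songs the user consumed actively, for example by searching for them.
    • programmed: songs the user consumed passively after an algorithm recommended them based on the user's tendencies.
    • For each user, one GS-score is computed using only the organically consumed songs and another using only the programmed songs.
    • The songs users consume organically tend to be more diverse than the songs they consume through programmed listening.
    (Anderson 2020) Fig. 3 (axis labels: less diverse, more diverse)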

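  21. When diversity changes, what changes and how?
    (Anderson 2020)
    • If diversity does not change over time, then how does it change when it does?
    • Log odds ratios are computed between whether diversity increased and whether the share of each consumption source increased.
    • When the diversity of the songs a user consumes increases, organic consumption tends to increase and programmed consumption tends to decrease.
    (Anderson 2020) Fig. 9 (labels: organic, programmed)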
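
For readers unfamiliar with the log odds ratio used on this slide, a minimal sketch follows. The 2x2 layout and the user counts are hypothetical and only illustrate how the sign of the value is read. The paper's exact procedure may differ.

```python
import math


def log_odds_ratio(both, diversity_only, share_only, neither):
    """Log odds ratio for a 2x2 table of user counts.

    both:           diversity increased and the source's share increased
    diversity_only: diversity increased, share did not
    share_only:     share increased, diversity did not
    neither:        neither increased
    """
    return math.log((both * neither) / (diversity_only * share_only))


# Hypothetical counts for organic consumption: the two increases tend to co-occur.
organic = log_odds_ratio(both=30, diversity_only=20, share_only=20, neither=30)

# Hypothetical counts for programmed consumption: the two increases rarely co-occur.
programmed = log_odds_ratio(both=20, diversity_only=30, share_only=30, neither=20)
```

A positive value means the two increases go together, a negative value means they move in opposite directions, and zero means no association.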

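  22. Experiment on how recommendation algorithms affect users (A/B testing)
    (Anderson 2020)
    Three algorithms were randomly assigned to users, and play counts and skip rate were measured.
    Popularity: ranked by play count
    → unpersonalized baseline
    Relevance: ranked by closeness between the user vector and the song vector
    → simple CF personalized model
    Learned: a neural network using user, song, and interaction features
    → personalized model with more complex learning
    • Personalized recommendation greatly improves play counts, which is a short-term win (an immediate gain). The effect is especially large for specialists (low diversity = easy to recommend to).
    • The skip rate often rises at the same time, so the two appear to be in a trade-off.
    • The skip rate tends to rise more for generalists.
    (Anderson 2020) Table 1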
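
The difference between the Popularity and Relevance rankers can be sketched as below. This only illustrates the two ranking rules named on the slide. The song names, play counts, vectors, and the use of cosine similarity as "closeness" are assumptions, and the Learned model is not covered.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_popularity(play_counts):
    """Unpersonalized baseline: the same order for every user."""
    return sorted(play_counts, key=lambda song: play_counts[song], reverse=True)


def rank_by_relevance(user_vec, song_vecs):
    """Personalized: songs closest to the user's vector come first."""
    return sorted(song_vecs, key=lambda song: cosine(user_vec, song_vecs[song]), reverse=True)


# Hypothetical catalogue and user.
play_counts = {"song_a": 900, "song_b": 500, "song_c": 50}
song_vecs = {"song_a": [1.0, 0.0], "song_b": [0.7, 0.7], "song_c": [0.0, 1.0]}
user_vec = [0.1, 1.0]

popularity_order = rank_by_popularity(play_counts)
relevance_order = rank_by_relevance(user_vec, song_vecs)
```

For this user the two rules produce opposite orders, which is the kind of difference the A/B test compares through play counts and skip rate.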

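  23. Summary and Thoughts
    (Anderson 2020)
    Summary
    • It is clear that recommendation algorithms need to take diversity into account.
    • Over-optimizing for short-term objectives (such as recent play counts) may harm long-term objectives (such as paid conversion rate and churn rate).
    Thoughts
    • The causal relationship between the diversity of the songs a user consumes and paid conversion rate or churn rate is not clear (this was on my mind throughout the talk).
    • What this study shows is only a correlation.
    • As someone who regularly struggles with the trade-off between diversity and short-term gain in a production recommender system, I was very glad to see quantitative evidence that diversity is likely to be strongly related to long-term gain.
    • The amount of data is staggering.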

  24. Refs
    • (Zhang 2020) Zhang, Le and Xu, Tong and Zhu, Hengshu and Qin, Chuan and Meng, Qingxin and Xiong, Hui and Chen, Enhong. Large-Scale Talent Flow Embedding for Company Competitive Analysis. Proceedings of The Web Conference 2020 P. 2354–2364.
    • (Anderson 2020) Anderson, Ashton and Maystre, Lucas and Anderson, Ian and Mehrotra, Rishabh and Lalmas, Mounia. Algorithmic Effects on the Diversity of Consumption on Spotify. Proceedings of The Web Conference 2020 P. 2155–2165.