Review: "Recommending Investors for Crowdfunding Projects"
yag_ays
July 09, 2014
Research
http://yagays.github.io/blog/2014/07/09/www2014review-kickstarter/
Transcript

Recommending Investors for Crowdfunding Projects
WWW 2014
Jisun An, Daniele Quercia, Jon Crowcroft
Paper review by @yag_ays

Overview of the paper
• "Recommending Investors for Crowdfunding Projects" [Jisun+ 2014]
• WWW 2014 (Seoul, Korea)
• Work from Jisun An's internship at Yahoo Labs in Barcelona
• Final goal: matching Kickstarter project creators with investors
• Much of the paper is devoted to analyzing Kickstarter projects and investors
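
What is Kickstarter?
• Crowdfunding
  • Total funds raised in 2012: $320 million
  • c.f. funding from investors and venture capital
• Project creators set a funding goal and solicit funds
• Investors receive rewards according to the amount they pledge
  • e.g. one item for a $100 pledge, five items for a $300 pledge
https://www.kickstarter.com/help/style_guide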

Kickstarter success stories
• Oculus Rift
  • 9,522 investors / $2,437,429
• Memoto (Narrative Clip)
  • 2,871 investors / $550,189
• Little Witch Academia 2
  • 7,938 investors / $625,518
https://www.kickstarter.com/projects/1523379957/oculus-rift-step-into-the-game
https://www.kickstarter.com/projects/martinkallstrom/memoto-lifelogging-camera
https://www.kickstarter.com/projects/1311401276/little-witch-academia-2
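
What makes Kickstarter special: nobody wants a failed investment, but...
• All or Nothing
  • If the funding goal is not reached, the project fails and all funds are returned
  • Whether the campaign succeeds or fails, investors do not lose money
• The social side
  • Many early pledges come from friends (estimated at 20-40%)
  • A project that cannot gather enough investors is likely to fail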
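
The all-or-nothing rule described above can be sketched in a few lines. This is a minimal illustration, not Kickstarter's actual settlement logic: the function name `settle_campaign` and the sample amounts are hypothetical, and a campaign is assumed to count as funded once the pledged total meets or exceeds the goal.

```python
def settle_campaign(goal, pledges):
    """Settle a campaign under an all-or-nothing rule (illustrative only).

    Returns (funded, amount_collected, amount_refunded).
    Assumption: reaching the goal exactly counts as success.
    """
    total = sum(pledges)
    if total >= goal:
        # Goal reached: the creator collects everything, nothing is refunded.
        return True, total, 0
    # Goal missed: the project fails and every pledge is returned in full.
    return False, 0, total


# Hypothetical campaigns with a 1,000-dollar goal.
print(settle_campaign(1000, [400, 700]))
print(settle_campaign(1000, [400, 500]))
```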
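
Flow of the paper
• Formulate hypotheses about investor behavior on Kickstarter
• Test the hypotheses using data from Kickstarter and Twitter
• Build a model that automatically matches project creators with investors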

Kickstarter data collection and analysis
Dataset and Pledging Behavior

Dataset
• Crawled from Kickstarter
  • Projects registered between July and October 2013
  • US-based projects only
  • 1,149 projects / 78,460 investors in total
• Crawled from Twitter
  • Only tweets that mention a project
  • 71,315 tweets collected in total

Composition of investors
• 78,460 investors in total
• 51% made fewer than 4 pledges
  • "Occasional Investors"
• 11% made 32 or more pledges
  • "Frequent Investors"
[Figure: occasional investors vs. frequent investors]
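
As a rough illustration of the two groups, the sketch below labels investors by pledge count using the thresholds from the slide (fewer than 4 pledges, 32 or more pledges). The function names, the `"other"` label for everyone in between, and the sample data are assumptions made for this example only.

```python
from collections import Counter

OCCASIONAL_BELOW = 4   # fewer than 4 pledges -> occasional investor
FREQUENT_FROM = 32     # 32 or more pledges   -> frequent investor


def investor_type(n_pledges):
    """Label one investor by the number of pledges made in the observation window."""
    if n_pledges < OCCASIONAL_BELOW:
        return "occasional"
    if n_pledges >= FREQUENT_FROM:
        return "frequent"
    return "other"  # hypothetical label for investors between the two thresholds


def composition(pledge_counts):
    """Return the share of each investor type, given {investor_id: n_pledges}."""
    counts = Counter(investor_type(n) for n in pledge_counts.values())
    total = sum(counts.values())
    if total == 0:
        return {"occasional": 0.0, "other": 0.0, "frequent": 0.0}
    return {kind: counts[kind] / total for kind in ("occasional", "other", "frequent")}


# Hypothetical pledge counts for four investors.
print(composition({"a": 1, "b": 3, "c": 10, "d": 40}))
```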
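
Breakdown of investors by project category
• Categories such as Music and Dance attract many one-off pledges
• Games attracts many frequent investors → e.g. fundraising for large-scale game development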
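
Hypotheses about investor behavior
• Frequent investors tend to pledge to projects with the following properties
  • Information is updated frequently
  • The project creator answers investors' questions
  • The rewards are good
• Projects with a high funding goal tend to attract frequent investors
• Local projects tend to attract occasional investors
• Projects that raise funds quickly tend to attract frequent investors
• Frequent investors tend to pledge to projects that match their own interests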
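
Project features examined
• Number of project updates
• Number of comments by the project creator
• Number of reward levels
• Whether the project has a website
• Funding goal ($)
• Geographic dispersion
• Project [...]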
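
Breakdown of investors by feature
[Figure panels: number of project updates, number of creator comments, number of reward levels]
As the value of each feature increases, the share of frequent investors increases.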
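
Breakdown of investors by feature (cont'd)
[Figure panels: funding goal ($), geographic dispersion, project [...]]
As the funding goal increases, the share of occasional investors decreases.
Pledges by occasional investors are not affected by [...].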
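
Hypothesis: the relationship between investors' interests and the projects they back
• Topic similarity computed with LDA (Latent Dirichlet Allocation)
  • Summaries of the backed projects and the content of each investor's tweets (about 200 tweets)
  • (The number of topics is not stated)
• Frequent investors tend to pledge across several topics they are interested in
• Occasional investors pledge to topics unrelated to their interests, or concentrate their pledges on a single topic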
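
To make the idea of "topic similarity" concrete, the sketch below compares two topic distributions, one inferred from an investor's tweets and one from a project summary. The slide does not say which similarity measure is used, so cosine similarity is an assumption here, and the three-topic vectors are invented for illustration; in practice they would come from a fitted LDA model.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two topic distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("topic vectors must have the same number of topics")
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    # An all-zero vector carries no topic information; treat it as no similarity.
    return dot / norm if norm else 0.0


# Hypothetical topic distributions over three topics.
investor_topics = [0.7, 0.2, 0.1]  # from the investor's tweets
project_topics = [0.6, 0.3, 0.1]   # from the project summary
print(cosine_similarity(investor_topics, project_topics))
```

A value near 1 would indicate that the project's topics resemble what the investor tweets about; a value near 0 would indicate little overlap.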
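
Summary so far
• Frequent investors (32 or more pledges in 4 months)
  • Tend to pledge to projects that are well managed, have a high funding goal, and match their own interests
  • Behave like conventional investors
• Occasional investors (fewer than 4 pledges in 4 months)
  • Pledge without being much affected by the features examined here
  • Closer to donating than investing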
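
The more friends the project creator has, the higher the share of one-off pledges
• Relationship between the creator's number of Facebook friends and the breakdown of investors
  • The causal relationship is unknown
• The more Facebook friends, the easier it is to attract occasional investors
• The fewer Facebook friends, the easier it is to attract frequent investors...???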

Matching project creators with investors
Recommending Investors
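
How projects and investors are matched
• Recommend projects to potential investors on Twitter
  • Kickstarter user names are linked to Twitter accounts
  • Recommendations are based on data from 7,429 investors who pledged to 891 projects
  • Binary classification: investor pledges = 1, investor does not pledge = 0
• The crawled data contains only positive examples, so negative examples are mixed in at random
  • The ratio of positive to negative examples is 50-50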
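
One way to read "mix in negative examples at random" is to sample (investor, project) pairs that do not appear among the observed pledges until there are as many negatives as positives. The sketch below follows that reading; the exact sampling procedure is not given on the slide, and the function name, the seed handling, and the toy identifiers are assumptions.

```python
import random


def build_training_pairs(positive_pairs, investors, projects, seed=0):
    """Return shuffled (investor, project, label) triples with a 50-50 class balance.

    positive_pairs: observed (investor, project) pledges, labelled 1.
    Negatives are random unobserved pairs, labelled 0 (an assumed procedure).
    """
    rng = random.Random(seed)
    positives = set(positive_pairs)
    investors = sorted(set(investors))
    projects = sorted(set(projects))
    # Guarantee that enough unobserved pairs exist, so the loop below terminates.
    if 2 * len(positives) > len(investors) * len(projects):
        raise ValueError("not enough unobserved pairs to sample negatives")
    negatives = set()
    while len(negatives) < len(positives):
        pair = (rng.choice(investors), rng.choice(projects))
        if pair not in positives:
            negatives.add(pair)
    labelled = [(i, p, 1) for i, p in sorted(positives)]
    labelled += [(i, p, 0) for i, p in sorted(negatives)]
    rng.shuffle(labelled)
    return labelled


# Hypothetical toy data: two observed pledges, three investors, two projects.
pairs = build_training_pairs([("a", "p1"), ("b", "p2")], ["a", "b", "c"], ["p1", "p2"])
print(pairs)
```

With balanced classes, a classifier that guesses at random is expected to reach about 50% accuracy, which is the baseline the evaluation refers to.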
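
Recommendation methods and evaluation
• Four methods are evaluated: {LR, SVM-linear, SVM-poly, SVM-RBF}
  • Logistic regression (LR)
  • SVM with three kinds of kernel (linear, polynomial, RBF)
• Evaluation: 5-fold cross validation
  • Train on 80% of the dataset, evaluate on the remaining 20%, repeat 5 times, and average the scores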
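
The 80%/20% procedure repeated five times can be sketched without any machine-learning library. This is a generic k-fold split, not the authors' code; `k_fold_indices`, `cross_validate`, and the `evaluate` callback are names introduced for this example.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (almost) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(n_samples, evaluate, k=5):
    """Average the score returned by evaluate(train, test) over the k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


# With 10 samples and k=5, each fold trains on 8 samples and tests on 2.
for train, test in k_fold_indices(10):
    print(len(train), len(test))
```

Here `evaluate` would fit a model (for example LR or an SVM) on the training indices and return its score on the test indices.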
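
Features used
• Static features: known when the project is launched
  • Funding goal, number of reward levels, categories of projects the investor backed in the past, investor interests inferred from Twitter posts
• Dynamic features: become known as the project progresses
  • Project success, number of updates, number of comments, geographic dispersion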
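
Evaluation of the recommendations
• SVM with an RBF kernel
  • Static features only: 82%
  • Dynamic features only: 73%
• Combining static and dynamic features gives an accuracy of 84%
• Positive and negative examples are split 50-50, so the baseline is 50%
ACC: Accuracy, P: Precision, R: Recall, F1: F-score, AUC: area under the ROC curve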
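
For reference, the threshold-based metrics in the legend (ACC, P, R, F1) follow directly from the counts of true and false positives and negatives. The sketch below uses the standard definitions; the function name and the four-sample example are made up, and AUC is omitted because it needs ranking scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = pledges)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"ACC": accuracy, "P": precision, "R": recall, "F1": f1}


# Hypothetical predictions for four (investor, project) pairs.
print(classification_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```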
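
Which features matter?
• C : number of comments
• R : number of reward levels
• S : geographic dispersion
• G : [...]
• E : category match
• TS : similarity to the topics the investor is interested in
→ E and TS contribute to the improvement in accuracy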
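
Summary of the paper
• Investment style on Kickstarter differs from investor to investor
  • Frequent investors pledge to ambitious projects
  • Occasional investors pledge as if donating, and back art-related projects
• Matching investors with projects is feasible
  • Whether an investor will pledge to a project can be predicted with 84% accuracy
  • The categories and content an investor is interested in strongly affect the match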