jujudubai
May 30, 2015

# juju1008


## Transcript

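2. ### Agenda
1. What is Uber?
2. The very basics of Bayesian modeling
3. Predicting a user's destination with Bayesian modeling
   1. Data overview
   2. Model design (designing the prior and the likelihood)
   3. Estimating the posterior probability (MAP estimation)
   4. Model results
4. Summary of this talk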
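3. ### Profile
• Keio University Graduate School, master's degree (2015)
• Traffic data analysis (Probe-Car Data), plus analysis of advertising data, sports data, and various other things…
• A little Spatial Statistics and Bayesian Statistics
• Python/R/HDFS/Impala/Hive etc… still studying other tools too…
• GIS-related work…
• Working as an engineer in the "Ad Technology" field.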

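11. ### Before starting Bayesian estimation…
Maximum likelihood estimation
• The parameter value is unknown, but it is treated as a constant that exists independently of human intuition.
• Once observed data are obtained, the parameter value that maximizes the probability of obtaining such data is taken as the best estimate.

Bayesian estimation
• The parameter is treated as a random variable with a known prior distribution.
• When observations are obtained, this prior distribution changes into the posterior distribution, and the degree of belief in the parameter value is revised.

Comparison of maximum likelihood estimation and Bayesian estimation (simplified)
• Maximum likelihood: $\hat{\theta} = \arg\max_{\theta} \left[ P(x^{(n)}; \theta) \right]$ (from [1])
• Bayesian: $\hat{\theta} = \arg\max_{\theta} \left[ P(\theta \mid x^{(n)}) \right]$ (from [1])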
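12. ### Maximum likelihood estimation: reviewing the very basics
"The observed data we actually obtained are the realization of the most probable outcome." (principle of maximum likelihood)

Put simply,

$$\hat{\theta} = \arg\max_{\theta} \left[ P(x^{(n)}; \theta) \right]$$

Maximum likelihood estimation is a point estimate that pins down a single estimated value of the parameter θ (it is based on frequentism).
• Behind the observed data there is a (unique) true statistical model.
• Within the formulated model, fixing θ gives the value of the true statistical model.

Figure 1. Image of statistical modeling based on frequentism [4]
From [1]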
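13. ### Bayes' theorem
Bayesian estimation: reviewing the very basics
* Detailed modeling and MCMC-related topics are not covered.

• Prior distribution $P(\theta)$: the probability distribution of θ
• Likelihood $P(y \mid \theta)$: the probability (density) function of y given θ
• Posterior distribution $P(\theta \mid y)$: the probability (density) function of θ given y
• Normalizing constant $P(y)$ (probability of the data): the probability of obtaining the data under any hypothesis

$$P(\theta \mid y) = \frac{P(y \mid \theta)\,P(\theta)}{P(y)} \propto P(y \mid \theta)\,P(\theta)$$

➡ The posterior distribution is proportional to the likelihood multiplied by the prior distribution!
Based on [1]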
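As a small illustration of Bayes' theorem for a discrete set of hypotheses, the sketch below multiplies likelihood by prior and divides by the normalizing constant P(y). The two coin hypotheses and their probabilities are made-up values for illustration, not from the slides.

```python
def posterior(prior, likelihood):
    """Return P(theta | y) for discrete hypotheses via Bayes' theorem."""
    # Unnormalized posterior: likelihood times prior for each hypothesis.
    unnorm = {h: likelihood[h] * prior[h] for h in prior}
    # Normalizing constant P(y): total probability of the observed data.
    evidence = sum(unnorm.values())
    return {h: v / evidence for h, v in unnorm.items()}


# Hypothetical example: two candidate coins, one observed head.
prior = {"fair": 0.5, "biased": 0.5}        # P(theta)
likelihood = {"fair": 0.5, "biased": 0.9}   # P(head | theta)
post = posterior(prior, likelihood)
print(post)
```

After one head, the "biased" hypothesis receives more posterior mass than the "fair" one, because its likelihood is larger while the priors are equal.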
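14. ### Advantages of Bayesian estimation
Bayesian estimation
• The prior distribution (subjective probability) can be set freely (there is no need to assume that the distribution of the estimates is normal).
• The model is strengthened through Bayesian updating (in theory, estimation accuracy keeps improving as the data are updated…).
• It predicts the probability distribution itself of the event of interest.
• By using a point estimate, it can also return the same kind of result as a frequentist model.
• The true value (result) does not have to be unique.
• When the number of observations n is small and the prior distribution is set appropriately, "Bayesian estimation + posterior maximization" has the advantage.

From [1] and [7]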
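The last point (small n with a reasonable prior) can be seen in a coin-flip sketch. It assumes a Bernoulli likelihood with a Beta(a, b) prior, for which the MAP estimate has the closed form (k + a - 1) / (n + a + b - 2); the model, the prior parameters, and the data are illustrative assumptions, not taken from the slides.

```python
def mle_bernoulli(k, n):
    """Maximum likelihood estimate of the success probability: k / n."""
    return k / n


def map_bernoulli(k, n, a, b):
    """MAP estimate under a Beta(a, b) prior.

    This is the mode of the Beta(a + k, b + n - k) posterior and is
    valid when a + k > 1 and b + n - k > 1.
    """
    return (k + a - 1) / (n + a + b - 2)


# Hypothetical data: 3 heads in 3 flips, with a mild Beta(2, 2) prior.
k, n = 3, 3
mle = mle_bernoulli(k, n)                 # 3 / 3
map_est = map_bernoulli(k, n, a=2, b=2)   # (3 + 1) / (3 + 2)
print(mle, map_est)
```

With only three flips, maximum likelihood concludes the coin always lands heads, while the prior pulls the MAP estimate back toward 0.5. As n grows, the two estimates converge.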
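15. ### Disadvantages of Bayesian estimation
Bayesian estimation
• Although it has become faster, estimation still takes a long time…
• Complex models frequently fail to converge…
• The way of thinking differs from the frequentism taught in early education, so there is some learning cost…
• There is a lot of theory and many languages to learn in order to run the estimation… (MCMC and related tools such as Stan/JAGS etc…)

From [1] and [7]

Whether a problem should be solved with maximum likelihood estimation or with Bayesian estimation is often debated, but it is enough to choose according to the purpose and the cost of each.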

Data overview
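19. ### Probabilistic prediction
Applying Bayes' theorem, with destination $D$, features $X$, prior probability $P(D = i)$, and likelihood $P(X = x \mid D = i)$:
• The basic Bayes' theorem
• The model is designed by working on the prior distribution and the likelihood

$$P(D = i \mid X = x) = \frac{P(X = x \mid D = i)\,P(D = i)}{\sum_{j=1}^{N} P(X = x \mid D = j)\,P(D = j)}$$

From [9]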
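20. ### Setting three kinds of prior distributions → using a mixture of normal distributions
1. Where does a particular user tend to go? = Rider Prior
   → "User history"
2. Where do users of Uber tend to go overall? = Uber Prior
   → "Trends within Uber"
3. Which places are generally popular in this area? = Popular Place Prior
   → "Data on popular places"

Building the prior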
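21. ### 1. "User history"
• Consider not only places that are familiar to the user but also places that are not.
• Places the user has already visited get some probability; all other places get probability "0".
• For users with no data, every place gets probability "0".
  → In other words, this handles the "cold start problem".

$P(D = i \mid C = c)$, where $C$ is the random variable representing the user, gives $P_{\mathrm{Rider}}(D = i)$.

Building the prior →
From [9]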
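22. ### 2. "Trends within Uber"
• Uses the tendency of Uber users to go to particular places.
• Uses the (normalized) number of visits by Uber users to each place.

$$P_{\mathrm{Uber}}(D = i) = P(D = i \mid \text{is Uber user})$$

$P_{\mathrm{Uber}}(D = i)$: the normalized number of visits by Uber users to each place

Building the prior →
From [9]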

24. ### 3. "Data on popular places"
• Takes into account the trends of places in SF.
• Uses data covering about 1,000 commercial establishments.
• Restaurants, night spots, hotels, shopping, museums etc…
• Presumably scored on the basis of some kind of rating on the Web…!? (the normalized number of reviews left for a business establishment on the site.)

$P_{\mathrm{Popular\ Place}}(D = i)$

Building the prior →
From [9]
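25. ### Combining the priors

$$P(D = i) = \alpha\,P_{\mathrm{Popular\ Place}}(D = i) + \beta\,P_{\mathrm{Uber}}(D = i) + (1 - \alpha - \beta)\,P_{\mathrm{Rider}}(D = i)$$

• Destination Prior = weighted sum of the Popular Place Prior, the Uber Prior, and the Rider Prior
• Hyperparameters (weights): .3, .3, .4
← This is set as the prior distribution! (The actual values are not known…)

From [9]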
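The combination of the three priors can be sketched as follows. Each prior is built by normalizing counts, then the three are mixed with weights α, β, and 1 − α − β. The place names, the counts, and the assignment of the .3/.3/.4 weights to specific priors are assumptions for illustration; the slide notes that the real values are unknown.

```python
def normalize(counts):
    """Turn raw counts per place into probabilities that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        # No data (e.g. a brand-new rider): every place gets probability 0.
        return {place: 0.0 for place in counts}
    return {place: c / total for place, c in counts.items()}


def destination_prior(popular, uber, rider, alpha, beta):
    """Mix the three priors with weights alpha, beta and 1 - alpha - beta."""
    places = set(popular) | set(uber) | set(rider)
    return {
        p: alpha * popular.get(p, 0.0)
        + beta * uber.get(p, 0.0)
        + (1 - alpha - beta) * rider.get(p, 0.0)
        for p in places
    }


# Hypothetical counts: review counts, Uber-wide visits, this rider's visits.
popular = normalize({"office": 10, "museum": 90})
uber = normalize({"home": 10, "office": 30, "museum": 60})
rider = normalize({"home": 8, "office": 2})

prior = destination_prior(popular, uber, rider, alpha=0.3, beta=0.3)
print(prior)
```

For a rider with no history the rider term contributes nothing, so the mixed prior sums to less than 1. That does not change which destination ranks highest once the posterior is normalized.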
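26. ### Combining the priors
By setting three kinds of prior distributions…
1. "User history"
   Places you often go to…
2. "Trends within Uber"
   Places your friends often go to…
3. "Data on popular places"
   Places people in general often go to…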
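27. ### Building the likelihood → $P(Y = y \mid D = i)$
Riders often tend to get out at a place that differs from their final destination.
* Haversine distance: omitted here

Figure 3. Distribution of the distance between the drop-off location and the final destination [9]

In both the suburbs and the city center, traffic jams, intersections, and various other factors create a gap between the destination and the drop-off point.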
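28. ### Building the likelihood → $P(Y = y \mid D = i)$
• Computing the probability distribution of the distance between the destination and the drop-off point

The distance is assumed to follow a Gaussian distribution, and the maximum likelihood estimates $\hat{\mu}_{\mathrm{MLE}}$ and $\hat{\sigma}^2_{\mathrm{MLE}}$ are used.

$$P(Y = y \mid D = i) = \mathcal{N}(Y = y \mid \mu, \sigma^2)$$

From [9]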
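29. ### Building the likelihood → $P(Y = y \mid D = i)$
• Computing the estimates of the mean and the variance

Mean:

$$\hat{\mu}_{Z=z} = \frac{1}{\sum_{k=1}^{n} \mathbb{1}(Z = z)} \sum_{k=1}^{n} x_k\,\mathbb{1}(Z = z)$$

Variance:

$$\hat{\sigma}^2_{Z=z} = \frac{1}{\sum_{k=1}^{n} \mathbb{1}(Z = z)} \sum_{k=1}^{n} (x_k - \hat{\mu}_{Z=z})^2\,\mathbb{1}(Z = z)$$

The Gaussian density, i.e. the probability distribution of the distance between the destination and the drop-off point:

$$P(Y = y \mid D = i) = \frac{1}{\sqrt{2\pi\hat{\sigma}^2_{Z=z}}} \exp\left[ -\frac{(x_k - \hat{\mu}_{Z=z})^2}{2\hat{\sigma}^2_{Z=z}} \right]$$

uniform distribution

From [9] / based on [9]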
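A minimal sketch of this step: estimate the mean and variance of the drop-off-to-destination distance by maximum likelihood, then evaluate the Gaussian density. The distances are made-up sample values, and grouping by Z is left out for brevity (all samples are assumed to belong to one group).

```python
import math


def gaussian_mle(distances):
    """Maximum likelihood estimates (mean, variance) for a Gaussian."""
    n = len(distances)
    mu = sum(distances) / n
    # The MLE of the variance divides by n, not by n - 1.
    var = sum((x - mu) ** 2 for x in distances) / n
    return mu, var


def gaussian_pdf(y, mu, var):
    """Density of N(mu, var) evaluated at y."""
    return math.exp(-((y - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


# Hypothetical distances (in meters) between drop-off and destination.
distances_m = [20.0, 40.0, 60.0, 80.0]
mu_hat, var_hat = gaussian_mle(distances_m)

# Likelihood of a drop-off 30 m away from a candidate destination.
print(mu_hat, var_hat, gaussian_pdf(30.0, mu_hat, var_hat))
```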
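30. ### Building the likelihood → $P(Y = y \mid D = i)$
→ Certain places are very likely to depend on the time of day.
  (e.g. home, workplace, night spots etc…)
• Make the likelihood take a "time component" into account
  • Commuting pattern: weekday mornings, heading to the financial district
  • Dining-out pattern: 5 - 8 pm.
  • Nightlife pattern: unlikely to be visited on a Monday morning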
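31. ### Building the likelihood → $P(Y = y \mid D = i) \times P(T = t \mid D = i)$
• A probability distribution for taking the "time component" into account

$P(T = t \mid D = i)$, where $T$ is the random variable representing the time slot of each ride

→ The event probability is expressed as a categorical distribution built from the (normalized) number of rides to place $i$ in each time slot $t$ (time is divided into one-hour slots).

From [9]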
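The categorical time distribution can be sketched by counting rides per one-hour slot for a destination and normalizing. The pickup hours below are invented sample data.

```python
from collections import Counter


def hourly_distribution(pickup_hours):
    """Return P(T = t | D = i) as a list of 24 probabilities, one per hour."""
    if not pickup_hours:
        raise ValueError("need at least one ride to estimate the distribution")
    counts = Counter(pickup_hours)
    total = len(pickup_hours)
    # Hours with no rides get probability 0 (Counter returns 0 for them).
    return [counts[hour] / total for hour in range(24)]


# Hypothetical pickup hours (0-23) for rides that ended at an office.
office_hours = [8, 8, 9, 9, 9, 18, 8, 9]
p_time_office = hourly_distribution(office_hours)
print(p_time_office[9])
```

Hours that never occur get probability 0 here, which makes the posterior 0 for that destination at that hour. Whether any smoothing was applied in [9] is not stated in the slides.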
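32. ### ✦ Completing the likelihood

$$P(X = x \mid D = i) = P(Y = y, T = t \mid D = i) = P(Y = y \mid D = i)\,P(T = t \mid D = i)$$

• Prior distribution: mix the three probability distributions "individual history", "trends of Uber users", and "trends in SF"
• Likelihood $P(X = x \mid D = i)$: generate a categorical distribution from the ride counts in each time slot, and combine it with the probability distribution of the distance between the destination and the drop-off point

From [9]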
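33. ### Estimating the posterior distribution
• Obtain the posterior distribution
• Find the user's true destination.
• The "posterior distribution" can be obtained by multiplying the "prior distribution" and the "likelihood" derived earlier

$$P(D = i \mid X = x) = \frac{P(X = x \mid D = i)\,P(D = i)}{\sum_{j=1}^{N} P(X = x \mid D = j)\,P(D = j)}$$

Here $P(D = i)$ is the prior probability, $P(X = x \mid D = i)$ is the likelihood, and the left-hand side is the posterior distribution.

From [9]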
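Putting the pieces together, the sketch below computes the posterior over candidate destinations from a prior, a Gaussian distance likelihood, and a time-slot probability. All numbers, place names, and the function `destination_posterior` are hypothetical; this is only an illustration of the formula, not the implementation described in [9].

```python
import math


def gaussian_pdf(y, mu, var):
    """Density of N(mu, var) evaluated at y."""
    return math.exp(-((y - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def destination_posterior(prior, distance_to, hour, dist_params, time_probs):
    """P(D = i | X = x) with X = (distance y, time slot t)."""
    unnorm = {}
    for place, p_prior in prior.items():
        mu, var = dist_params[place]
        p_dist = gaussian_pdf(distance_to[place], mu, var)  # P(Y = y | D = i)
        p_time = time_probs[place].get(hour, 0.0)           # P(T = t | D = i)
        unnorm[place] = p_dist * p_time * p_prior
    evidence = sum(unnorm.values())  # denominator: sum over all candidates j
    return {place: v / evidence for place, v in unnorm.items()}


# Hypothetical inputs for a ride that ended at 9 am.
prior = {"home": 0.35, "office": 0.20, "museum": 0.45}
distance_to = {"home": 300.0, "office": 30.0, "museum": 500.0}  # meters
dist_params = {p: (50.0, 500.0) for p in prior}  # (mean, variance) per place
time_probs = {"home": {9: 0.05}, "office": {9: 0.5}, "museum": {9: 0.1}}

post = destination_posterior(prior, distance_to, 9, dist_params, time_probs)
best = max(post, key=post.get)  # MAP candidate
print(best, post[best])
```

Even though "office" has the smallest prior in this example, the drop-off point is close to it and 9 am is a typical arrival time, so the likelihood dominates and it becomes the MAP candidate.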

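35. ### Results and conclusions
Test method
1. For each user taking a ride, compute a list of candidate predicted destinations (within 100 m)
2. Select the candidate with the maximum posterior probability (MAP estimation)
3. Check whether the address of that candidate matched the true destination

Model comparison
1. Comparison with a naive baseline and a smart baseline
   1. naive baseline
      Selects randomly from the candidates; achieves 40% accuracy
   2. smart baseline
      Selects the nearest of the candidates; achieves 44% accuracy

The criterion for accuracy is not entirely clear…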
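36. ### Posterior maximization: Maximum a posteriori (MAP)
Computing the posterior probability (MAP estimation)

$$\theta^{*} = \arg\max_{\theta} \log\left[ P(y \mid \theta)\,P(\theta) \right]$$

→ Find the parameter θ that maximizes the posterior probability rather than the likelihood.

Note: MAP estimation uses Bayes' theorem, but because it is a point estimate it is not regarded as full-fledged Bayesian modeling.

Figure 4. Selecting the maximum posterior probability [1] (posterior values 0.777, 0.182, 0.041; the maximum is taken as the point estimate)
Based on [1]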
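The role of the logarithm can be sketched briefly: because log is monotonic, maximizing log-likelihood plus log-prior selects the same hypothesis as maximizing the product, while avoiding numerical underflow when many small probabilities are multiplied. The hypotheses and values are illustrative, and all probabilities are assumed to be strictly positive.

```python
import math


def map_estimate(likelihood, prior):
    """Return the hypothesis maximizing log P(y | theta) + log P(theta)."""
    # Assumes every probability is > 0, since log(0) is undefined.
    log_scores = {
        theta: math.log(likelihood[theta]) + math.log(prior[theta])
        for theta in prior
    }
    return max(log_scores, key=log_scores.get)


# Hypothetical values for three candidate hypotheses.
prior = {"A": 0.5, "B": 0.3, "C": 0.2}
likelihood = {"A": 0.2, "B": 0.6, "C": 0.1}

theta_map = map_estimate(likelihood, prior)
theta_mle = max(likelihood, key=likelihood.get)  # ignores the prior
print(theta_map, theta_mle)
```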

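39. ### Summary
1. For destination prediction, 74% seems like a fairly good accuracy…
2. It is Bayesian modeling of a sort, but it is MAP estimation
3. Judging from the large amount of feedback received, the available information is somewhat thin, so it does come across as a little dubious…
4. It remains unclear how they determine whether the final destination is correct
5. The business impact is still quite small

Making money from location-related data is hard, isn't it…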
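40. ### References & reference URLs
1. Kenichi Ishii et al., "Easy-to-Understand Pattern Recognition, Continued: An Introduction to Unsupervised Learning" (in Japanese), Obunsha, 2014/10/30
2. Hisashi Kashima, "Advanced Topics in Mathematical Informatics I [Machine Learning and Data Mining]: Regression (2)" (in Japanese), URL: (www.geocities.co.jp/Technopolis/5893/2-2.pdf)
3. Tomoyuki Furutani, "Bayesian Statistical Data Analysis: R & WinBUGS" (in Japanese), Asakura Shoten, 2008/09/15
4. Tomohiro Ando, "Bayesian Statistical Modeling" (in Japanese), Asakura Shoten, 2010/02/25
5. Allen B Downey, "Think Bayes: An Introduction to Bayesian Statistics for Programmers" (Japanese edition), O'Reilly, 2014/9
6. aidiary, "Fragmentary Notes on Artificial Intelligence" (blog, in Japanese), 'Maximum likelihood estimation, MAP estimation, Bayesian estimation', URL: (http://aidiary.hatenablog.com/entry/20100404/1270359720), posted on 2010/04/04
7. noriume, "Sunny side up" (blog, in Japanese), 'The difference between conventional estimation methods and Bayesian estimation', URL: (http://norimune.net/708), posted on 2013/02/26
8. Masayuki Isobe, "An Introduction to Bayesian Estimation with as Few Formulas as Possible" (in Japanese), URL: (https://speakerdeck.com/chiral/shu-shi-wonarubekushi-wanaibeizutui-ding-ru-men), posted on 2013/2
9. Uber, "Making a Bayesian Model to Infer Uber Rider Destinations", URL: (http://blog.uber.com/passenger-destinations)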