Slide 1

Slide 1 text

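Beyond Log Collection: A Log Platform That Stays Close to the Service
Yusuke Miyake, GMO Pepabo, Inc.
Hatena-Pepabo Tech Conference: Infrastructure Technology Platform @ Kyoto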

Slide 2

Slide 2 text

Principal Engineer
Yusuke Miyake (@monochromegane)
minne Division
http://blog.monochromegane.com

Slide 3

Slide 3 text

minne
https://minne.com

Slide 4

Slide 4 text

Agenda
• Web services and activity logs
• Bigfoot
• A log platform that stays close to the service

Slide 5

Slide 5 text

Web services and activity logs

Slide 6

Slide 6 text

Logs are great

Slide 7

Slide 7 text

Activity logs

Slide 8

Slide 8 text

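Activity logs
Logs emitted at the application layer.
They identify who did what, and when.
They show not only the final outcome of an action but also where the user gave up along the way and how they hesitated.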

Slide 9

Slide 9 text

Activity logs are packed with hints for improving the service

Slide 10

Slide 10 text

Stages of activity log utilization

Slide 11

Slide 11 text

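Stages of activity log utilization
Collection: activity logs are being emitted and consolidated.
Analysis: the consolidated activity logs can be visualized and analyzed.
Utilization: continuous service improvement is carried out based on the analyzed activity logs.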

Slide 12

Slide 12 text

Log platform

Slide 13

Slide 13 text

What matters in a log platform

Slide 14

Slide 14 text

"Utilization of logs"

Slide 15

Slide 15 text

A log "utilization" platform

Slide 16

Slide 16 text

Bigfoot

Slide 17

Slide 17 text

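Bigfoot
• Pepabo's next-generation log "utilization" platform
• For each stage of activity logs (collection, analysis, utilization), it provides company-wide general-purpose tooling together with concrete ways to put the logs to use
• The log platform behind minne, one of Japan's largest handmade marketplaces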

Slide 18

Slide 18 text

Bigfoot
Architecture diagram labels: IDFA/GAID, UID, rack-bigfoot, Service, Request, Activity log, Services DB, Attribute, Big Cube, Cube, BI, Recommendation, Bandit algorithm, Re-marketing, Feedback, Name identification, Cookie Sync. Icons: https://icons8.com

Slide 19

Slide 19 text

Technologies behind Bigfoot

Slide 20

Slide 20 text

Collection and analysis

Slide 21

Slide 21 text

Sending logs

Slide 22

Slide 22 text

rack-bigfoot
• Rack middleware that connects a Rails application to Fluentd
• Collects the common parameters Bigfoot needs from request and response headers
• Service-specific parameters can also be attached

    Rails.application.config.app_middleware.insert_after ActionDispatch::Callbacks, Rack::Bigfoot do |config|
      config.service = 'minne'
      config.environment = Rails.env
      config.enable_fluent = Rails.env.production? || Rails.env.staging?
      config.ignore_path_patterns << %r(\A/healthcheck)
      config.headers << 'HTTP_X_CLIENT_VERSION'
    end
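To make the role of such a middleware concrete, here is a minimal sketch of a Rack-style middleware that records one activity-log entry per request. It is not the rack-bigfoot implementation: the class name `ActivityLogMiddleware`, its options, the record fields, and the `sink` object (a stand-in for a Fluentd client) are all assumptions made for illustration.

```ruby
require 'json'
require 'time'

# Hypothetical stand-in for a middleware like rack-bigfoot.
# All names and record fields here are illustrative assumptions.
class ActivityLogMiddleware
  def initialize(app, service:, headers: [], ignore_path_patterns: [], sink: $stdout)
    @app = app
    @service = service
    @headers = headers                          # extra request headers to capture
    @ignore_path_patterns = ignore_path_patterns
    @sink = sink                                # anything that responds to <<
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status, headers, body = @app.call(env)
    path = env['PATH_INFO'].to_s

    # Skip paths such as health checks that carry no user activity.
    unless @ignore_path_patterns.any? { |pattern| pattern.match?(path) }
      record = {
        'service' => @service,
        'time' => Time.now.utc.iso8601,
        'request_method' => env['REQUEST_METHOD'],
        'path_info' => path,
        'status' => status,
        'response_time' => Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      }
      # Attach the configured service-specific headers when present.
      @headers.each { |name| record[name.downcase] = env[name] if env.key?(name) }
      @sink << JSON.generate(record) << "\n"
    end

    [status, headers, body]
  end
end
```

Writing to a generic `sink` keeps the sketch runnable without Fluentd; in a real deployment the record would presumably be posted to a Fluentd agent instead.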

Slide 23

Slide 23 text

Storing logs

Slide 24

Slide 24 text

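TreasureData
• Cloud-based data management service
• https://www.treasuredata.com
• Large-capacity log storage and fast log operations through distributed processing
Diagram labels: log, Plasma DB, HiveQL, export, SQL, aggregate, Data Tanks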

Slide 25

Slide 25 text

Working with logs

Slide 26

Slide 26 text

HiveQL
• Query the activity logs on TreasureData in an SQL-like way
• http://hive.apache.org
• https://docs.treasuredata.com/articles/hive

    SELECT
      TD_TIME_FORMAT(time, 'yyyy-MM-dd HH:mm:ss', 'JST') AS timestamp,
      response_time,
      request_method,
      path_info
    FROM
      activity
    WHERE
      TD_TIME_RANGE(time, '2016-07-01 10:00:00', '2016-07-01 12:00:00', 'JST');

Slide 27

Slide 27 text

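Workflow
• Uses TreasureData's scheduled queries
• Developed Pendulum to keep the queries under version control
• https://github.com/monochromegane/pendulum
• Scheduled queries are described in a DSL and managed as code
Diagram labels: Scheduled queries, Queries on GitHub, Apply, Pendulum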

Slide 28

Slide 28 text

Pendulum

Schedfile:

    schedule 'test-scheduled-job' do
      database   'db_name'
      query      'select time from access;'
      retry_limit 0
      priority   :normal
      cron       '30 0 * * *'
      timezone   'Asia/Tokyo'
      delay      0
      result_url 'td://@/db_name/table_name'
    end

Apply:

    $ pendulum --apikey='...' -a --dry-run
    $ pendulum --apikey='...' -a

Slide 29

Slide 29 text

Migrating to Digdag
https://github.com/treasure-data/digdag

Slide 30

Slide 30 text

Making logs more useful

Slide 31

Slide 31 text

Attribute information
• Combining activity logs with attribute information broadens what can be analyzed

    def perform(*args)
      User.order(:id).select(:id).find_in_batches do |users|
        UserAttributesUploadJob.perform_later(users.first.id, users.last.id)
      end
    end

Diagram labels: Master, Attribute, 1,000 records each, Sidekiq workers, No temporary file, Activity, Join, HiveQL

Slide 32

Slide 32 text

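Name identification
• Maps service accounts to each of their clients
• Activity recorded while not logged in is also linked to the account retroactively once the identity is resolved
• Combined with Cookie Sync, mapping across services is also possible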
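The retroactive linking described on this slide can be sketched as a two-pass scan over the logs. This is only an illustration under assumed field names (`client_id`, `account_id`); the deck does not show how Bigfoot actually implements it.

```ruby
# Hypothetical identity resolution over activity-log records.
# Each record carries a client identifier (for example a cookie UID or
# IDFA/GAID) and, once the user has logged in, an account_id.
def resolve_identities(logs)
  # First pass: learn which account each client identifier belongs to.
  client_to_account = {}
  logs.each do |log|
    next if log[:account_id].nil?
    client_to_account[log[:client_id]] ||= log[:account_id]
  end

  # Second pass: attribute earlier anonymous records to that account.
  logs.map do |log|
    log.merge(account_id: log[:account_id] || client_to_account[log[:client_id]])
  end
end
```

Records from clients that never log in keep a nil `account_id`, so they stay distinguishable from resolved ones.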

Slide 33

Slide 33 text

Analyzing logs

Slide 34

Slide 34 text

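Big Cube and Cube
• Big Cube: the aggregated activity logs of all services
• Once an analytical viewpoint is settled, it is cut out into a Cube made of measure columns and dimension columns
• Measure: a column that can be quantified
• Dimension: a column that serves as the axis of aggregation
• Examples: sales per hour, number of items per prefecture
• Cubes are placed in a data mart so they can be queried quickly
Diagram labels: Activity, Big Cube, Cube, HiveQL, SQL, BI, Dashboard, ad-hoc query, Analyst, Managers, Product owners, Promotion groups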
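As a small illustration of the measure and dimension idea, the following sketch groups records by one dimension column and sums one measure column. In the deck this step is done with HiveQL and SQL; the Ruby version and the field names `hour` and `sales` are assumptions for illustration only.

```ruby
# Build a tiny "cube": total of one measure per value of one dimension.
# Records are plain hashes; the column names are hypothetical.
def build_cube(records, dimension:, measure:)
  records
    .group_by { |record| record[dimension] }
    .transform_values { |rows| rows.sum { |row| row[measure] } }
end

activity = [
  { hour: 10, prefecture: 'Kyoto', sales: 1200 },
  { hour: 10, prefecture: 'Tokyo', sales: 800 },
  { hour: 11, prefecture: 'Kyoto', sales: 500 }
]

sales_per_hour = build_cube(activity, dimension: :hour, measure: :sales)
```

Precomputing such small tables is what lets dashboards read them quickly instead of scanning the full Big Cube on every query.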

Slide 35

Slide 35 text

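Visualization and analysis
• Tableau Desktop by Tableau is used for visualization and analysis
• http://www.tableau.com
• TreasureData can be selected as a data source
• Dashboard examples
• Gross merchandise value, cancellation amount, order amount, revenue per user
• Cumulative members, average order value, newly registered users, DAU C (new), DAU C (existing)
• Items ordered, order rate, price of ordered items, orderable items
• Total stock count, unit stock value, total stock value
• Creators accepting orders, items on sale, items in open shops, total items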

Slide 36

Slide 36 text

Utilization

Slide 37

Slide 37 text

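Utilization
• Form hypotheses from the analysis results and modify the system accordingly
• Changing screen designs, revisiting steps in a flow
• A/B testing
→ Static feedback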

Slide 38

Slide 38 text

Dynamic feedback

Slide 39

Slide 39 text

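Bandit algorithms
• Continuously updating the ratio of exploration to exploitation reduces the opportunity loss of A/B testing
• https://www.oreilly.co.jp/books/…
• For example, to improve the CTR of a feature, a fixed share of traffic gets the best-performing variant (exploitation) and the remaining share tries several variants (exploration)
Diagram labels: Activity, Import, Epsilon-Greedy algorithm, User, 1-ε: exploitation, ε/pattern: exploration, Click or not click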
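The epsilon-greedy strategy named in the diagram can be sketched as follows. This is a generic textbook-style version, not the code running in Bigfoot; the class name, the arm labels, and the choice of a running mean as the CTR estimate are assumptions.

```ruby
# Minimal epsilon-greedy bandit. With probability epsilon it explores
# (each pattern is picked uniformly, i.e. with probability epsilon / patterns);
# otherwise it exploits the pattern with the best observed click rate.
class EpsilonGreedy
  attr_reader :counts, :values

  def initialize(arms, epsilon:, rng: Random.new)
    @arms = arms
    @epsilon = epsilon
    @rng = rng
    @counts = Hash.new(0)    # impressions per arm
    @values = Hash.new(0.0)  # running mean reward (observed CTR) per arm
  end

  def select_arm
    return @arms.sample(random: @rng) if @rng.rand < @epsilon
    @arms.max_by { |arm| @values[arm] }
  end

  # reward is 1 for a click and 0 for no click.
  def update(arm, reward)
    @counts[arm] += 1
    @values[arm] += (reward - @values[arm]) / @counts[arm]
  end
end
```

Each impression calls `select_arm`, and the click result read back from the activity log is fed to `update`, so the share of traffic going to each pattern shifts as evidence accumulates.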

Slide 40

Slide 40 text

Recommendation
• minne "Recommended creators for you"
• Creators are rated based on user behavior
Diagram labels: Users, fav, follow etc…, Activity, Matrix Factorization, Filter and shuffle, import, DB, Recommendation

Slide 41

Slide 41 text

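Recommendation: Matrix Factorization
• Collaborative filtering: accumulate users' preference information and make inferences for a user from the information of other users with similar preferences
• Matrix Factorization
• Dimensionality reduction
• Rating prediction for sparse data where ratings are biased per user and per item
Diagram (rows: users, columns: items): $$R_{m \times n} \approx P_{m \times k} \times Q_{k \times n}$$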

Slide 42

Slide 42 text

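Recommendation: Matrix Factorization
Prediction (μ: mean; B_u, B_i: biases):
$$R'_{ui} = \mu + B_u + B_i + P_u^{T} Q_i$$
Training (first term: error; second term: regularization):
$$\min_{P,Q,B} \sum_{(u,i) \in R} (R_{ui} - R'_{ui})^2 + \lambda \left( \|B_u\|^2 + \|B_i\|^2 + \|P_u\|^2 + \|Q_i\|^2 \right)$$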
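The prediction formula and one stochastic gradient descent update on the training objective can be sketched as below. This is a plain illustration of the math, not Hivemall's implementation; the learning rate `eta`, the default values, and the function names are assumptions, and the constant factor 2 from the gradient is folded into `eta`.

```ruby
# Predicted rating: R'_ui = mu + B_u + B_i + P_u^T Q_i
def predict(mu, b_u, b_i, p_u, q_i)
  mu + b_u + b_i + p_u.zip(q_i).sum { |p, q| p * q }
end

# One SGD step for a single observed rating r_ui.
# reg plays the role of lambda in the objective; eta is a hypothetical
# learning rate. Returns the updated [b_u, b_i, p_u, q_i].
def sgd_step(r_ui, mu, b_u, b_i, p_u, q_i, eta: 0.01, reg: 0.02)
  err = r_ui - predict(mu, b_u, b_i, p_u, q_i)
  new_b_u = b_u + eta * (err - reg * b_u)
  new_b_i = b_i + eta * (err - reg * b_i)
  new_p_u = p_u.zip(q_i).map { |p, q| p + eta * (err * q - reg * p) }
  new_q_i = q_i.zip(p_u).map { |q, p| q + eta * (err * p - reg * q) }
  [new_b_u, new_b_i, new_p_u, new_q_i]
end
```

Repeating this step over all observed (u, i) pairs for several iterations is the general idea behind SGD-based matrix factorization training.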

Slide 43

Slide 43 text

Hivemall

Slide 44

Slide 44 text

Recommendation: Matrix Factorization

    SELECT
      idx,
      array_avg(u_rank) as Pu,
      array_avg(i_rank) as Qi,
      avg(u_bias) as Bu,
      avg(i_bias) as Bi,
      min(mu) as mu
    FROM (
      SELECT
        train_mf_sgd(account_id, creator_id, rating, '-factor 20 -iter 50 -update_mu')
          AS (idx, u_rank, i_rank, u_bias, i_bias, mu)
      FROM
        training
    ) t
    GROUP BY idx;

Slide 45

Slide 45 text

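And more
Browse/cart abandonment
• Extract items subject to so-called browse abandonment and cart abandonment from the activity logs
• Send win-back notifications under specific conditions
Ad integration
• Serve highly relevant ads based on the activity logs
• Remarketing
• Segmentation of ad targets (narrowing down, exclusion)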
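Extracting cart-abandoned items from activity logs can be sketched as a set difference between items added to a cart and items purchased. The action names `cart_add` and `purchase` and the record fields are hypothetical; the deck does not describe the actual log schema or the notification conditions.

```ruby
# Returns [account_id, item_id] pairs that were added to a cart but never
# purchased. Action names and field names are illustrative assumptions.
def cart_abandoned(logs)
  key = ->(log) { [log[:account_id], log[:item_id]] }
  added = logs.select { |log| log[:action] == 'cart_add' }.map(&key)
  purchased = logs.select { |log| log[:action] == 'purchase' }.map(&key)
  (added - purchased).uniq
end
```

A real job would presumably also apply a time window, so that an item added minutes ago is not yet treated as abandoned.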

Slide 46

Slide 46 text

A log platform that stays close to the service

Slide 47

Slide 47 text

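A log platform that stays close to the service
• Do not stop at collecting logs; support the analysis and utilization stages as well
• From static feedback to dynamic feedback
• Toward a smoother world through the circulation of activity logs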

Slide 48

Slide 48 text

Logs are great

Slide 49

Slide 49 text

The End

Slide 50

Slide 50 text

Why not come work at Pepabo?
Check the latest hiring information → @pb_recruit