Introduction to Data Science for PHP Users
Sotaro Karasawa
September 14, 2013
PHP Conference 2013, "Introduction to Data Science for PHPers" #phpcon2013
Transcript
Crocos, Inc. Sotaro Karasawa @sotarok http://facebook.com/sotarok Introduction to Data Science for PHPers. #phpcon2013, PHP Conference 2013
About me: Sotaro Karasawa (@sotarok), 柄沢聡太郎. d.hatena.ne.jp/sotarok. Crocos, Inc. PHP, Git, TD, Red Bull.
Perfect PHP (Gijutsu-Hyohron). Naturally, everyone here already owns a copy, right!?
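Data science
For the details, see "Data Scientist Yosei Dokuhon" (Gijutsu-Hyohron). http://www.amazon.co.jp/dp/…
The data science process: business understanding, data understanding, data extraction, data processing, modeling, effect verification, service implementation. (Source: "Data Scientist Yosei Dokuhon", chapter on the data science process.)
Data science means repeating a cycle: analyze and model the accumulated data to obtain the metrics that matter for running the business. There is a lot to do, and the required knowledge spans a wide range of fields.
Start with the bare minimum, with whatever is easy to begin, and take the first step.
The data science process (same diagram as before).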
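For PHPers and web applications, what counts as "data"? Databases and logs.
This talk is about logs.
How do we collect large volumes of application logs, and how do we aggregate them?
With that in mind, today's agenda: how log collection and analysis works; collecting logs from PHP applications; analysis.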
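How log collection and analysis works
Log collection and analysis is a never-ending source of worry. With large volumes of data: how do you collect it, where do you store it, how do you retrieve it, and how do you aggregate it? The corresponding concerns are network bandwidth, disk capacity, a big-data processing system, and processing time.
http://www.treasure-data.com/
TD architecture diagram. Components: Web Server (x2), fluentd, TD (S3, Hadoop), Client, Hive, Result, MySQL etc. Data accumulates on TD's side; when you submit a query, Hadoop runs over there and returns the result.
For log analysis, the troublesome parts (collecting, storing, and processing the data) are handled by TD. That leaves you free to commit to the essential work: designing and implementing which data to collect and how to aggregate it.
How Crocos uses logs:
- Application logs
- Analysis based on Facebook attribute data
- Execution counts and execution times of key actions
- Transaction counts, by attribute and by channel
- Event logs
- Shares to social networks
- Modal open/close, etc.
- And various others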
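Collecting logs from PHP applications
What kind of application logs? Basic log design.
What logs are you collecting?
Web server logs
When people say "logs", they usually mean web server logs. Even the Treasure Data tutorial uses Apache logs. http://docs.treasure-data.com/articles/quickstart
But what we really want to know is:
What kind of user? In what environment, and from where? When did they do what? Which button did they click or tap?
Application logs
What kind of user? → user registration data. In what environment, and from where? → UA, GEO. When did they do what? → URI, action.
How do we collect application logs?
Before that, a quick word about schemaless logs.
What is a schemaless log? A log with no schema.
Log schemas so far: TSV, for example.
Columns, in order: time, status, uri, user_id, ..., hoge. That ordering is the schema.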
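    foreach (file('app.log') as $line) {
        // Strip only the line ending so that empty leading/trailing columns survive
        $column = explode("\t", rtrim($line, "\r\n"));
        $time = $column[0];   // the column position is the schema
        $status = $column[1];
        ...
    }
Note: in practice nobody has the patience to do this in PHP, so it is done with sed/awk.
Problems: fields are hard to interpret, the schema is hard to change, and accidents happen when analysts and collectors understand the data differently.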
TD logs (or rather, fluentd logs) are JSON:
    {
        "time": 1373876885,
        "status": 200,
        "uri": "/52495/facebook",
        "session_id": "kn6avn2fuh21r25a65mgm3rjh3",
        "fb_id": "7c40c5dd2e55cde37a8c40ed80e1",
        ...
    }
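The practical difference from the TSV case is that fields are read by name rather than by position. A minimal sketch, assuming one JSON object per log line; `parseLogLine` is a hypothetical helper name, not part of fluentd or TD:

```php
<?php
// Hypothetical helper: decode one JSON log line into an associative array.
function parseLogLine($line)
{
    $record = json_decode(trim($line), true);
    // json_decode() returns null on invalid JSON; treat non-objects as malformed too.
    return is_array($record) ? $record : null;
}

$line = '{"time":1373876885,"status":200,"uri":"/52495/facebook"}';
$record = parseLogLine($line);

// Fields are addressed by key, so adding or reordering fields
// does not break existing readers.
$status = isset($record['status']) ? $record['status'] : null;
```

A reader written this way keeps working when a new field is added to the records, which is the property the TSV reader lacks.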
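POSTing logs
fluent PHP logger: https://github.com/fluent/fluent-logger-php
    use Fluent\Logger\FluentLogger;

    // Connect to a local fluentd (24224 is fluentd's default forward port)
    $logger = new FluentLogger("localhost", "24224");
    // First argument is the tag, second is the record
    $logger->post("debug.test", array("hello" => "world"));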
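Basic log design
Record logs so that one access produces one record.
Hook in at response time. Most frameworks have a hook point for the response event, right? In Symfony it is onKernelResponse.
    # Service definition: register the listener for kernel.response
    tags:
        - { name: kernel.event_listener, event: kernel.response }

    public function onKernelResponse(FilterResponseEvent $event)
    {
        $request = $event->getRequest();
        $response = $event->getResponse();
        // Build an array for the record
        $data = $this->onAccess($request, $response);
        // log data
        $this->logger->post("access", $data);
    }
Note: in the real code it is set up so that multiple Listeners/Loggers can be registered.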
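Decide on a basic schema
Even with schemaless logs, you need to know what kind of log you are handling. If the meaning differs from record to record, the log is useless.
Basic schema: time, status, uri, ua, referrer. Matching LTSV-style names may make it easier to follow.
Beyond what the web server log has: app, route, controller, process_time, device. Include things like the framework's routing name and controller name (even when uri is noisy, you can aggregate by routing name).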
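As a rough illustration of the schema described here, the sketch below assembles one access record from request data. It is an assumption-laden example, not the code used at Crocos: `buildAccessRecord` is a hypothetical function, `$server` stands in for PHP's `$_SERVER`, and the route and controller names would come from whatever framework is in use. The `app` and `device` fields are left out for brevity.

```php
<?php
// Hypothetical sketch: assemble one access record per request.
// $server mimics $_SERVER; $route and $controller come from the framework.
function buildAccessRecord(array $server, $status, $route, $controller, $processTime)
{
    return array(
        // Basic schema, using LTSV-like field names
        'time'     => isset($server['REQUEST_TIME']) ? (int) $server['REQUEST_TIME'] : time(),
        'status'   => (int) $status,
        'uri'      => isset($server['REQUEST_URI']) ? $server['REQUEST_URI'] : '',
        'ua'       => isset($server['HTTP_USER_AGENT']) ? $server['HTTP_USER_AGENT'] : '',
        'referrer' => isset($server['HTTP_REFERER']) ? $server['HTTP_REFERER'] : '',
        // Application-level fields the web server log cannot provide
        'route'        => $route,
        'controller'   => $controller,
        'process_time' => (float) $processTime,
    );
}

// Sample input; the controller name is made up for illustration.
$record = buildAccessRecord(
    array(
        'REQUEST_TIME'    => 1373876885,
        'REQUEST_URI'     => '/52495/facebook',
        'HTTP_USER_AGENT' => 'test-agent',
    ),
    200,
    'crocos_index',
    'IndexController',
    0.123
);
```

Missing request values fall back to empty strings so that every record carries the same basic keys.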
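Denormalize the attributes the application can know about and include them in the record.
Denormalized record: session_id, user_id, gender, age, device
Why denormalize? So that aggregate functions can be applied without a JOIN. Hadoop can JOIN too, but this cuts out a step, so it is fast and simple.
By the way, it is a good idea to hash user_id, session_id, and the like. Note: this protects privacy in case something goes wrong.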
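The slides do not say which hash to use. One possible approach, shown here purely as an assumption, is a keyed hash (HMAC-SHA256) via PHP's `hash_hmac()`: a plain hash of a sequential user ID can be reversed by enumerating IDs, whereas a keyed hash also requires the secret. `anonymizeId` and the secret value are hypothetical.

```php
<?php
// Hypothetical helper: replace an identifier with a keyed hash before logging.
// HMAC-SHA256 is an assumption; the talk only says to hash the IDs.
function anonymizeId($id, $secret)
{
    return hash_hmac('sha256', (string) $id, $secret);
}

$secret = 'replace-with-a-real-secret'; // assumption: kept outside the log pipeline

$a = anonymizeId(12345, $secret);
$b = anonymizeId(12345, $secret);
$c = anonymizeId(12346, $secret);
```

The same ID always maps to the same value, so per-user aggregation still works on the hashed column.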
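To summarize: record one record per access; decide on a basic schema; denormalize the attributes the application can know about and include them in the record.
At this point, analysis is already possible.
Analysis example:
    SELECT AVG(v['process_time'])
    FROM access
    WHERE v['route'] = 'crocos_index'
Analysis example:
    SELECT v['gender'], COUNT(*)
    FROM access
    GROUP BY v['gender']
Good thing we denormalized!
Analysis example, also useful for investigating errors:
    SELECT v['route'], v['status'], v['ua']
    FROM access
    WHERE v['user_id'] = 'xxx'
Note: time-related handling is omitted to keep the queries short. In reality you would GROUP BY day, narrow the range in the WHERE clause, and so on.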
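To make the omitted per-day grouping concrete, here is a small in-memory PHP sketch of what grouping by day amounts to. It is not TD or Hive syntax; `countByDay` is a hypothetical function, and bucketing by UTC day is an assumption (a real report would likely use the service's local time zone).

```php
<?php
// Count records per UTC day, roughly what a GROUP BY on a day column does.
function countByDay(array $records)
{
    $counts = array();
    foreach ($records as $record) {
        // 'time' is a Unix timestamp, as in the sample record
        $day = gmdate('Y-m-d', (int) $record['time']);
        $counts[$day] = isset($counts[$day]) ? $counts[$day] + 1 : 1;
    }
    ksort($counts); // chronological order, since the keys are ISO dates
    return $counts;
}

$counts = countByDay(array(
    array('time' => 1373876885),         // 2013-07-15 08:28:05 UTC
    array('time' => 1373876885 + 60),    // one minute later, same day
    array('time' => 1373876885 + 86400), // the next day
));
```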
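Using schemaless logs: transactions
So, logs with a basic schema have started to accumulate.
Now we want to record events with special meaning, such as the success of an action.
Transactions: uri and route tell you that a request arrived. Whether it actually succeeded, though, is something only the application knows.
This is where schemaless comes in.
Basic schema: time, status, uri, ua, referrer. Additional schema: whatever else you need.
You can give specific records a special meaning, and without affecting any other records.
Transactions: key_action, key_attr_*
key_action, e.g. shop:buy:completed, has the form app:action:state. This example means "purchase completed".
key_attr_*: stuff in any additional information related to the transaction. The schema differs for each key_action.
Transaction example:
    key_action = shop:buy:completed
    key_attr_item_id = xxxxx
    key_attr_ref = fb_share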
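One way to attach these fields in application code is a small helper that extends an ordinary access record. This is only a sketch: `withTransaction` is a hypothetical name and the sample `uri` is made up; only the `key_action` and `key_attr_*` naming comes from the slides.

```php
<?php
// Hypothetical helper: mark a record as a transaction by adding
// key_action plus key_attr_* fields. The input record is not modified.
function withTransaction(array $record, $action, array $attrs = array())
{
    $record['key_action'] = $action;
    foreach ($attrs as $name => $value) {
        $record['key_attr_' . $name] = $value;
    }
    return $record;
}

// Sample base record (the uri is an invented example value)
$base = array('time' => 1373876885, 'status' => 200, 'uri' => '/shop/buy');

$tx = withTransaction(
    $base,
    'shop:buy:completed',
    array('item_id' => 'xxxxx', 'ref' => 'fb_share')
);
```

Records without a transaction simply lack these keys, so the basic schema stays the same for every access.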
Transaction analysis example:
    SELECT item_id, ref, COUNT(*)
    FROM access
    WHERE key_action = 'shop:buy:completed'
    GROUP BY item_id, ref
Note: v[''] is omitted here to save characters.
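Transaction analysis, use case: record the access source for each campaign, then find the most effective campaign from the number of successful transactions.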
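The same idea can be tried on a handful of records in plain PHP before writing the real query. The sketch below counts completed transactions per access source; `countCompletedByRef` and the sample `mail` source are hypothetical, and the field names follow the `key_action` / `key_attr_ref` convention.

```php
<?php
// Count successful transactions per access source (key_attr_ref),
// most effective source first.
function countCompletedByRef(array $records, $action)
{
    $counts = array();
    foreach ($records as $r) {
        if (!isset($r['key_action']) || $r['key_action'] !== $action) {
            continue; // ordinary access record, or a different transaction
        }
        $ref = isset($r['key_attr_ref']) ? $r['key_attr_ref'] : '(none)';
        $counts[$ref] = isset($counts[$ref]) ? $counts[$ref] + 1 : 1;
    }
    arsort($counts); // highest count first, keys preserved
    return $counts;
}

$records = array(
    array('key_action' => 'shop:buy:completed', 'key_attr_ref' => 'mail'),
    array('key_action' => 'shop:buy:completed', 'key_attr_ref' => 'fb_share'),
    array('key_action' => 'shop:buy:completed', 'key_attr_ref' => 'fb_share'),
    array('uri' => '/52495/facebook'), // plain access, no transaction
);

$byRef = countCompletedByRef($records, 'shop:buy:completed');
```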
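NEXT STEP
From the aggregated results, move on to statistical analysis methods and modeling: compute the metrics that are critical to the business and establish an improvement process.
Summary
Collecting and analyzing logs is hard. → Use Fluentd / Hadoop. → Use Treasure Data.
What logs should you collect? → One record per access, denormalized. → Design the log format itself. → Make use of schemalessness.
Finally: We are hiring. Crocos is the only place where you can work alongside Perfect PHP authors, ex-PHP Conference committee members, ex-"himote" (formerly unpopular) people, and an ex-"Dora-musume".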