Sotaro Karasawa
August 09, 2013
Design Patterns for Collecting and Analyzing Schemaless Log
スキーマレスなログデータの収集と集計のためのデザインパターン
at http://www.zusaar.com/event/876003
Transcript
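Crocos, Inc. Sotaro Karasawa @sotarok http://facebook.com/sotarok Design Patterns for Collecting and Aggregating Schemaless Log Data
#ds2013 or: Treasure Data Hyper-Utilization Techniques
About me: Sotaro Karasawa, @sotarok, d.hatena.ne.jp/sotarok, Crocos Inc., PHP, Red Bull
The company: … employees, … developers at the time; service launched in month …; TD introduced in month …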
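Today's topics: how to collect application logs; what a schemaless log is; basic log design; design patterns for log collection
Today's goal: sharing and exchanging information. This is one example of how we do it. I am not here as a lecturer to tell you how it should be done. I hope topics like this become something we can discuss from here on.
This talk is mainly about web applications, although web applications themselves are diversifying. I may be able to touch on a few such cases in the design patterns later.
Q. Do you collect logs? Q. Do you use fluentd? Q. Do you use TD?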
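What kinds of logs do you collect?
Web server logs
Speaking of logs, most people think of web server logs. The Treasure Data tutorial also uses Apache logs: http://docs.treasure-data.com/articles/quickstart
But what we really want to know is:
Which user? In what environment, and from where? When did they do what? Which button did they click or tap?
Application logs
Which user? → user registration data. In what environment, and from where? → UA, GEO. When did they do what? → URI, action.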
How to collect application logs
What is a schemaless log? A log with no schema.
Log schemas so far: for example, TSV
Schema: the columns time, status, uri, user_id, each identified by its index
with open('app.log', 'r') as f:
    for line in f:
        columns = line.rstrip("\n").split("\t")
        time = columns[0]  # the reader has to know that index 0 is the time column
        ...
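Fields are hard to understand; schema changes are hard; accidents happen because analysts and collectors understand the data differently
TD logs (or rather, fluentd logs): JSON
{
  "time": 1373876885,
  "status": 200,
  "request_uri": "/52495/facebook",
  "session_id": "kn6avn2fuh21r25a65mgm3rjh3",
  "fb_id": "7c40c5dd2e55cde37a8c40ed80e1",
  ...
}
Easy to understand; fields can be added; but the data volume grows...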
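The contrast between the positional TSV schema and the named JSON fields can be sketched as follows. This is only an illustration: the sample lines are hypothetical, with values borrowed from the JSON slide, and the column order in the TSV line is an assumption.

```python
import json

# Hypothetical sample lines; the TSV column order is an assumption.
tsv_line = "1373876885\t200\t/52495/facebook\n"
json_line = '{"time": 1373876885, "status": 200, "request_uri": "/52495/facebook"}'

# TSV: the reader must already know that position 1 means "status".
columns = tsv_line.rstrip("\n").split("\t")
status_from_tsv = int(columns[1])

# JSON: the field name travels with the value.
record = json.loads(json_line)
status_from_json = record["status"]

# A new field can be added without shifting any existing position.
record["session_id"] = "kn6avn2fuh21r25a65mgm3rjh3"

print(status_from_tsv, status_from_json, sorted(record))
```

The price is the one the slide names: every record repeats its field names, so the data volume grows.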
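How to collect application logs
Basic log design
Record so that each event becomes one record. *For a web application, that means each access.
What is an event? For a web application: an access. For a native app: an event.
Decide on a basic schema
Even if the log is schemaless, be clear about what kind of log you are handling. It is pointless if each record means something different.
Decide on a basic schema: time, status, uri, ua, referrer. Matching LTSV-style field names may make this easier to follow.
Add more than what the web server log has: app, route, controller, process_time, device. For example, the framework's routing name or controller name (even if uri is noisy, you can aggregate by routing name).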
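Denormalize the attributes the application can know and include them in the record
Denormalized record: session_id, user_id, gender, age, device
Why denormalize? The benefit: you can apply aggregate functions without a JOIN.
Incidentally, hash values such as user_id and session_id.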
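A minimal sketch of building one denormalized record per access, with the identifiers hashed. The slides do not say which hash function or salt was used, so salted SHA-256 is an assumption here, and `SALT`, `hash_id`, and `build_access_record` are hypothetical names.

```python
import hashlib

SALT = "example-salt"  # hypothetical; a real deployment would keep this secret


def hash_id(raw_id: str) -> str:
    """Return a salted SHA-256 hex digest so raw IDs never reach the log."""
    return hashlib.sha256((SALT + raw_id).encode("utf-8")).hexdigest()


def build_access_record(request: dict, user: dict) -> dict:
    """Build one record per access, copying user attributes in (denormalized)."""
    return {
        # basic schema
        "time": request["time"],
        "status": request["status"],
        "uri": request["uri"],
        "ua": request["ua"],
        "referrer": request["referrer"],
        # fields only the application knows
        "route": request["route"],
        "process_time": request["process_time"],
        # denormalized user attributes, identifiers hashed
        "session_id": hash_id(request["session_id"]),
        "user_id": hash_id(user["id"]),
        "gender": user["gender"],
        "age": user["age"],
    }


# Hypothetical inputs for illustration.
request = {
    "time": 1373876885, "status": 200, "uri": "/52495/facebook",
    "ua": "Mozilla/5.0", "referrer": "", "route": "crocos_index",
    "process_time": 0.25, "session_id": "kn6avn2fuh21r25a65mgm3rjh3",
}
user = {"id": "42", "gender": "f", "age": 30}

record = build_access_record(request, user)
print(record["user_id"][:12], record["gender"], record["route"])
```

Because gender and age sit in the same record as route and process_time, a later query can group by them directly instead of joining against a user table.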
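To summarize: record so that each event becomes one record (so that aggregate functions can be applied); decide on a basic schema; denormalize the attributes the application can know and include them in the record.
Once you have come this far, analysis is already possible.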
Analysis example: SELECT AVG(v['process_time']) FROM access WHERE v['route'] = 'crocos_index'
Analysis example: SELECT v['gender'], COUNT(*) FROM access GROUP BY v['gender'] (Glad we denormalized!)
Analysis example, for investigating an error: SELECT v['route'], v['status'], v['ua'] FROM access WHERE v['user_id'] = 'xxx'
* Date-related handling is omitted to keep the queries short. In practice you would GROUP BY day or narrow the range in the WHERE clause.
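For readers without a TD account, the first two queries can be mimicked in plain Python over a handful of in-memory records. This is a sketch of what the queries compute, not how TD executes them; the records are made up.

```python
from collections import Counter
from statistics import mean

# Hypothetical denormalized access records.
records = [
    {"route": "crocos_index", "process_time": 0.25, "gender": "f"},
    {"route": "crocos_index", "process_time": 0.75, "gender": "m"},
    {"route": "shop_item", "process_time": 1.5, "gender": "f"},
]

# SELECT AVG(v['process_time']) FROM access WHERE v['route'] = 'crocos_index'
avg_process_time = mean(
    r["process_time"] for r in records if r["route"] == "crocos_index"
)

# SELECT v['gender'], COUNT(*) FROM access GROUP BY v['gender']
gender_counts = Counter(r["gender"] for r in records)

print(avg_process_time, dict(gender_counts))
```

Neither computation needs a second table, which is the point of the denormalization step.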
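Thinking about design patterns for schemaless application logs
So, logs with a basic schema have started to accumulate.
From here on: depending on what you want to analyze, which data should you put into the log? Let's consider this pattern by pattern.
Where schemaless comes in
Basic schema: time, status, uri, ua, referrer, plus this and that. You can give specific records a special meaning, and without affecting any other record.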
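Pattern: Transaction
We want to record things such as the success of an action that has special meaning.
Transaction: uri and route tell you that a request arrived. Whether it actually succeeded, however, is known only to the application.
Transaction: key_action, key_attr_*
Transaction: key_action, e.g. present:entry:completed, in the form app:action:status. *This example means "registration completed".
Transaction: key_attr_* holds additional information related to the transaction. Its schema differs for each key_action.
Transaction example: key_action = shop:register:completed, key_attr_user_id = xxxxx, key_attr_ref = fb_share
Transaction analysis example: SELECT v['key_attr_ref'], COUNT(*) FROM access WHERE v['key_action'] = '...' GROUP BY v['key_attr_ref']
Transaction analysis: data I have been looking at a lot lately ... which campaign was the most effective?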
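The transaction pattern might be sketched like this: the application adds key_action and key_attr_* fields only to the access records where something meaningful succeeded. `with_transaction` is a hypothetical helper, and the sample values are placeholders.

```python
from collections import Counter


def with_transaction(record: dict, app: str, action: str, status: str, **attrs) -> dict:
    """Return a copy of record with key_action and key_attr_* fields added."""
    enriched = dict(record)
    enriched["key_action"] = f"{app}:{action}:{status}"
    for name, value in attrs.items():
        enriched[f"key_attr_{name}"] = value
    return enriched


base = {"time": 1373876885, "status": 200, "uri": "/register"}
access_log = [
    with_transaction(base, "shop", "register", "completed", user_id="xxxxx", ref="fb_share"),
    with_transaction(base, "shop", "register", "completed", user_id="yyyyy", ref="fb_share"),
    with_transaction(base, "shop", "register", "completed", user_id="zzzzz", ref="tweet"),
    base,  # an ordinary access carries no transaction fields
]

# Same shape as the SQL on the slide: completed registrations per referrer.
ref_counts = Counter(
    r["key_attr_ref"]
    for r in access_log
    if r.get("key_action") == "shop:register:completed"
)

print(dict(ref_counts))
```

Ordinary access records are untouched, which is the "without affecting any other record" property of the schemaless approach.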
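Pattern: Event
We want to know about events that happen independently of an access.
Events fired by JavaScript; a modal being displayed; shares to Twitter or Facebook; native apps
Event: tag = app:action:location & some attributes
Event example: tag = shop:tweet:shop_item, item_id = 1234, tweet_id = xxxxx
Event analysis example: SELECT v['item_id'], COUNT(*) FROM events WHERE v['tag'] = 'shop:tweet:shop_item' GROUP BY v['item_id']
In practice the mechanism is no different from the transaction pattern.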
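A small sketch of the event pattern, assuming the tag always has exactly three colon-separated parts. `Tag` and `parse_tag` are hypothetical names, and the event records are placeholders.

```python
from collections import Counter
from typing import NamedTuple


class Tag(NamedTuple):
    app: str
    action: str
    location: str


def parse_tag(tag: str) -> Tag:
    """Split an app:action:location tag into its three parts."""
    parts = tag.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected app:action:location, got {tag!r}")
    return Tag(*parts)


# Hypothetical event records; the attributes differ from tag to tag.
events = [
    {"tag": "shop:tweet:shop_item", "item_id": 1234, "tweet_id": "xxxxx"},
    {"tag": "shop:tweet:shop_item", "item_id": 1234, "tweet_id": "yyyyy"},
    {"tag": "shop:modal:top", "modal": "signup"},
]

# Same shape as the SQL on the slide: tweet events per item.
item_counts = Counter(
    e["item_id"] for e in events if e["tag"] == "shop:tweet:shop_item"
)

print(parse_tag("shop:tweet:shop_item").action, dict(item_counts))
```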
The most important thing in handling schemaless logs is deciding the rules of interpretation.
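One way to make such rules explicit is a small table that says which key_attr_* fields each key_action must carry, checked before a record is sent. This is an illustration only; the talk does not describe any such checker, and `RULES` and `check_record` are hypothetical.

```python
# Hypothetical rule table: required key_attr_* fields per key_action.
RULES = {
    "shop:register:completed": {"key_attr_user_id", "key_attr_ref"},
}


def check_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record follows the rules."""
    action = record.get("key_action")
    if action is None:
        return []  # ordinary access record, nothing to check
    if action not in RULES:
        return [f"unknown key_action: {action}"]
    present = {key for key in record if key.startswith("key_attr_")}
    missing = RULES[action] - present
    return [f"missing field: {name}" for name in sorted(missing)]


good = {
    "key_action": "shop:register:completed",
    "key_attr_user_id": "xxxxx",
    "key_attr_ref": "fb_share",
}
incomplete = {"key_action": "shop:register:completed", "key_attr_user_id": "xxxxx"}

print(check_record(good), check_record(incomplete))
```

Keeping the table in one shared place gives collectors and analysts the same definition of each record type.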
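I am probably out of time, so I will stop around here.
What I would like us to share: in this kind of situation, collect the data this way and analyze it this way.
Points to be careful about
Even though the log is schemaless, design it carefully up front. Once data is in the log it is hard to change, so check that no field you want to analyze is missing. Be careful about privacy.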