Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Design Patterns for Collecting and Analyzing Sc...
Search
Sotaro Karasawa
August 09, 2013
Technology
5
870
Design Patterns for Collecting and Analyzing Schemaless Log
スキーマレスなログデータの収集と集計のためのデザインパターン
at
http://www.zusaar.com/event/876003
Sotaro Karasawa
August 09, 2013
Tweet
Share
More Decks by Sotaro Karasawa
See All by Sotaro Karasawa
大「個人開発サービス」時代に僕たちはどう生きるか
sotarok
22
12k
P2B Haus法人サポータープランのご提案
sotarok
2
1.6k
ソフトウェアxスタートアップから見た飲食と配送の世界 / The World of Food Deliverlies and Restaurant Businesses from a Software and Startup Perspective
sotarok
2
1.3k
CTO 3度目の正直 / My 3rd CTO Career
sotarok
21
11k
Introduction to the Corporate Solutions Engineering at MTC2018
sotarok
1
36k
Mercari meetup for Corporate Engineering #1 / What is "Corporate Engineering"?
sotarok
2
2.4k
Markdown and WYSIWYG
sotarok
1
6.3k
20 Jan 2017 / Moving Beyond Borders - Mercari DAY
sotarok
8
15k
PHPBLT の心得 / PHPBLT #5 @ペパボ
sotarok
5
3.6k
Other Decks in Technology
See All in Technology
みんなでAI上手ピーポーになろう! / Let’s All Get AI-Savvy!
kaminashi
0
160
フルカイテン株式会社 エンジニア向け採用資料
fullkaiten
0
10k
スクラムを一度諦めたチームにアジャイルコーチが入ってどう変化したか / A Team's Second Try at Scrum with an Agile Coach
kaonavi
0
270
AIAgentを駆使してSREが貢献する開発体験の向上
yoshiiryo1
1
100
なぜCREを8年間続けているのか / cre-camp-4-2026-01-21
missasan
0
110
Bill One 開発エンジニア 紹介資料
sansan33
PRO
4
17k
歴史から学ぶ、Goのメモリ管理基礎
logica0419
14
2.9k
Kaggleコンペティション「MABe Challenge - Social Action Recognition in Mice」振り返り
yu4u
1
600
RALGO : AIを組織に組み込む方法 -アルゴリズム中心組織設計- #RSGT2026 / RALGO: How to Integrate AI into an Organization – Algorithm-Centric Organizational Design
kyonmm
PRO
3
1.5k
1万人を変え日本を変える!!多層構造型ふりかえりの大規模組織変革 / 20260108 Kazuki Mori
shift_evolve
PRO
6
1.6k
kintone開発のプラットフォームエンジニアの紹介
cybozuinsideout
PRO
0
540
Introduction to Sansan, inc / Sansan Global Development Center, Inc.
sansan33
PRO
0
2.9k
Featured
See All Featured
Art, The Web, and Tiny UX
lynnandtonic
304
21k
Principles of Awesome APIs and How to Build Them.
keavy
127
17k
Digital Ethics as a Driver of Design Innovation
axbom
PRO
0
140
Jamie Indigo - Trashchat’s Guide to Black Boxes: Technical SEO Tactics for LLMs
techseoconnect
PRO
0
43
Faster Mobile Websites
deanohume
310
31k
How to Ace a Technical Interview
jacobian
281
24k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
47
7.9k
The untapped power of vector embeddings
frankvandijk
1
1.5k
SEOcharity - Dark patterns in SEO and UX: How to avoid them and build a more ethical web
sarafernandez
0
110
The Mindset for Success: Future Career Progression
greggifford
PRO
0
220
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.6k
The Cost Of JavaScript in 2023
addyosmani
55
9.4k
Transcript
Crocos, Inc. Sotaro Karasawa @sotarok http://facebook.com/sotarok εΩʔϚϨεͳ ϩάσʔλͷ ऩूͱूܭͷͨΊͷ σβΠϯύλʔϯ
#ds2013 ·ͨ Treasure Data ϋΠύʔ׆༻ज़
ࣗݾհ 4PUBSP,BSBTBXB!TPUBSPL ฑ૱ଠ EIBUFOBOFKQTPUBSPL גࣜձࣾΫϩίε$SPDPT*OD 1)1 3FE#VMM
ࣾһਓͰۀ ։ൃऀ࣌ਓ ݄ʹαʔϏεϩʔϯν ݄ʹ5%ಋೖ
ࠓ͍ͨ͜͠ͱ ΞϓϦέʔγϣϯϩάΛͲ͏ू ΊΔ͔ εΩʔϚϨεͳϩάͱ جຊతͳϩάઃܭ ϩάऩूͷσβΠϯύλʔϯ
ࠓͷత ใڞ༗ɺใަ ͏ͪͰ͜͏ͬͯΔΑɺͱ͍͏Ұྫ ܾͯ͠ߨࢣͱͯ͠ɺ͜͏Γ·͠ΐ͏ͱݴ ͍ʹདྷͨΘ͚Ͱͳ͘ɻ ࠓޙɺ͜͏͍͏ωλ͕σΟεΧογϣϯͰ ͖Ε͍͍ͳͱ
ओʹ 8FCΞϓϦέʔγϣϯ ͷ Ͱ͕͢ɺ8FCΞϓϦέʔγϣϯଟ༷Խ͠ ͍ͯ·͢ ޙͷσβΠϯύλʔϯͷͳ͔Ͱ͍͔ͭ͘ ৮ΕΒΕΔ͔ͳʁ
2ϩάऩूΛ͍ͯ͠Δ 2qVFOUEΛ͍ͬͯΔ 25%Λ͍ͬͯΔ
ͲΜͳϩάΛूΊͯΔʁ
8FCαʔόͷϩά
ϩάͱ͍͑ 8FCαʔόʔͷϩά 5SFBTVSF%BUBͷνϡʔτϦ Ξϧ"QBDIFͷϩά http://docs.treasure-data.com/articles/quickstart
͚ͩͲຊʹཉ͍͠ͷ
ͲΜͳϢʔβʔ͕ʁ ͲΜͳͰʁͲ͔͜Βʁ ͍ͭԿΛͨ͠ͷ͔ʁ ͲΜͳϘλϯΛΫϦοΫͨ͠ ͷ͔ʁλοϓͨ͠ͷ͔ʁ
ΞϓϦέʔγϣϯϩά
ͲΜͳϢʔβʔ͕ʁ ɹˠϢʔβʔొใ ͲΜͳͰʁͲ͔͜Βʁ ɹˠ6"(&0 ͍ͭԿΛͨ͠ͷ͔ʁ ɹˠ63*ΞΫγϣϯ
ΞϓϦέʔγϣϯϩάΛ Ͳ͏ूΊΔ͔
εΩʔϚϨεϩάͱʁ
εΩʔϚϨεϩάͱʁ εΩʔϚͷແ͍ϩά
ϩάͷεΩʔϚ ͜Ε·Ͱ ˠྫ͑547
ΧϥϜUJNF ΧϥϜTUBUVT ΧϥϜVSJ ΧϥϜVTFS@JE JOEFY
ΧϥϜUJNF ΧϥϜTUBUVT ΧϥϜVSJ ΧϥϜVTFS@JE JOEFY εΩʔϚ
for line in open('app.log', 'r'): columns = line.split("\t") time =
columns[0] ...
߲ͷΘ͔ΓͮΒ͞ εΩʔϚมߋͷ͠͞ ੳऀͱऩूऀͷೝࣝࠩҟʹ ΑΔࣄނ
5%ͷϩά ͱ͍͏͔qVFOUE +40/ { "time":1373876885, "status":200, "request_uri":"/52495/facebook", "session_id":"kn6avn2fuh21r25a65mgm3rjh3", "fb_id":"7c40c5dd2e55cde37a8c40ed80e1", ...
}
Θ͔Γ͍͢ ߲ΛՃͰ͖Δ σʔλྔ૿͑Δɾ
ΞϓϦέʔγϣϯϩάΛ Ͳ͏ूΊΔ͔
جຊతͳϩάઃܭ
ΠϕϯτϨίʔυͱͳΔΑ ͏ʹه͢Δ ˞8FCΞϓϦέʔγϣϯͷ߹ɺΞΫηε
Πϕϯτͱ 8FCΞϓϦέʔγϣϯͳΒ ɾΞΫηε ωΠςΟϒΞϓϦͳΒ ɾΠϕϯτ
جຊతͳεΩʔϚΛܾΊΔ
εΩʔϚϨεͱ͍ͬͯ Ͳ͏͍͏ϩάΛѻ͍ͬͯΔͷ͔ ֤ϨίʔυͰҙຯ͕ҧͬͯҙ ຯ͕ແ͍
جຊతͳεΩʔϚΛܾΊΔ UJNF TUBUVT VSJ VB SFGFSSFS
جຊతͳεΩʔϚΛܾΊΔ UJNF TUBUVT VSJ VB SFGFSSFS LTSVͬΆ໊͍લʹ ߹Θ͓ͤͯ͘ͱΘ ͔Γ͍͔͢
8FCαʔόʹ͋Δϩά ͚ͩͰͳ͘ BQQ SPVUF DPOUSPMMFS QSPDFTT@UJNF EFWJDF
8FCαʔόʹ͋Δϩά ͚ͩͰͳ͘ BQQ SPVUF DPOUSPMMFS QSPDFTT@UJNF EFWJDF ϑϨʔϜϫʔΫͰͷϧʔ ςΟϯά໊ͱ͔ɺίϯτ
ϩʔϥ໊ͱ͔ (uri ʹϊΠζ͕͋ͬͯ routing ໊ͰूܭͰ͖Δ)
ΞϓϦέʔγϣϯͷΓ͏Δ ଐੑΛඇਖ਼نԽͯ͠Ϩίʔυ ʹؚΊΔ
ඇਖ਼نԽ͞ΕͨϨίʔυ TFTTJPO@JE VTFS@JE HFOEFS BHF EFWJDF
ͳͥඇਖ਼نԽ͔ͷϝϦοτ +0*/ͤͣʹूܭؔʹ͔ΔͨΊ
ͪͳΈʹ VTFS@JE TFTTJPO@JE ͳͲIBTIԽ͓ͯ͘͠
·ͱΊΔͱ ΠϕϯτϨίʔυͱͳΔΑ͏ ʹه͢Δूܭؔʹ͔ΔͨΊ جຊతͳεΩʔϚΛܾΊΔ ΞϓϦέʔγϣϯͷΓ͏Δଐ ੑΛඇਖ਼نԽͯ͠ϨίʔυʹؚΊΔ
͜͜·ͰདྷΔͱɺ͏ੳ͕Մೳ
ੳͷྫ SELECT AVG(v[‘process_time’]) FROM access WHERE v[‘route’] = ‘crocos_index’
ੳͷྫ SELECT v[‘gender’], COUNT(*) FROM access GROUP BY v[‘gender’] ඇਖ਼نԽ͓͍ͯ͠
ͯΑ͔ͬͨʂ
ੳͷྫ SELECT v[‘gender’], COUNT(*) FROM access GROUP BY v[‘gender’]
ੳͷྫ Τϥʔͷௐࠪʹ SELECT v[‘route’], v[‘status’], v[‘ua’] FROM access WHERE v[‘user_id’]
= ‘xxx’
˞͘ͳΔͷͰؔ࿈ͷॲཧলུͯ͠·͢ ɹຊผʹ(3061#:ͨ͠Γ8&)&3۟ͰߜͬͨΓ
εΩʔϚϨεͳ ΞϓϦέʔγϣϯϩά ͷͨΊͷ σβΠϯύλʔϯ Λߟ͑Δ
ͯ͞ جຊతͳεΩʔϚΛ࣋ͭ ϩά͕ͨ·Γ࢝Ί·ͨ͠
͔͜͜Βઌ ԿΛੳΛ͍ͨ͠߹ʹ ͲΜͳϩάΛೖΕ͓͚ͯྑ ͍͔ ύλʔϯʹ͚ͯߟ͑·͢
εΩʔϚϨεͷग़൪
جຊతͳεΩʔϚ UJNF TUBUVT VSJ VB SFGFSSFS ͳΜͪΌΒ ͔ΜͪΌΒ
جຊతͳεΩʔϚ UJNF TUBUVT VSJ VB SFGFSSFS ͳΜͪΌΒ ͔ΜͪΌΒ ಛఆͷϨίʔυʹɺಛ
ผͳҙຯΛͨͤΔ͜ͱ ͕Ͱ͖Δʂ ͔͠ଞͷϨίʔυʹӨ ڹΛ͋ͨ͑Δ͜ͱͳ͘ɻ
ύλʔϯ τϥϯβΫγϣϯ
ಛผͳҙຯΛ࣋ͭ ΞΫγϣϯͷޭͳͲΛ ه͍ͨ͠
τϥϯβΫγϣϯ uri route: ϦΫΤετ͕དྷͨ͜ͱΘ͔Δ ͔͠͠ɺຊʹޭ͔ͨ͠ɺ ΞϓϦέʔγϣϯͰ͔͠Θ͔Β ͳ͍
τϥϯβΫγϣϯ key_action key_attr_*
τϥϯβΫγϣϯ key_action present:entry:completed ΞϓϦ:ಈ࡞:ঢ়گ ※͜ͷྫʮొྃʯ
τϥϯβΫγϣϯ key_attr_* τϥϯβΫγϣϯʹؔΘΔՃ తͳใΛͭͬ͜Ή εΩʔϚɺkey_action ͝ͱʹ ҟͳΔ
τϥϯβΫγϣϯྫ key_action = shop:register:completed key_attr_user_id = xxxxx key_attr_ref = fb_share
τϥϯβΫγϣϯੳͷྫ SELECT v[‘key_attr_ref’], COUNT(*) FROM access WHERE v[‘key_action’] = ‘...’
GROUP BY v[‘key_attr_ref’]
τϥϯβΫγϣϯੳ ࠷ۙΑ͘ݟͯΔσʔλ ... Ͳͷࢪࡦ͕Ұ൪ޮ͍ͨͷ͔
ύλʔϯ Πϕϯτ
ΞΫηεʹґଘ͠ͳ͍ ΠϕϯτͷൃੜΛΓ͍ͨ
ɾ+BWB4DSJQUʹΑΔΠϕϯτ ɾϞʔμϧͷදࣔ ɾ5XJUUFS'BDFCPPLͷ γΣΞ ɾωΠςΟϒΞϓϦ
Πϕϯτ tag = app:action:location & some attributes
Πϕϯτྫ tag = shop:tweet:shop_item item_id = 1234 tweet_id = xxxxx
Πϕϯτੳͷྫ SELECT v[‘item_id’], COUNT(*) FROM events WHERE v[‘tag’] = ‘shop:tweet:shop_item’
GROUP BY v[‘item_id’]
τϥϯβΫγϣϯͱ ࣮Έ͔ΘΒͳ͍
εΩʔϚϨεϩάͷѻ͍ํͰ ࠷ॏཁͳͷ ղऍͷϧʔϧΛܾΊΔ͜ͱ
ଟ͕࣌ؒແ͍ͷͰ ͜ͷΜͰ
͜͏͍͏࣌ʹ ͜͏͍͏෩ʹσʔλͷूΊͯ ͜͏ղੳ͠Α͏ ͱ͍͏ͷΛڞ༗͍ͨ͠
ҙ͍ͨ͠ͱ͜Ζ
εΩʔϚϨεͱ͍͑Ͳ ࣄલͷϩάઃܭΛ͔ͬ͠ΓΔ ϩάҰೖΕΔͱมߋ͕͍͠ ˠੳ͍߲ͨ͠ͷ࿙Ε͕ແ͍͔ ϓϥΠόγʔʹؾΛ͚ͭΔ
None