Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Gunosyのデータ分析基盤、ログ基盤の全容
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
moyomot
February 28, 2017
9.7k
14
Share
Embed
Copy iframe code
Copy JS code
Copy link
Start on current slide
Gunosyのデータ分析基盤、ログ基盤の全容
moyomot
February 28, 2017
More Decks by moyomot
See All by moyomot
DRIVE CHARTのMLOpsを体感しよう
moyomot
0
200
現場課題に向き合い MLOps成熟度を高める道
moyomot
1
1.1k
第1回 Data-Centric AI勉強会 LT: AIドラレコを支える一貫性のあるデータの作り方
moyomot
0
1.1k
DRIVE CHARTにおけるAI開発とアーキテクチャ全容
moyomot
0
1.3k
これからの強化学習2.7
moyomot
0
150
これからの強化学習2.6
moyomot
0
220
GunosyにおけるSparkStreaming活用事例
moyomot
1
5.4k
トピックモデル第2章
moyomot
0
340
adhoc analysis apache spark
moyomot
1
1.2k
Featured
See All Featured
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
17k
SEO Brein meetup: CTRL+C is not how to scale international SEO
lindahogenes
1
2.7k
Tell your own story through comics
letsgokoyo
1
960
Jamie Indigo - Trashchat’s Guide to Black Boxes: Technical SEO Tactics for LLMs
techseoconnect
PRO
0
170
Information Architects: The Missing Link in Design Systems
soysaucechin
0
970
Building AI with AI
inesmontani
PRO
1
1.1k
Discover your Explorer Soul
emna__ayadi
2
1.1k
Why You Should Never Use an ORM
jnunemaker
PRO
61
9.9k
Lightning Talk: Beautiful Slides for Beginners
inesmontani
PRO
2
580
Context Engineering - Making Every Token Count
addyosmani
9
970
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
133
19k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
31
10k
Transcript
Gunosyͷσʔλੳج൫ɺϩάج൫ͷશ༰ σʔλੳج൫Night #1 2017.2 ຊ ३࢘
©Gunosy Inc. 2 גࣜձࣾGunosy – ʮใΛੈքதͷਓʹ࠷దʹಧ͚Δʯ Gunosy ใΩϡϨʔγϣϯαʔϏεʮά ϊγʔʯͱ 20166݄1ʹKDDIגࣜձࣾͱڞಉͰϦϦʔεͨ͠
ແྉχϡʔε৴ΞϓϦʮχϡʔεύεʯΛఏڙ͢Δ ձࣾͰ͢ɻʮใΛੈքதͷਓʹ࠷దʹಧ͚ΔʯΛ Ϗδϣϯʹ׆ಈ͍ͯ͠·͢ɻ ωοτ্ʹଘࡏ͢Δ͞·͟·ͳใΛɺ ಠࣗͷΞϧΰϦζϜͰऩूɺධՁ͚Λߦ͍ Ϣʔβʔʹಧ͚·͢ɻ ใΩϡϨʔγϣϯαʔϏε ʮά ϊγʔʯ 200ഔମҎ্ͷχϡʔειʔεΛϕʔεʹɺ ৽ͨʹ։ൃͨ͠ใղੳɾ৴ٕज़Λ༻͍ͯࣗಈతʹ બఆͨ͠χϡʔεใΛϢʔβʔʹಧ͚·͢ɻ ແྉχϡʔε৴ΞϓϦ ʮχϡʔεύεʯ
©Gunosy Inc. 3 ࣗݾհ • ຊ ३࢘ • גࣜձࣾGunosy σʔλੳ෦
• هࣄ৴ΞϧΰϦζϜͷվળ • ϩάج൫ͷඋ
©Gunosy Inc. 4 ࠓ͓͢Δ͜ͱ • Gunosyσʔλੳ෦ͷओͳۀ • σʔλੳج൫ɺϩάج൫ͷίϯηϓτ • σʔλੳج൫ɺϩάج൫ͷશମ૾
• ֬ఆϩάج൫ • ใϩάج൫ • ͏·͍͍ͬͯ͘Δ͜ͱɾݱঢ়ͷ՝
©Gunosy Inc. 5 σʔλੳ෦ͷओͳۀ • ΞϓϦΛ༻͢ΔϢʔβʔͷຬ্ͷͨΊͷੳ • ࡏ࣌ؒɺܧଓ্ͷͨΊͷσʔλੳ • هࣄ৴ΞϧΰϦζϜͷվળ
• ΞϧΰϦζϜݕূɺϓϩμΫγϣϯίʔυͷ࣮ • KPIूܭ • ଞ෦ॺ͔ΒͷूܭґཔɺABςετͷఆ؍ଌ • KPIʹҟৗ͕͋Δ߹ͷݪҼڀ໌ • σʔλੳج൫ɺϩάج൫ͷඋ
©Gunosy Inc. 6 ߈ΊͷλεΫɺकΓͷλεΫ • ΞϓϦΛ༻͢ΔϢʔβʔͷຬ্ͷͨΊͷੳ • ࡏ࣌ؒɺܧଓ্ͷͨΊͷσʔλੳ • هࣄ৴ΞϧΰϦζϜͷվળ
• ΞϧΰϦζϜݕূɺϓϩμΫγϣϯίʔυͷ࣮ • KPIूܭ • ଞ෦ॺ͔ΒͷूܭґཔɺABςετͷఆ؍ଌ • KPIʹҟৗ͕͋Δ߹ͷݪҼڀ໌ • σʔλੳج൫ɺϩάج൫ͷඋ
©Gunosy Inc. 7 σʔλੳج൫ͷϙΠϯτ • कΓͷλεΫࣗಈԽͯ͠ɺ߈ΊͷλεΫʹྗ͍ͨ͠ • ͍͔ʹଟ͘ͷ߈ΊλεΫʹνϟϨϯδ͕Ͱ͖Δ͔ • ੜ͖ͬͨͷ͕ՁΛੜΈग़͢
• ͦͷͨΊͷج൫ΛͲͷΑ͏ʹͭ͘Δ͔ • Θ͔Γ͍͢μογϡϘʔυͷߏங • KPIʹҟৗ͕͋ͬͨ߹ͷݪҼڀ໌লྗԽ • େنσʔλͰεέʔϧͰ͖Δϩάج൫ • σʔλੳ͍͢͠ڥ
©Gunosy Inc. 8 Gunosyͷσʔλੳج൫ɺϩάج൫ • ϩάج൫શମ૾Λհ • ཁॴͰσʔλੳج൫͕ొ • Gunosyͷϩάج൫େ͖͘2ͭ
• ֬ఆϩάج൫ • KPI • هࣄ৴ΞϧΰϦζϜ • σʔλੳ • ใϩάج൫ • ใ༻KPIʢHourly Active Userʣ • هࣄ৴ΞϧΰϦζϜ
©Gunosy Inc. 9 ֬ఆϩάج൫ • શମ૾ Redshift ϩάαʔόʔ S3 SQS
ίϯόʔλʔ Fluentd BigQuery KPIόον αʔόʔ μογϡϘʔυ
©Gunosy Inc. 10 KPIμογϡϘʔυ • Redash • ༷ʑͳσʔλɾιʔεʹ౷ҰతʹΞΫηεͰ͖ΔWeb αʔϏε •
σʔλՄࢹԽπʔϧ • SQLͰ݁Ͱ͖ɺը໘ͷ࣮ෆཁ • DjangoΦϦδφϧμογϡϘʔυ • ϑϧεΫϥον࣮ͳͷͰɺࣗ༝ߴ͍ • ͍ʹ͑͠ΑΓར༻ • SQLͰ݁͠ͳ͍ࢦඪΛݟΔͱ͖ʹ࣮
©Gunosy Inc. 11 μογϡϘʔυͷ • 2ͭͷμογϡϘʔυͷॅΈ͚͕Ͱ͖͍ͯΔ • RedashͰඇΤϯδχΞूܭͰ͖Δ • SQLษڧձ
• SlackͰSQL෦ • ҟৗ࣌ʹݪҼڀ໌ΛμογϡϘʔυͷΈͰ݁Ͱ͖Δ • λΠϜϦʔʹҟৗʹؾ͚ͮΔ • SQLൃߦ͠ͳ͍ɺੜϩάݟͳ͍
©Gunosy Inc. 12 μογϡϘʔυͷ՝ • ҟৗ͔Ͳ͏͔ͷධՁࢦඪ͕ෆे • ҟৗ࣌μογϡϘʔυ͑͞ݟͨ͘ͳ͍ • Ͳͷࢦඪ͕ͳ͔ͥΛݕग़͠ɺSlackΞϥʔτ͢Δ
Έ͕΄͍͠
©Gunosy Inc. 13 RedshiftͱBigQueryͷ͍͚ • ҎલΑΓRedshiftΛ༻͖ͯͨ͠ • BigQuery2016ՆΑΓຊ֨ར༻։࢝ • RedshiftɺBigQuery྆ํʹ΄΅͓ͳ͡σʔλ͕ೖ͍ͬͯΔ
• Ұ෦ͷ͔ͳΓେ͖ͳσʔλBigQueryͷΈ • Redshiftʹґଘͨ͠KPIΫΤϦʢKPIूܭόονʣ͕ଟ • BigQueryͷҠߦఘΊͨ • ৽͍͠KPI࣌ؒͷ͔͔ΔΫΤϦBigQuery
©Gunosy Inc. 14 Redshiftͷྑ͍ͱ͜Ζ Good in Redshift: • ʢओ؍తʹʣγϯϓϧͳΫΤϦ •
Postgresqlϕʔε • created_at::date else: • ઃܭɺӡ༻ਏ͍ • distkeyͷઃܭ͍͠ • nodeՃ࣌ϝϯςφϯεϞʔυ
©Gunosy Inc. 15 BigQueryͷྑ͍ͱ͜Ζ Good in BigQuery: • ετϨʔδεέʔϥϒϧ else:
• Standard SQLͬͱαϙʔτͯ͠΄͍͠ • percentileΑ • SELECTΫΤϦͰςʔϒϧεΩϟϯ͢ΔύʔςΟγϣϯ ࢦఆͨ͘͠ͳ͍
©Gunosy Inc. 16 ࣗՈϩάίϯόʔλʔ • ੜϩάͷܗΛ୲ • JSONϩάΛύΠϓ۠Γͷϩάʹ • ΞϓϦόʔδϣϯɺσόΠε(iOS,
Android)ʹΑͬͯ ϑΥʔϚοτ͕ҟͳΔϩάͷܗ • ϩάܗ୭͕୲͢Δ͔ • ϩάαʔόʔʢ಄ͰΔ͔ʣ • ϩάίϯόʔλʔʢඌͰΔ͔ʣ • ϩάܗ͠ͳ͍ • ܗΛʮඌʯͰ࣮ࢪͨ͠΄͏͕ӡ༻ָ • ੜϩάɺܗϩά྆ऀΛอଘ͢Δӡ༻ָ͕
©Gunosy Inc. 17 ϩάઃܭΛϛεͬͨΒ • ᘳͳϩάઃܭ͍͠ • σόΠεɺόʔδϣϯɺ৽ػೳɺ৽UIͰࢥΘ͵ࠩҟ͕Ͱ ͯ͠·͏ •
Ͳ͔͜ͰέΞ͢Δඞཁ͕ੜ͡Δ • ϩάAPIαʔόʔʁ • ϩάίϯόʔλʔʁ • SQLΫΤϦʁ
©Gunosy Inc. 18 σʔλੳج൫ • πʔϧηοτJupyter+pandas+DB • ϩάΛ༻ͨ͠KPIʹؔ͢Δσʔλੳ • هࣄσʔλΛ༻ͨ͠ΞϧΰϦζϜվળͷͨΊͷੳ
Redshift BigQuery Pandas
©Gunosy Inc. 19 σʔλੳج൫ͷ՝ • ΞϧΰϦζϜվળͷͨΊͷੳج൫ෳσʔλɾιʔε ʹ·͕͍ͨͬͯΔ • هࣄσʔλ •
هࣄΫϥελϦϯάσʔλ • είΞϦϯάσʔλ • ౷ҰతʹΞΫηεͰ͖Δͱੳͷޮ্͕͢Δ • ϩάΛ༻ͨ͠ੳBigQueryҰຊͰ݁͢Δ
©Gunosy Inc. 20 ABςετج൫ɺΦϑϥΠϯςετج൫ • ΰδϥʢABςετج൫ʣ • ༝དྷɿABςετΛࢧ͢Δଘࡏ • ABςετͷׂɺ࣮ࢪঢ়گΛWebπʔϧͰཧ
• σϩϦΞϯʢΦϑϥΠϯςετج൫ʣ • աڈϩάΛ༻ͯ͠ΞϧΰϦζϜมߋͷྑ͠ѱ͠Λஅ ͢Δ • ABςετʢΦϯϥΠϯςετʣ࣌ؒΛཁ͢Δ • ධՁࢦඪͷબͼํ͕؊ • ݱࡏίϚϯυϥΠϯπʔϧͷΈ
©Gunosy Inc. 21 ใϩάج൫ • هࣄ৴ΞϧΰϦζϜͰ༻ • KPIใʢHourly Active UserʣͰ༻
• σʔλੳʹ༻ͯ͠ͳ͍ • Spark Streaming͔ΒKinesis AnalyticsҠߦத
©Gunosy Inc. 22 Spark StreamingΛ༻ͨ͠ใج൫ • Spark StreamingͰ࣌ؒͷϩάूܭΛ࣮ࢪ • ใΛ༻͢ΔγεςϜ͕ѻ͍͍͢Α͏ʹRDSʹอଘ
• ՝ • αʔόʔίετʢEMRίετʣ͕ͦΕͳΓ • ӡ༻ίετʢΫϥελཧίετʣ͕ͦΕͳΓ
©Gunosy Inc. 23 Kinesis AnalyticsΛ༻ͨ͠ใج൫ • ετϦʔϛϯάσʔλΛSQLΫΤϦͰूܭͰ͖ΔαʔϏε • ूܭ݁ՌFirehoseͰElasticsearchʹอଘ •
ΤϯδχΞ͕࣮͢ΔͷूܭΫΤϦͷΈ • ظؒͷӡ༻࣮͜Ε͔Β • ίετݮ
©Gunosy Inc. 24 ·ͱΊ • Gunosyσʔλੳج൫ɺϩάج൫ͷશ༰ʹ͍ͭͯհ • ֬ఆϩάج൫ • ใϩάج൫
• ߈ΊͷλεΫ • ੳ͍͢͠ڥͮ͘Γ • ෳͷσʔλιʔεʹ౷ҰతʹΞΫηε • େྔͷσʔλΛ༰қʹѻ͑Δڥ • कΓͷλεΫՄೳͳݶΓࣗಈԽ