The Full Picture of Gunosy's Data Analysis Platform and Log Platform
moyomot
February 28, 2017
Transcript
Slide 1: The Full Picture of Gunosy's Data Analysis Platform and Log Platform
Data Analysis Platform Night #1, February 2017
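©Gunosy Inc.
Slide 2: Gunosy Inc.: "Deliver information optimally to people around the world"
Gunosy is the company behind the information curation service "Gunosy" and the free news delivery app "NewsPass", which was released jointly with KDDI Corporation on June 1, 2016. Our vision is to "deliver information optimally to people around the world."
• Gunosy (information curation service): collects the wide variety of information available on the internet with proprietary algorithms, scores it, and delivers it to users.
• NewsPass (free news delivery app): draws on more than 200 news sources and uses newly developed information analysis and delivery technology to automatically select news for users.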
Slide 3: Self-introduction
• moyomot, Data Analysis Department, Gunosy Inc.
• Improving the article delivery algorithm
• Building and maintaining the log platform
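Slide 4: Today's topics
• Main responsibilities of the Gunosy Data Analysis Department
• Concept behind the data analysis platform and log platform
• Overall picture of the data analysis platform and log platform
  • Batch ("confirmed") log platform
  • Real-time ("flash report") log platform
• What is working well and what the current issues are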
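Slide 5: Main responsibilities of the Data Analysis Department
• Analysis to improve the satisfaction of app users
  • Data analysis to increase time spent in the app and retention
• Improving the article delivery algorithm
  • Validating algorithms and implementing production code
• KPI aggregation
  • Aggregation requests from other departments, routine monitoring of A/B tests
  • Root-cause investigation when a KPI looks abnormal
• Building and maintaining the data analysis platform and log platform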
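Slide 6: Offensive tasks and defensive tasks
• Analysis to improve the satisfaction of app users
  • Data analysis to increase time spent in the app and retention
• Improving the article delivery algorithm
  • Validating algorithms and implementing production code
• KPI aggregation
  • Aggregation requests from other departments, routine monitoring of A/B tests
  • Root-cause investigation when a KPI looks abnormal
• Building and maintaining the data analysis platform and log platform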
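Slide 7: Key points of the data analysis platform
• Automate the defensive tasks so that effort can go into the offensive tasks
  • The question is how many offensive tasks we can take on
  • The ones that survive are the ones that create value
• How should the platform be built to support that?
  • Dashboards that are easy to understand
  • Less manual effort for root-cause investigation when a KPI looks abnormal
  • A log platform that scales to large data volumes
  • An environment in which data analysis is easy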
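Slide 8: Gunosy's data analysis platform and log platform
• The log platform is presented as an overall picture
• The data analysis platform appears at key points along the way
• Gunosy's log platform has two main parts
  • Batch ("confirmed") log platform
    • KPIs
    • Article delivery algorithm
    • Data analysis
  • Real-time ("flash report") log platform
    • Real-time KPI (Hourly Active Users)
    • Article delivery algorithm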
Slide 9: Batch log platform
• Overall picture (architecture diagram; components shown: Redshift, log server, S3, SQS, converter, Fluentd, BigQuery, KPI batch server, dashboard)
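Slide 10: KPI dashboards
• Redash
  • A web service that gives uniform access to many different data sources
  • A data visualization tool
  • Everything is done in SQL; no screen implementation is needed
• In-house Django dashboard
  • Built from scratch, so it is highly flexible
  • In use since long ago
  • Used to implement metrics that cannot be expressed in SQL alone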
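Slide 11: Dashboards: what is working
• The two dashboards have clearly separated roles
• Non-engineers can run aggregations themselves in Redash
  • SQL study sessions
  • An SQL channel on Slack
• When something looks abnormal, the root cause can be found using the dashboards alone
  • Anomalies are noticed in a timely manner
  • No need to issue ad hoc SQL or read raw logs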
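Slide 12: Dashboard issues
• The metrics for judging whether something is abnormal are insufficient
• When something is abnormal, we would rather not have to open a dashboard at all
• We want a mechanism that detects which metric is off and why, and sends a Slack alert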
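As one way to picture the alerting mechanism wished for above, here is a minimal sketch that flags metrics whose latest value is far from their recent history and builds a message body for Slack. It covers only the "which metric" part, not the "why". The z-score rule, the threshold, the metric names and the sample numbers are all assumptions made for illustration; the slides do not describe how Gunosy would implement this. The payload uses the `text` field accepted by Slack incoming webhooks, and no request is actually sent.

```python
from statistics import mean, stdev


def detect_anomalies(history, latest, threshold=3.0):
    """Return (metric, latest_value, z_score) for each metric whose latest
    value is more than `threshold` sample standard deviations from its mean."""
    anomalies = []
    for metric, values in history.items():
        if metric not in latest or len(values) < 2:
            continue  # not enough data to judge this metric
        mu = mean(values)
        sigma = stdev(values)
        if sigma == 0:
            continue  # a constant history gives no usable scale
        z = (latest[metric] - mu) / sigma
        if abs(z) > threshold:
            anomalies.append((metric, latest[metric], round(z, 2)))
    return anomalies


def build_slack_payload(anomalies):
    """Build a JSON-serializable body for a Slack incoming webhook."""
    lines = [f"{metric}: latest={value} (z={z})" for metric, value, z in anomalies]
    return {"text": "KPI anomaly detected\n" + "\n".join(lines)}


# Hypothetical KPI history (one value per day) and today's values.
history = {
    "daily_active_users": [1000, 1020, 980, 1010, 990],
    "avg_session_seconds": [300, 310, 295, 305, 290],
}
latest = {"daily_active_users": 700, "avg_session_seconds": 302}

anomalies = detect_anomalies(history, latest)
payload = build_slack_payload(anomalies)
print(payload["text"])
```

A fixed z-score threshold is the simplest possible rule; KPIs with weekly seasonality would likely need a comparison against the same weekday instead.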
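Slide 13: Choosing between Redshift and BigQuery
• We have used Redshift for a long time
• We began using BigQuery in earnest in summer 2016
• Redshift and BigQuery both hold almost the same data
  • Some very large datasets live only in BigQuery
• Many KPI queries (KPI aggregation batches) depend on Redshift
  • We gave up on migrating them to BigQuery
  • New KPIs and long-running queries go to BigQuery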
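Slide 14: What is good about Redshift
Good in Redshift:
• (Subjectively) simple queries
  • PostgreSQL-based
  • created_at::date
else:
• Design and operations are painful
  • Designing the distkey is hard
  • Adding nodes puts the cluster into maintenance mode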
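Slide 15: What is good about BigQuery
Good in BigQuery:
• Storage is scalable
else:
• We would like Standard SQL to be supported more fully
  • percentile functions, soon please
• We would rather not have to specify which partitions a SELECT query scans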
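Slide 16: In-house log converter
• Responsible for reshaping raw logs
  • Converts JSON logs into pipe-delimited logs
  • Normalizes logs whose format differs by app version and device (iOS, Android)
• Who should be responsible for reshaping logs?
  • The log server (do it at the head)
  • The log converter (do it at the tail)
  • Do not reshape logs at all
• Reshaping at the "tail" is easier to operate
• Storing both the raw logs and the reshaped logs makes operations easier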
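The conversion step described above, JSON in and pipe-delimited text out with per-device differences smoothed over, could look roughly like the sketch below. The column order, the field names and the per-device aliases are all hypothetical; the slides do not show the actual log schema or the converter's code.

```python
import json

# Hypothetical output column order for the pipe-delimited log.
COLUMNS = ["timestamp", "user_id", "device", "app_version", "event"]

# Hypothetical field-name differences between clients.
FIELD_ALIASES = {
    "ios": {"uid": "user_id", "ts": "timestamp"},
    "android": {"userId": "user_id", "time": "timestamp"},
}


def convert_line(raw_line):
    """Convert one raw JSON log line into one pipe-delimited line."""
    record = json.loads(raw_line)
    device = str(record.get("device", "")).lower()
    aliases = FIELD_ALIASES.get(device, {})
    # Rename device-specific keys to the common column names.
    normalized = {aliases.get(key, key): value for key, value in record.items()}
    fields = []
    for column in COLUMNS:
        value = normalized.get(column)
        text = "" if value is None else str(value)
        # Replace the delimiter inside values so columns stay unambiguous.
        fields.append(text.replace("|", " "))
    return "|".join(fields)


ios_line = '{"device": "iOS", "uid": 42, "ts": "2017-02-28T10:00:00", "app_version": "3.1", "event": "open"}'
android_line = '{"device": "Android", "userId": 7, "time": "2017-02-28T10:05:00", "app_version": "2.9", "event": "tap|article"}'

print(convert_line(ios_line))      # 2017-02-28T10:00:00|42|iOS|3.1|open
print(convert_line(android_line))  # 2017-02-28T10:05:00|7|Android|2.9|tap article
```

Because the raw line is never modified, the raw and reshaped logs can both be stored, which is the operating model the slide recommends.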
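Slide 17: When the log design goes wrong
• A perfect log design is hard to achieve
  • Devices, versions, new features and new UIs introduce unexpected differences
• The differences have to be absorbed somewhere
  • In the log API server?
  • In the log converter?
  • In the SQL queries?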
Slide 18: Data analysis platform
• The toolset is Jupyter + pandas + DB
• Data analysis of KPIs using logs
• Analysis for algorithm improvement using article data
(Diagram: Redshift, BigQuery, pandas)
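Slide 19: Data analysis platform issues
• The analysis platform for algorithm improvement spans multiple data sources
  • Article data
  • Article clustering data
  • Scoring data
• Uniform access to these sources would make analysis more efficient
• Log-based analysis can be done entirely in BigQuery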
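Slide 20: A/B test platform and offline test platform
• Godzilla (A/B test platform)
  • Origin of the name: the being that rules over A/B tests
  • A/B test allocation and run status are managed in a web tool
• DeLorean (offline test platform)
  • Uses past logs to judge whether an algorithm change is good or bad
  • A/B tests (online tests) take time
  • Choosing the evaluation metric is the crucial part
  • Currently only a command-line tool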
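The slides do not say how Godzilla allocates users to test groups. As a general illustration of A/B test allocation, the sketch below uses a common technique, deterministic hash-based bucketing, so that the same user always lands in the same group for a given experiment. The function name, the experiment name and the weights are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, variants):
    """Deterministically pick a variant for a user.

    `variants` is a list of (name, weight) pairs with positive integer weights.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    total = sum(weight for _, weight in variants)
    if total <= 0:
        raise ValueError("variants must have positive integer weights")
    # Map the hash onto [0, total) and walk the cumulative weights.
    point = int(digest, 16) % total
    cumulative = 0
    for name, weight in variants:
        cumulative += weight
        if point < cumulative:
            return name
    raise ValueError("variants must have positive integer weights")


variants = [("control", 50), ("new_ranking", 50)]
print(assign_variant("user-123", "ranking_test", variants))
```

Including the experiment name in the hash key keeps allocations independent across experiments, so a user in the control group of one test is not systematically in the control group of another.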
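Slide 21: Real-time log platform
• Used by the article delivery algorithm
• Used for real-time KPI reporting (Hourly Active Users)
• Not used for data analysis
• Currently migrating from Spark Streaming to Kinesis Analytics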
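Slide 22: Real-time platform using Spark Streaming
• Spark Streaming aggregates logs per time window
• Results are stored in RDS so that the systems consuming the real-time data can handle them easily
• Issues
  • Server cost (EMR cost) is considerable
  • Operational cost (cluster management cost) is considerable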
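To make the time-window aggregation concrete, the sketch below computes Hourly Active Users, the real-time KPI named earlier in the deck, as a distinct user count per one-hour window. It is plain Python over an in-memory list and uses neither Spark Streaming nor Kinesis Analytics; the event shape (ISO timestamp, user ID) and the one-hour window are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime


def hourly_active_users(events):
    """Count distinct users per hour.

    `events` is an iterable of (iso_timestamp, user_id) pairs.
    Returns {hour_start_iso: distinct_user_count}, ordered by hour.
    """
    users_per_hour = defaultdict(set)
    for timestamp, user_id in events:
        # Truncate the timestamp to the start of its one-hour window.
        hour = datetime.fromisoformat(timestamp).replace(
            minute=0, second=0, microsecond=0
        )
        users_per_hour[hour].add(user_id)
    return {
        hour.isoformat(): len(users)
        for hour, users in sorted(users_per_hour.items())
    }


events = [
    ("2017-02-28T10:05:00", "a"),
    ("2017-02-28T10:40:00", "a"),
    ("2017-02-28T10:59:59", "b"),
    ("2017-02-28T11:00:00", "a"),
]
print(hourly_active_users(events))
# {'2017-02-28T10:00:00': 2, '2017-02-28T11:00:00': 1}
```

A streaming engine performs the same grouping incrementally as events arrive, which is why the per-hour result can be written to a store such as RDS shortly after each window closes.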
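Slide 23: Real-time platform using Kinesis Analytics
• A service that aggregates streaming data with SQL queries
• Aggregation results are stored in Elasticsearch via Firehose
• The only thing engineers implement is the aggregation query
• A long-term operational track record is still to come
• Cost reduction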
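Slide 24: Summary
• Presented the full picture of Gunosy's data analysis platform and log platform
  • Batch ("confirmed") log platform
  • Real-time ("flash report") log platform
• Offensive tasks
  • Building an environment in which analysis is easy
  • Uniform access to multiple data sources
  • An environment in which large volumes of data are easy to handle
• Automate defensive tasks as far as possible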