Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
実践的データ基盤への処方箋_2-9_2-12
Search
Tomoya Koike
February 03, 2022
Programming
0
220
実践的データ基盤への処方箋_2-9_2-12
Tomoya Koike
February 03, 2022
Tweet
Share
More Decks by Tomoya Koike
See All by Tomoya Koike
CDLE youth LT会 #1
koikeya
0
97
Other Decks in Programming
See All in Programming
Rで始めるML・LLM活用入門
wakamatsu_takumu
0
170
AI主導でFastAPIのWebサービスを作るときに 人間が構造化すべき境界線
okajun35
0
700
AI時代のシステム設計:ドメインモデルで変更しやすさを守る設計戦略
masuda220
PRO
5
850
Go Conference mini in Sendai 2026 : Goに新機能を提案し実装されるまでのフロー徹底解説
yamatoya
0
560
Vuetify 3 → 4 何が変わった?差分と移行ポイント10分まとめ
koukimiura
0
120
Fundamentals of Software Engineering In the Age of AI
therealdanvega
1
240
TipKitTips
ktcryomm
0
160
maplibre-gl-layers - 地図に移動体たくさん表示したい
kekyo
PRO
0
240
Go 1.26でのsliceのメモリアロケーション最適化 / Go 1.26 リリースパーティ #go126party
mazrean
1
380
AIに任せる範囲を安全に広げるためにやっていること
fukucheee
0
130
Docコメントで始める簡単ガードレール
keisukeikeda
1
110
Claude Code Skill入門
mayahoney
0
190
Featured
See All Featured
Leadership Guide Workshop - DevTernity 2021
reverentgeek
1
230
Data-driven link building: lessons from a $708K investment (BrightonSEO talk)
szymonslowik
1
970
Claude Code のすすめ
schroneko
67
220k
Context Engineering - Making Every Token Count
addyosmani
9
740
Are puppies a ranking factor?
jonoalderson
1
3.1k
svc-hook: hooking system calls on ARM64 by binary rewriting
retrage
2
160
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
47
8k
Heart Work Chapter 1 - Part 1
lfama
PRO
5
35k
Taking LLMs out of the black box: A practical guide to human-in-the-loop distillation
inesmontani
PRO
3
2.1k
From Legacy to Launchpad: Building Startup-Ready Communities
dugsong
0
170
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
37
6.3k
The #1 spot is gone: here's how to win anyway
tamaranovitovic
2
980
Transcript
࣮ફత デ ʔλج൫ͷॲํᝦ ྠಡձ 2-9 ~ 2-12 খஐ࠸
2−9 ϩάऩूΤʔδΣϯτͷΩϟύγςΟʹҙ
ϩάͱ 3 Α͘ੳ͞ΕΔϩά 1. WebαʔόͷΞΫηεϩά 2. ΞϓϦέʔγϣϯͷϩά WebαʔόͷΞΫηεϩάͷྫ • ΞΫηεͨ࣌ؒ͠ɺURL
• ΞΫηεݩͷIPΞυϨε • ΞΫηεʹ༻͍ͨใ • Web App͕ઃఆͨ͠ɺϢʔβࣝผใ
ϩάϩάऩूΤʔδΣϯτͰऩू͢Δ 4 ϩάऩूΤʔδΣϯτ • όοϑΝʹΑΓϩάऩूϚωʔδϟͷෛՙΛҰఆʹͰ͖Δ • όοϑΝ͕ᷓΕͳ͍Α͏ʹαΠζΛ֬อ͢Δ͜ͱʹҙ͕ඞཁ
ϩάऩू͕Ͱ͖Δ 5 ໊ ఏڙํ๏ ఏڙɾαϙʔτ ͍ͯ͠Δձࣾ fluentd, fluent-bit OSS Treasure
Data Logstash OSS Elastic CloudWatch Cloud AWS Cloud Logging Agent Cloud GCP
2−10 σʔλͷऩूқ͕ߴ͍ͨΊ Ͱ͖Δ͚ͩΛར༻͠ແཧͳΒࣗ࡞͢Δ
σʔλେྔ͕ͩ༗༻ 7 දతͳσʔλ ϒϥβΠϕϯτ εϚϗΞϓϦΠϕϯτ IoTσόΠεσʔλ ը໘ͷεΫϩʔϧϚεͷيɺϢʔβͷϒϥβ্Ͱͷૢ࡞σʔλ εϚϗΞϓϦ্ͰͷϢʔβͷૢ࡞σʔλ ंࡌηϯαʔͷσʔλɺڥηϯαɺΤΞϥϒϧσόΠε
ϒϥβΠϕϯτεϚϗΞϓϦΠϕϯτσʔλऩूΛར༻ 8 ཉ͍͠σʔλΛऩूͰ͖Δ͕ͳ͍͔Λ୳͢ Ϣʔβͷ্ͷߦಈੳπʔϧଟ͘ଘࡏ͢Δ ϒϥβΠϕϯτ εϚϗΞϓϦΠϕϯτ Adobe AnalyticsGoogle Analytics
Google Analytics For Firebase
ࣗ࡞͢Δ߹ࢄϝοηʔδΩϡʔΛ͏ 9 ϩάऩूπʔϧΛࣗ࡞͢Δ໘ • IoTσόΠεͷσʔλͰɺΫϥυαʔϏεʹ͍͍ͷ͕ͳ͍ͱ͖ • ΞΫηεղੳπʔϧͰेͳσʔλ͕ಘΒΕͳ͍߹ • ϩά༰ΛϦΞϧλΠϜʹ׆༻͢Δඞཁ͕͋Δͱ͖I ࢄϝοηʔδΩϡʔʹϩάΛૹ৴͢Δ
ࣗ࡞͢Δ߹ࢄϝοηʔδΩϡʔΛ͏ 10 ࢄϝοηʔδΩϡʔ ΩϡʔͱɺઌೖΕઌग़͠ͷσʔλߏͷ͜ͱɻ ΩϡʔΠϯά͢Δϓϩσϡʔαͱɺpop͢ΔίϯγϡʔϚͷ2ͭͷׂ͕͋Δɻ
ࢄϝοηʔδΩϡʔͷҙ͖͢ಛͱӡ༻ͷίπ 11 ࢄϝοηʔδΩϡʔͷҙ͖͢ಛ 1. ॱংੑอূͷ༗ແ…ॱং͕ඞཁͳ߹λΠϜελϯϓΛೖΕͯฒͼସ͑Δ͕ඞཁ 2. ϝοηʔδͷॏෳ༗ແ…ॲཧΛႈʹ͢Δ͔ɺॲཧ༗ແͷஅϩδοΫΛೖΕΔ 3. ՄࢹੑλΠϜΞτ…ॲཧ࣌ؒΑΓ͍ͱɺॲཧ͕2ճҎ্Δ ӡ༻ͷίπ
• ίϯγϡʔϚ͕ॲཧʹࣦഊ͠ଓ͚ͨσουϨλʔΛઐ༻ͷΩϡʔʹೖΕΔ • ϓϩσϡʔαʔ͕ੜྔΛ੍͢ΔόοΫϓϨογϟʔͱ͍͏ΈΛೖΕΔ
۩ମతͳγεςϜͷ࡞Γํ 12 ࢄϝοηʔδΩϡʔ
2−11 ETLΛબͿϙΠϯτར༻͢Δ ίωΫλͷػೳੑͱσόοάͷ͢͠͞
ETLͱ 14 ఏڙܗଶͷҧ͍ ETL…Extract Transform LoadͷུͰɺσʔλͷநग़ɺՃɺϩʔυΛҙຯ͢Δɻ ෳࡶͳՃ͕Ͱ͖Δ͔Ͳ͏͔ ҟͳΔσʔλιʔε͔ΒͷσʔλΛՃɺ݁߹ͯ͠ϩʔυ͢Δɻ Apache Nifi,
DataSpider, Glue, Cloud Data Fusion OSSɺ༗ঈɺΫϥυͷ3छྨɻ OSS…embulkfluentd, Apache Sqoop, ༗ঈͱͯ͠DataSpider, ASTERIAͳͲɻ ΫϥυͰAWSͷGlueGCPͷCloud Data Fusion, ྆αʔϏεͷDMS, embulkͷϚωʔδυαʔϏεͰ͋ΔtroccoͳͲɻ ※DMS…Database Migration Service
͏ίωΫλͷػೳΛॏࢹ͢Δ 15 MySQLίωΫλͰ͋ΕɺWHERE۟ࠩͷΈऩू͕Ͱ͖Δ͔ɻ ϏοάσʔλͰ͋ΕɺࢄॲཧͰ͖Δ͔ʹҙ͢Δɻ
ιʔείʔυϨϕϧͰσόοά͍͢͠ͷΛར༻͢Δ 16 όά͕ى͖ͨͱ͖ɺσʔλιʔεɾ֨ೲઌɾऩू͠Α͏ͱ͢ΔσʔλͷΈ߹ ΘͤʹΑͬͯ࠶ݱྫ͕ͳ͍͜ͱɻ ίωΫλͷιʔείʔυΛݟʹߦ͚Δͷ͕େࣄɻ ETLͷαϙʔτʹௐࠪͯ͠Β͏ͱ͖ɺσʔλج൫ʹೖͬͯΒ͏Α͏४උɻ σʔλʹґଘͯ͠ى͖Δόά ఆ͍ͯ͠ͳ͍จࣈίʔυ੍ޚจࣈɺվߦίʔυ nullΛظ͢Δͱ͜Ζʹۭจࣈྻ
ΤϯδχΞ͕͍ͳ͚ΕϓϩάϥϛϯάϨεͷETLબࢶͷ1ͭ 17 ઐ༻ͷը໘্Ͱσʔλιʔε֨ ೲઌͷΞΠίϯΛͭͳ͛ͯETLॲ ཧΛఆٛɺσϓϩΠͰ͖Δɻ Apache Nifi, Talend, DataSpider, ASTERIA,
Glue, Cloud Data Fusion ͳͲɻ
2−12 σʔλϨΠΫͰऩूͨ͠σʔλ Λͳ͘͞ͳ͍Α͏ʹ͢Δ
ऩूͨ͠σʔλΛݪଇͦͷ··ੵ͢Δ 19 σʔλϨΠΫʹऩूͨ͠σʔλΛՃͤͣʹ֨ೲ͢Δ σʔλϨΠΫʹԽͰ͖༰ྔ͕֦ுͰ͖ΔΛબͿ ऩूͨ͠σʔλΛͳ͘͞ͳ͍ͨΊʹԽ͢Δ͜ͱɺ σʔλ༰ྔΛ૿ͤΔΑ͏ʹ͓ͯ͘͜͠ͱ͕ॏཁɻ ϑΝΠϧJSONܗࣜɺςʔϒϧߏͳͲΛͦͷ··อଘɻ Ճʹࣦഊͯ͠σʔλଛࣦ͢Δ͜ͱΛ͙ͨΊɻ ػີใݸਓใಗ໊ԽΛߦͬͯੵ͢Δɻ
ϑΝΠϧΦϒδΣΫτετϨʔδʹੵ͢Δ 20 ෳͷσʔληϯλʔͰෳσόΠεʹ Խͯ͠อଘ͢Δ͜ͱͰɺ ΠϨϒϯφΠϯͷݎ࿚ੑͱ 99.99%ͷՄ༻ੑΛ࣮ݱ͍ͯ͠Δɻ σʔλΛʮΦϒδΣΫτʯͱ͍͏୯ҐͰѻ͏هԱஔɻ ΫϥυαʔϏεͱͯ͠S3Cloud StorageͳͲ͕͋Δɻ ΦϒδΣΫτετϨʔδ
ΦϯϓϨͷ߹ࢄετϨʔδΛར༻ɻ OSSͱͯ͠HDFSͳͲɻ
CSVJSONσʔλσʔλϕʔεʹೖΕͯOK 21 CSVJSONσʔλΛDWH༻ੳDBʹೖΕΔ ੳ༻DBͷதͰɺੜͷσʔλ Λ֨ೲ͢ΔσʔλϨΠΫͱ Ճ͞ΕͨσʔλΛ࣋ͭDWH ʹ͚Δ JSONʹ͍ͭͯɺจࣈྻܕ·ͨJSONܕͱͯ֨͠ೲ͢Δ
σʔλ͕ΦϯϓϨϛεʹ͋ͬͯσʔλϨΠΫΫϥυʹ͢Δ 22 3ͭͷཧ༝ ैྔ՝ۚͰར༻Ͱ͖ΔͨΊ ٱੑ͕ߴ͍ͨΊ ӡ༻ਓ݅අ͕͍҆ ج൫ߏங࣌ʹσʔλྔΛਖ਼֬ʹݟੵΔͷࠔͳͨΊ AWS S3ͷٱੑΠϨϒϯφΠϯͰ͋ΓɺΦϯϓϨͰఢΘͳ͍ αʔόͷߏΛؾʹ͠ͳͯ͘Α͍ͷͰɺඞཁͳٕज़ྗ͕Լ͕Δ
σϝϦοτͱͯ͠ɺࡉ҆͘ఆ͠ͳ͍ωοτճઢ͔ɺߴ͍ઐ༻ઢ͔Λ༻͍Δඞཁ͕͋Δ