Shinichi Nakagawa
February 10, 2023
Programming
Developing and Operating a Serverless Data Platform to Realize My Own DX / Serverless Data Platform and Baseball
Slides from my talk at Developers Summit 2023
https://event.shoeisha.jp/devsumi/20230209/session/4196/
Transcript
"Back your oshi (favorites) while you can": developing and operating a serverless data platform to realize my own DX, with a side of baseball data ⚾ @shinyorke 2023/02/10 Developers Summit 2023
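Onboarding (about this session)
• This is a case study of "cloud-native data engineering" in a "baseball data platform" service that I built as a hobby.
• It covers a very wide range of cloud and data platform topics. The slides will be published after the talk, and I recommend reading the parts that interest you.
• I hope people who are not interested in baseball (or do not like it) can enjoy it too.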
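The many programming languages, services, and related technologies used to build the service. There are a lot of them, so it is fine to read and remember only the parts you care about!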
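Recommended for:
• People who want to see a good build example for "data utilization", such as a DWH or data platform.
• People who do some kind of work on a public cloud, mainly Google Cloud.
• People interested in good system design and construction using cloud services.
• People who use serverless cloud services or are about to (plain curiosity is fine). *AWS Lambda, AWS App Runner, Google App Engine, Cloud Run, and so on.
• People who (whether they like it or not) know the rules of baseball and who Ohtani-san is.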
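Who am I? (Who are you, anyway?)
• Shinichi Nakagawa @shinyorke
• "shinyorke" is read "shin-yoku". *Borrowed from part of my name plus the vocalist of my favorite band 🎸
• Manager at a major foreign-affiliated IT consulting firm. I do various things on a team that handles cloud and infrastructure.
• A full-cycle engineer both at work and in my hobbies.
• I write about "technology", "baseball", and "careers" on my personal blog, "Lean Baseball". https://shinyorke.hatenablog.com/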
Important slides are in this color. *There are fewer than 10 of them; if nothing else, take away just the slides in this color.
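Today's starting lineup
• My own DX, realized with cloud and data engineering
• A "nice data platform", realized with a serverless architecture
• Figuring out when it is time to back my "favorite baseball players" ⚾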
By the way, what is "oshi"?
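"Oshi is a Japanese slang term used mainly about idols and actors, referring to a person one likes so much that one wants to recommend them to others." by Wikipedia
It says "person", but I feel people say "oshi" about things and anything else too. At least that is my understanding (objections welcome).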
Do you have an "oshi" in your hobbies? Feel free to tweet about it.
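My "oshi" in hobbies
• Baseball (it used to be my job, too)
  • Outfielders with great athletic ability and a quirky character
  • Sabermetrics (baseball statistics) geek
• Other hobbies (there are many more)
  • I love watching horse racing (especially dark-coated horses: aokage and kurokage) 🐴
  • I love UK rock from around the 90s and 00s 🎸 (blur and RADIOHEAD are my favorites)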
Do you have an "oshi" in technology? Feel free to tweet about it.
My many "oshi technologies" (just a few examples; there are many more). Both at work and in my hobbies, these always help me out.
From here on, I will talk about "my own DX, realized with cloud and data engineering", built from my "oshi" in technology, cloud, and baseball!
Looking back on my favorite baseball player: "Ohtani-san's 2022". Data source: https://baseballsavant.mlb.com/statcast_search *This is open data.
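Ohtani-san in 2022: a pitcher with a "wicked style" built on the slider and the two-seamer
• This year's Ohtani-san throws a huge number of sliders.
• Did you notice? In the second half of the season, two-seamers (labeled Sinker in the data) increased!?
• A mid-season change of character made him unstoppable. A successful transformation from a fastball-centered style.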
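One of Ohtani-san's starts (2022/9/29: 8 innings, 10 strikeouts, no runs allowed). Of all his pitches, pure fastballs (4-Seam) were only 3.7%; everything else was a breaking pitch.
[Figures: pitch locations (catcher's view), release points (catcher's view), pitch type share]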
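Me: "It's painful... writing SQL and code over and over every time." This is not a goma fire ritual, so I want to do it more casually (not that I have ever done a goma ritual).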
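Me: "I know, let's build a ⚾ data platform." My DX project is born. The data (https://baseballsavant.mlb.com/) is hard to use, and writing the code is painful. So I should just build something easy to use myself! That was the idea that came to me.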
I built it (total build time: …).
Architecture of the app and the data platform (a map of the "my DX" world)
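Architecture walkthrough (≈ the points I cared about)
• Using "fully managed and serverless" cloud services
• Sticking to a wallet-friendly configuration and usage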
What is a "fully managed and serverless" cloud service? Taking Google Cloud as an example:
[Figure] Google Cloud's shared responsibility model (Compute services only)
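"Fully managed and serverless" cloud
• Characteristics of "fully managed and serverless cloud services"
  • No infrastructure or server maintenance is needed (the cloud service does it, not you).
  • More concretely, these are services where you do not have to stand up a K8s cluster or VMs yourself.
• "Fully managed and serverless" means "low effort"
  • Almost maintenance-free. You are freed from maintaining and operating middleware.
  • Scalability is easy to secure. Scaling out and scaling in as needed is easy.
  • At first glance a serverless setup looks perfect, but there are caveats and drawbacks to consider, such as processing time per request, limits on usable resources, and limits on runtime versions.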
???: "If I use cloud services heavily as an individual, I worry about the money." You would think so, but with cloud and serverless you can make it work nicely.
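Serverless cloud is kind to your wallet
• Things only need to start up when they are used, so the application platform goes all-in on serverless.
  • Both the web applications and the batch jobs are on cold standby.
• The only things that cost money even when nothing is running are the DWH and storage (and even that is a very small amount).
  • For the database (BigQuery) and Cloud Storage, keep costs down through careful usage.
  • Understand the free tier rules and stay within them as far as possible (≈ spend money on what is actually needed).
  • Tune long-term data retention rules and storage types, and import data in bulk.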
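It came to less than $3 per month, about 3.34 cups of coffee 🐯 *Calculated from the actual cost of the data platform project (coffee priced as convenience store coffee), excluding external costs such as domain registration.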
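Architecture walkthrough: summary
• Built mainly from "fully managed and serverless" cloud services
  • An advantage for freeing yourself from maintenance and operations work, and for securing scalability
  • Watch out for constraints specific to serverless (processing time, resources, and so on)
• Wallet-friendly usage is possible
  • A shift to a cost structure where you pay only when you use it
  • Compress costs by making use of the free tier and long-term data retention rules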
Technology selection by use case: the dashboard app
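Dashboard app use case
• The demo application (a web application) is hosted on Cloud Run. It is implemented in Python (Dash).
• Backend authentication is done with API Gateway, using API key authentication (done this way on purpose so that it can be used SaaS-style).
• The backend (RESTful API) that searches the data stored in BigQuery is implemented in Go (Gin) and hosted on Cloud Run.
• RESTful API responses are cached in Cloud Storage per set of query conditions (so the same query is not executed over and over).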
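The per-condition response cache described above can be illustrated with a minimal Python sketch. This is not the talk's Go implementation: `cache_key`, `CachedSearch`, and `run_query` are hypothetical names, and a plain dict stands in for a Cloud Storage bucket. The idea is simply to derive a stable object name from the query conditions and to query the warehouse only on a cache miss.

```python
import hashlib
import json


def cache_key(params: dict) -> str:
    """Build a stable object name from query conditions, independent of key order."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"cache/{digest}.json"


class CachedSearch:
    """Serve repeated queries from a cache instead of re-running them."""

    def __init__(self, run_query, store=None):
        # run_query: hypothetical callable that would query the warehouse.
        self.run_query = run_query
        # store: a dict standing in for a Cloud Storage bucket (assumption).
        self.store = {} if store is None else store
        self.query_count = 0

    def search(self, params: dict):
        key = cache_key(params)
        if key in self.store:
            # Cache hit: return the stored response without querying again.
            return json.loads(self.store[key])
        # Cache miss: run the query once and remember the serialized response.
        self.query_count += 1
        result = self.run_query(params)
        self.store[key] = json.dumps(result)
        return result
```

Sorting the keys before hashing means that the same conditions given in a different order map to the same cached object.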
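Solving the difficulty of "choosing cloud services"
• How to choose serverless cloud services: Google Cloud edition
  • What is the difference between App Engine, Firebase, and Cloud Run?
  • When do you use Cloud Functions? What is it in the first place??
  • ...I will answer questions like these.
• How to choose a programming language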
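Serverless application runtime environments: main services (Google), with URL and overview
• App Engine (https://cloud.google.com/appengine): something like the originator of serverless and fully managed. Still easy to use today.
• Firebase (https://firebase.google.com): not just apps; DB, notifications, and more are often available, and it is convenient. Note that the language is JavaScript only.
• Cloud Run (https://cloud.google.com/run): the one to pick if you want to build with your preferred language and environment. Build a container and you can run it.
• Cloud Functions (https://cloud.google.com/functions): for running a small function. Smaller apps such as a Slack bot.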
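Why I chose Cloud Run
• Both the demo app and the backend were built on containers (Docker).
  • During development the library dependencies were unclear, so I developed on the assumption of Docker containers.
  • Cloud Run, which can run containers as they are, was the easiest and the one I was used to, so I adopted it.
• Reasons and background for choosing Cloud Run
  1. Because of library dependencies and the like, development would be in Python or Go -> Firebase drops off the candidate list.
  2. The specification turned out to be awkward to run on Cloud Functions, so Cloud Functions drops off too.
  3. App Engine (Standard) is not container-based -> decided on Cloud Run.
  *As for why I did not use App Engine (Flexible), which can run containers... please ask me if you are curious.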
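Programming languages considered as candidates (language, background, result)
• Python (web app and batch)
  Background: the one I am most used to, and I can build things quickly. The test and deployment environment is a little tedious.
  Result: the demo data app and the data platform functions are written in Python. Some of it is planned to be rewritten in Go.
• Go (RESTful API)
  Background: I hardly use it at work or privately, but I like it. As you would expect of a newer language, its testing tools and code formatter are easy to use.
  Result: both its characteristics as a programming language and its fit with container development were outstanding, so I adopted it as the backend language.
• TypeScript (not adopted this time)
  Background: the language I use most after Python. The one to pick for building a frontend. For the backend there are other candidates.
  Result: I decided to generate the frontend on the Dash (a Python framework) side, so I did not use it this time.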
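The app is low-code 🐍
• To cut to the conclusion, it is implemented with "Dash", a low-code tool for Python.
• In the prototype I was using Jupyter Lab and Plotly for analysis and visualization, and I wanted to implement the app in a way that let me port that over as is (the background to the choice).
• https://dash.plotly.com/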
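Dash for Python (example)
• With Dash you can write the frontend in Python alone.
• Write components modeled on HTML directly in Python and it works nicely.
• Event-driven updates are implemented with callback decorators (the feel is almost like React).
• It has many disadvantages compared with a real frontend, so take care where you use it! (Performance and so on.)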
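Backend development with Go
• Python would have been fine, but I wanted to use Go, so I used Go (the biggest reason).
• When I decided to build a RESTful API, I went ahead based on a knowledge-based hypothesis ("couldn't this be built neatly in Go?") and to enjoy programming.
• Once I built and ran it myself, the aim of "this can probably be built neatly" hit the mark.
  • Smaller container size, and performance I have no complaints about.
  • Thanks to go fmt and go test, the DevOps (CI/CD) pipeline became a refined one.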
Technology selection by use case: the data collection platform
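Data collection platform use case
① The cron scheduler calls the Trigger in ② at a fixed time every day.
② The Trigger builds the parameters the data collection service (Crawler) needs and passes them on.
③ The Crawler uses the parameters from the Trigger to download CSV files from the data source site and saves them to the data lake.
④ Based on a trigger from the cron scheduler, the Importer loads the CSV data into the DWH (BigQuery).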
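The four steps above can be sketched locally without any cloud dependency. In the following minimal sketch, `InMemoryBus` is a hypothetical stand-in for Pub/Sub, and the topic names, the `run_date` and `game_date` fields, and `fetch_csv` are all assumptions for illustration. The event shape (a dict whose `data` field is base64-encoded JSON) is modeled loosely on how Pub/Sub-triggered functions receive messages.

```python
import base64
import csv
import io
import json
from collections import defaultdict


class InMemoryBus:
    """Tiny stand-in for Pub/Sub: topic name -> list of subscriber callables."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, payload: dict):
        # Message data travels as base64-encoded bytes, as it does with Pub/Sub.
        raw = json.dumps(payload).encode("utf-8")
        event = {"data": base64.b64encode(raw).decode("ascii")}
        for handler in self.subscribers[topic]:
            handler(event)


def decode(event: dict) -> dict:
    """Decode the base64 JSON payload of an event."""
    return json.loads(base64.b64decode(event["data"]).decode("utf-8"))


def build_pipeline(bus, datalake: dict, warehouse: list, fetch_csv):
    """Wire up steps 2 to 4; publishing to the daily topics plays the scheduler (step 1)."""

    def trigger(event):  # step 2: build crawler parameters and pass them on
        message = decode(event)
        bus.publish("crawl", {"game_date": message["run_date"]})

    def crawler(event):  # step 3: download the CSV and store it in the data lake
        params = decode(event)
        datalake[f"{params['game_date']}.csv"] = fetch_csv(params["game_date"])

    def importer(event):  # step 4: load the stored CSV rows into the warehouse
        message = decode(event)
        text = datalake[f"{message['run_date']}.csv"]
        warehouse.extend(csv.DictReader(io.StringIO(text)))

    bus.subscribe("daily-trigger", trigger)
    bus.subscribe("crawl", crawler)
    bus.subscribe("daily-import", importer)
```

Publishing `{"run_date": ...}` to `daily-trigger` and later to `daily-import` exercises the whole chain, with each function knowing only about the messages it receives.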
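Cloud-native "batch processing"
• Common batch processing approaches in the cloud
• Building it Rube Goldberg-style (like the TV show "Pitagora Suitchi") with Cloud Functions + Pub/Sub
• How to get along with BigQuery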
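Common batch processing approaches in the cloud (approach, strengths and weaknesses, and how it rates as a cloud-native method)
• [Depends on how you do it] Build your own server
  Strength: can be built with crontab and some programming language.
  Weakness: maintenance and operations are tedious, and there is a risk of it depending on one person.
  As a cloud-native method: AWS ECS/Fargate works nicely. If it is VM-based or needs a machine, there may be less point in using the cloud.
• [Recommended] Use Apache Airflow on a cloud service
  Strength: monitoring and operations can ride on Airflow's mechanisms.
  Weakness: Airflow's learning cost and handling.
  As a cloud-native method: a good approach for batch processing of a certain scale, such as business use. Works on both Google Cloud and AWS.
• [Recommended if it fits your requirements] Event-driven with Pub/Sub
  Strength: can be built microservice-style.
  Weakness: the learning cost of understanding and getting used to the design and implementation.
  As a cloud-native method: an approach unique to cloud services. The design and implementation have a learning cost, but once you are used to it, it is quite easy to build.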
This time I built it event-driven with Pub/Sub
• Of the three approaches (your own server, Apache Airflow on a cloud service, event-driven with Pub/Sub), the one marked [Adopted] is event-driven with Pub/Sub.
  Strength: can be built microservice-style.
  Weakness: the learning cost of understanding and getting used to the design and implementation.
  As a cloud-native method: an approach unique to cloud services. The design and implementation have a learning cost, but once you are used to it, it is quite easy to build.
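A Rube Goldberg machine made of Pub/Sub + Cloud Functions
• Batch processing is achieved by having the cloud equivalent of crontab (Cloud Scheduler) kick things off via Pub/Sub.
• Each Cloud Functions function (②, ③, ④) is built to run when it receives a message from Pub/Sub.
• Google Cloud has official Quick Starts for these methods, so you can get there by imitating them (I did).
• Incidentally, other services such as Cloud Run can be used as the runtime, and the language does not have to be Python.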
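Why I chose event-driven processing with Pub/Sub
• To take advantage of serverless on both the operations and the cost side.
  • With your own server you have to manage the infrastructure yourself, and VMs cost money.
  • Airflow (Cloud Composer) needs a GKE cluster, which raised cost concerns.
  • I judged that going serverless with Cloud Functions + Cloud Scheduler + Pub/Sub fit the requirements best.
• Each process can be implemented as a microservice (the second reason).
  • Building Cloud Functions functions as independent units of processing gives loosely coupled microservices.
  • The benefits are that the programming language or framework can be changed per function, and that testing is easy.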
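Implementing batch processing with Cloud Functions
• It comes out cleanly if you build extraction (Extract), transformation (Transform), and loading (Load) separately. *This time, because of the specification, Transform and Load are combined (it was cleaner that way).
• For triggering, using a service such as Cloud Scheduler in place of cron lets you skip a good deal of implementation.
• Note that Cloud Functions (and FaaS in general, including other vendors' clouds) has limits on processing time and resources per request.
  • For Cloud Functions (2nd gen), the maximum processing time per request is 60 minutes (when invoked via HTTP), and CPU and memory are limited too.
  • Services such as AWS Lambda each have their own constraints, so take care. For large data, consider a dedicated service such as Dataflow.
  *FaaS: Function as a Service (function-based cloud services such as Cloud Functions and AWS Lambda)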
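One common way to stay inside per-request time limits is to split a large job into bounded chunks and send one message per chunk. The talk does not describe doing this, so the following is only a minimal sketch under that assumption; `split_date_range` and the chunk size are hypothetical.

```python
from datetime import date, timedelta


def split_date_range(start: date, end: date, days_per_chunk: int):
    """Split the inclusive range [start, end] into consecutive chunks of bounded length."""
    if days_per_chunk < 1:
        raise ValueError("days_per_chunk must be at least 1")
    if end < start:
        raise ValueError("end must not be before start")
    chunks = []
    current = start
    while current <= end:
        # Each chunk covers at most days_per_chunk days and never passes `end`.
        chunk_end = min(current + timedelta(days=days_per_chunk - 1), end)
        chunks.append((current, chunk_end))
        current = chunk_end + timedelta(days=1)
    return chunks
```

Each `(start, end)` pair could then become the parameters of one message, so that a single function invocation only ever handles a bounded amount of data.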
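How to get along with BigQuery
• If you want a scalable DWH that you can start using quickly, BigQuery is great.
  • (Not on this platform, but) in real work its scalability has saved me several times.
  • It is even better if you get used to it through personal use.
• Per month, 10 GB of data storage and 1 TB of query execution are free.
  • An individual is unlikely to reach these amounts, so it may be worth using it more.
  • Once the data is loaded, it is convenient both for use from apps and for ad hoc analysis.
I can no longer escape from BigQuery (straight face).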
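As a rough back-of-the-envelope check on the free storage tier, the sketch below estimates how long a steadily growing dataset fits in it. It assumes 1 GB = 1024 MB and takes the growth rate from the roughly 500 MB per year that the talk mentions for this dataset later on. Stored size in BigQuery may differ from the raw file size, so treat the result as an order-of-magnitude estimate only.

```python
def years_until_storage_limit(free_storage_gb: float, growth_mb_per_year: float) -> float:
    """Years of steady growth that fit into a storage allowance (1 GB = 1024 MB)."""
    if growth_mb_per_year <= 0:
        raise ValueError("growth_mb_per_year must be positive")
    return (free_storage_gb * 1024) / growth_mb_per_year


# 10 GB free storage, about 500 MB of new data per year.
years = years_until_storage_limit(10, 500)
```

With these inputs the estimate is on the order of two decades, which is consistent with the point that personal use rarely reaches the free tier limits.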
Getting tired of the technical talk? Let me talk about my "favorite baseball players" ⚾
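The "superhumans from each country" you absolutely cannot miss at the WBC
• Athletic freaks who are razor sharp
• Selling points: power, a strong arm, defense, and speed
• Throwing blazing fastballs or nasty breaking balls
• I will introduce people like that
• There may be an "alien" player like Tsuyoshi Shinjo in his playing days 🤘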
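Superhumans from each national team who (should) play in the WBC
• Team USA: "a third baseman who boasts the best defense in the majors"
• Team Dominican Republic: "the future Ichiro or Shinjo?? A center fielder full of romance"
• Samurai Japan: "a promising Major League outfielder makes his long-awaited appearance"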
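Nolan Arenado (Team USA)
• One of the best in the majors: a hard-hitting third baseman with 10 consecutive Gold Gloves
• As a hitter he is an extreme pull hitter: bats right-handed, and his batted balls go almost entirely to left
• One of the core members of Team USA. Watch his third-base defense and his full swing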
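Julio Rodríguez (Team Dominican Republic)
• A young star of the talent-packed Dominican team. 2022 MLB Rookie of the Year, named to the All-MLB Second Team.
• An extreme strikeout-or-extra-bases hitting style, yet with the feel to hit to all fields. Image-wise, he is like Tsuyoshi Shinjo.
• Also known for having Ichiro as his practice partner with the Mariners, his club.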
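Lars Nootbaar (Samurai Japan)
• The first America-born player on Samurai Japan
• His batting average is low, but he gets on base and hits for extra bases, his batted balls are fast, and his outfield defense is solid at three positions ◎
• Rumored to be a good guy with a strong character. High hopes for him as Samurai Japan's center fielder!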
Speaking of Samurai Japan, these two are also superhumans you cannot miss. Hint: the former Nippon-Ham uniform number 11
Yu Darvish, also known as the baseball specialist @faridyu. Lots of strikeouts and lots of cutters... I am looking forward to seeing how he plays in the WBC.
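Shohei Ohtani (designated hitter): key batted-ball data. He is not a pull hitter; he hits hard balls to all fields, which is amazing (that is all I can say).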
Closing
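[Recap] Today's starting lineup
• My own DX, realized with cloud and data engineering
• A "nice data platform", realized with a serverless architecture
• Figuring out when it is time to back my "favorite baseball players" ⚾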
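To summarize today's talk...
• You can build a data platform using only fully managed, serverless cloud services.
• Choose cloud services and programming languages to suit the purpose.
• Samurai Japan has plenty of strong rivals (that is all I can say).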
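Effects since "my DX project" was born
• It looks extremely useful from the viewpoint of a baseball data scientist and product owner.
  • Just making messy data (500 MB per year, 91 columns) easy to use produced something good (fact).
  • By learning from the data and turning it into material for blog posts and talks, I can probably recoup the development investment.
• I expect it to speed up my personal technology validation a great deal (and my self-satisfaction).
  • It should work as a sandbox for experimenting with Google Cloud features and products I want to try, such as AI services.
  • Trying the same architecture on AWS or Azure is also an option. Multi-cloud might just work?
  • It is an excellent base for acquiring the skills and knowledge I need at work through a hobby (self-satisfaction).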
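For those thinking of using this as a reference at work
• The methods and configuration introduced here are not the absolute answer or a best practice. For example, there are definitely situations where you should, or should not, go with a serverless architecture.
  • This is the sum of what I (shinyorke) want to do, built by cramming in what I think is good (and what I want to play with). It is only one way of arriving at a solution.
  • If I had not insisted on serverless, it would have ended with "stand up a GKE cluster and run it there". That can be a perfectly respectable answer too, depending on the pros and cons.
  • Infrastructure as Code is not done, there is not enough test code, and so on... in truth there is a mountain of open issues.
• If you copy it as is (with only a half understanding of the context), it will blow up on you. Get your hands moving first, and use this as a reference for finding what works while you learn and run things!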
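My DX of "back your oshi while you can" continues in a sequel.
• I am also building a prediction engine for Major League and Japanese professional baseball, and I want to implement it on this platform, since they should fit well together.
• As for the outlook, I will resolve issues such as data licensing and open a nice baseball data site to the public (I absolutely want to do this).
• I will keep trying out various technologies, sharing fun ways to enjoy baseball, and carrying on with my DX of "recommending oshi in baseball and technology".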
Thank you for listening ⚾ @shinyorke