Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
AWSで作る、サーバーレスデータ分析基盤構築 / jawsug-niigata-11
Search
kasacchiful
January 15, 2022
Programming
1
400
AWSで作る、サーバーレスデータ分析基盤構築 / jawsug-niigata-11
JAWS-UG新潟#11で発表した資料です。
kasacchiful
January 15, 2022
Tweet
Share
More Decks by kasacchiful
See All by kasacchiful
Amazon Q Developer for CLI を使って PHP Conference 新潟 2025 参加者向けにグルメサイトを構築した話 / 20250620niigata-5min-tech
kasacchiful
0
53
ワイがおすすめする新潟の食 / 20250530phpconf-niigata-eve
kasacchiful
0
300
生成AIでメタデータを生成してみた / 20250525generate-metadata-using-generative-ai
kasacchiful
0
53
Strands Agents SDK で AIエージェント作成 を試してみた / 20250525strands-agents
kasacchiful
0
150
いろんな世界を見てみよう / 20250508ninno_tech_fest
kasacchiful
0
30
Amazon Q Developer for CLIのある生活 / 20250427ai_craft_hacks_niigata1
kasacchiful
1
79
AWSのコンテナサービス / jawsug-akita-aws-container-services
kasacchiful
0
70
データ基盤でのコンテナ活用事例 / jawsug-akita-data-platform-with-container
kasacchiful
0
75
データ基盤でのコンテナ活用事例 / jawsug-niigata21-data-platform-with-container
kasacchiful
0
110
Other Decks in Programming
See All in Programming
Select API from Kotlin Coroutine
jmatsu
1
210
アンドパッドの Go 勉強会「 gopher 会」とその内容の紹介
andpad
0
290
Node-RED を(HTTP で)つなげる MCP サーバーを作ってみた
highu
0
120
AIエージェントはこう育てる - GitHub Copilot Agentとチームの共進化サイクル
koboriakira
0
480
GoのGenericsによるslice操作との付き合い方
syumai
3
710
『自分のデータだけ見せたい!』を叶える──Laravel × Casbin で複雑権限をスッキリ解きほぐす 25 分
akitotsukahara
1
600
第9回 情シス転職ミートアップ 株式会社IVRy(アイブリー)の紹介
ivry_presentationmaterials
1
260
C++20 射影変換
faithandbrave
0
560
AWS CDKの推しポイント 〜CloudFormationと比較してみた〜
akihisaikeda
3
320
Hypervel - A Coroutine Framework for Laravel Artisans
albertcht
1
110
エンジニア向け採用ピッチ資料
inusan
0
180
関数型まつりレポート for JuliaTokai #22
antimon2
0
160
Featured
See All Featured
Mobile First: as difficult as doing things right
swwweet
223
9.7k
How to Ace a Technical Interview
jacobian
277
23k
Git: the NoSQL Database
bkeepers
PRO
430
65k
Raft: Consensus for Rubyists
vanstee
140
7k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
8
680
Bash Introduction
62gerente
614
210k
What’s in a name? Adding method to the madness
productmarketing
PRO
23
3.5k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
53k
Java REST API Framework Comparison - PWX 2021
mraible
31
8.7k
Become a Pro
speakerdeck
PRO
28
5.4k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
Building Better People: How to give real-time feedback that sticks.
wjessup
367
19k
Transcript
AWSͰ࡞ΔɺαʔόʔϨε σʔλੳج൫ߏங JAWS-UG৽ׁ#11 2022-01-15 @kasacchiful
Classmethod, Inc. Solutions Architect / Software Develper Favorite: Community: •
JAWS-UG Niigata • Python ML in Niigata • JaSST Niigata • ASTER • SWANII • etc. Hiroshi Kasahara @kasacchiful @kasacchiful 2
αʔόʔϨεͷੳج൫
σʔλੳʹ͓͚Δ֤छAWSαʔϏε
σʔλͷՃʗੳʹ AWS Lambda Մೳ
ෳࡶɾେنͳΒ AWS Step Functions Λ׆༻
αʔόʔϨεύλʔϯ IUUQTBXTBNB[PODPNKQTFSWFSMFTTQBUUFSOTTFSWFSMFTTQBUUFSO
Ϣʔεέʔεผʹύλʔϯ͕͋Δ IUUQTBXTBNB[PODPNKQTFSWFSMFTTQBUUFSOTTFSWFSMFTTQBUUFSO
ύλʔϯͷৄࡉBlack BeltͷࢿྉΛࢀߟʹ IUUQTEBXTTUBUJDDPNXFCJOBSTKQQEGTFSWJDFT@"84@#MBDL#FU@4FSWFSMFTT@6TFDBTF@1BUUFSOTQEG :PV5VCFͰͷղઆಈըIUUQTZPVUVCF)*M8ESC@Z.
S3ʹೖΕͯ͠·͑ɺͳΜͱ͔ͳΔ
αʔόʔϨεͰσʔλ࿈ܞ͢Δࡍʹ ϋϚͬͨͱ͜Ζ
Step FunctionsͷεςʔτϚγϯͰLambdaͷ ϫʔΫϑϩʔΛ੍ޚͯ͠ɺσʔλΛՃ
Step FunctionsͷεςʔτϚγϯͰLambdaͷ ϫʔΫϑϩʔΛ੍ޚͯ͠ɺσʔλΛՃ σʔλൃੜݩ͔ΒɺσʔλΛऔ ಘͯ͠4ʹอଘ ֤ϑΝΠϧຖʹɺ࠷ݶͷσʔ λՃΛͯ͠ɺ4ʹอଘ 2VJDL4JHIU #* ༻ʹ
ෳϑΝΠϧͷσʔλΛ·ͱΊ ͯదʹܗ͢Δ
͍Ζ͍ΖϋϚͬͨͱ͜Ζ 4ͭհ
1. ಛఆͷσʔλϑΝΠϧଟ͗͢
Έ: ͋ΔಛఆͷσʔλϑΝΠϧ͚ͩҟৗʹଟ͍ • 5ؒͷσʔλ͕1ϑΝΠϧʹ͋Δ • தϛϦඵ୯ҐͷϨίʔυ • ಛఆͷॲཧ͚͕͔͔ͩ࣌ؒΔ
• ݅ଟ͍σʔλɺBIʹग़ྗ͠ͳ͍߲ͩͬͨ • ࣍ॲཧ͔ΒΓͯ͠ɺຖ࣌ॲཧʹมߋ • ࣍ॲཧͷϘτϧωοΫΛআ͍ͨ ରॲ๏: ͋ΔಛఆͷσʔλϑΝΠϧ͚ͩɺຖ࣌ॲ ཧʹมߋ
2. AthenaͷΫΥʔλ
• σʔλҠߦ࣌ʹɺ࣍ॲཧͷ࠷ޙͷLambdaͰΤϥʔʹͳΔ • લஈͰॲཧͨ͠ෳσʔλΛAthenaͬͯSQLΫΤϦͰऔಘ͢Δͱ͜ΖͰ ্ݶʹҾ͔͔ͬΔ • Lambdaؔ1ͭʹ͖ͭɺɹstart-query-executionɹAPIΛ5ճίʔϧ • Ұ࣌తʹόʔετͰ্ݶ80·Ͱ૿͑Δ͚ͲɺσʔλҠߦ࣌ʹ20Ͱ಄ଧͪ •
্ݶ؇ਃ͢Ε্ݶ͋͛ΒΕΔ Έ: AthenaͷΫΤϦಉ࣮࣌ߦͷΫΥʔλʹ Ҿ͔͔ͬΔ
IUUQTEPDTBXTBNB[PODPNKB@KQTUFQGVODUJPOTMBUFTUEHMJNJUTPWFSWJFXIUNM
ରॲ๏: Step Functions ͷMapεςʔτͷ࠷େಉ ࣮࣌ߦΛઃఆ • Mapεςʔτ (ྻ͢ͱɺಉ࣮࣌ߦͰྻཁૉΛॲཧ͢ΔΠϝʔδ) ͷ࠷େಉ࣮࣌ߦΛઃఆ͠ɺAthenaͷ start-query-execution
APIίʔ ϧΛ࠷େ20·Ͱʹ͓͑͞Δ
Mapεςʔτʹ͍ͭͯɺҎԼͷهࣄΛࢀߟʹ IUUQTEFWDMBTTNFUIPEKQBSUJDMFTTUFQGVODUJPOTVQEBUFNBQTUBUF IUUQTEPDTBXTBNB[PODPNKB@KQTUFQGVODUJPOTMBUFTUEHBNB[POTUBUFTMBOHVBHFNBQTUBUFIUNM
3. Step FunctionsͷΫΥʔλ
Έ: Step FunctionsͷΠϕϯτཤྺ͕ΫΥʔ λʹҾ͔͔ͬΔ • ͋Δಛఆͷ͚ͩɺຖ࣌ॲཧͷϑΝΠϧ͕ҟৗʹଟ͍ • 1࣌ؒܦͬͯҟৗऴྃɻStep FunctionsͷΠϕϯτཤྺͷ্ݶ౸ୡ (25,000Πϕϯτ)
• ্ݶ؇ෆՄͷ߲ { "error": "States.Runtime", "cause": "The execution reached the maximum number of history events (25000)." }
IUUQTEPDTBXTBNB[PODPNKB@KQTUFQGVODUJPOTMBUFTUEHMJNJUTPWFSWJFXIUNM
ରॲ๏: Step Functions ͷεςʔτϚγϯΛೖΕ ࢠʹ • εςʔτϚγϯΛೖΕࢠʹ͢Δ͜ͱͰɺΠϕϯτཤྺ্ݶʹҾ͔͔ͬ Βͳ͍Α͏ʹͨ͠ • Lambdaͷಉ࣮࣌ߦ͕͔ͳΓ૿͑ΔͷͰɺҎԼͷରԠΛՃ
✓ Lambdaͷಉ࣮࣌ߦͷ্ݶ؇ਃ ✓ Step FunctionsͷMapεςʔτͷ࠷େಉ࣮࣌ߦΛઃఆ
มߋલ มߋޙ
มߋલ มߋޙ
4. Lambdaͷεέʔϧ͕͍͔ͭͳ͍
Έ: 1ճ͚ͩLambdaͷRateLimitΤϥʔʹૺ۰ • ಉ࣮࣌ߦͷΤϥʔͷΑ͏͚ͩͲ… • ͢Ͱʹಉ࣮࣌ߦͷ্ݶΛҾ্͖͍͛ͯΔͷͷɺ֤ؔͷϞχλϦ ϯάݟΔݶΓɺಉ࣮࣌ߦʹ౸ୡ͍ͯ͠ͳ͍ { "error": "Lambda.TooManyRequestsException",
"cause": "Rate Exceeded. (Service: Lambda, Status Code: 429, Request ID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, Extended Request ID: null)" }
IUUQTEPDTBXTBNB[PODPNKB@KQMBNCEBMBUFTUEHJOWPDBUJPOTDBMJOHIUNM
ରॲ๏: LambdaؔͷRetryઃఆΛݟ͠ • Step Functions ͷ Mapεςʔτͷ࠷େಉ࣮࣌ߦΛݟ͠ • Step Functions
Ͱఆٛ͢Δ Lambda ͷ Retry ઃఆΛݟ͠
Retry ͷִؒʹ͍ͭͯҎԼͷهࣄ͕ৄ͍͠ $ node -e '((i,m,b)=>{for(let w=i,c=0;c<m;c++){console.log(w+=(c==0?0:b**c))}})(2,7,1.85)' 2 3.85 7.272500000000001
13.604125000000002 25.317631250000005 46.987617812500005 87.07709295312502 IUUQTEFWDMBTTNFUIPEKQBSUJDMFTXBJU@UJNF@BOE@QBSBNT@JO@TUFQ@GVODUJPO@SFUSZ
Lambda ͷ Provisioned Concurrency ઃఆࠓճ ࣮ࢪͯ͠ͳ͍ IUUQTEFWDMBTTNFUIPEKQBSUJDMFTMBNCEBQSPWJTJPOFEDPODVSSFODZDPMETUBSU
σʔλͷՃʹ AWS Glueͱ͍͏αʔϏε͋ΔΑʁ
σʔλͷՃͳΒGlue͕͋Δ GlueΘͣʹɺΘ͟Θ͟Step Functions + LambdaͰΉඞཁ͋Δͷ͔ʁ • Step Functions + Lambdaͷ߹ɺΑ͘ΘΕΔ։ൃϑϨʔϜϫʔΫ͕͑ΔͷͰɺෳਓ
Ͱͷ։ൃ͕͍͢͠ɻ ✓ ࠓճ Serverless Framework ͬͨɻ • σʔλϑΝΠϧ͕ଟͯ͘ɺσʔλ1݅͋ͨΓͷ༰ྔ͕ͦ͜·Ͱେ͖͘ͳ͚Εɺ࣍ ୈͰLambdaͰॲཧ͕Ͱ͖Δɻ • LambdaͰΓΕͳ͍σʔλ༰ྔ࣮ߦ࣌ؒΛѻ͏߹ɺGlueͬͨํ͕͍͍ɻ ✓ ࠷େϝϞϦׂ: 10240MBɺ࠷େ࣮ߦ࣌ؒ: 15ɺ /tmp σΟϨΫτϦαΠζ: 512MB
͓·͚
͓·͚: AWS Data Wrangler͕ศར IUUQTHJUIVCDPNBXTMBCTBXTEBUBXSBOHMFS
͓·͚: AWS Data Wrangler͕ศར PandasͷػೳΛAWSʹ֦ு͢ΔɺΦʔϓϯιʔεͷPythonϥΠϒϥϦ • PandasσʔλϑϨʔϜͱAWSͷσʔλؔ࿈ͷαʔϏεͱΛ͏·͘ଓͯ͘͠Ε Δ ✓ Redshift
/ Glue / Athena / EMR ͳͲ • ௨ৗͷETLλεΫʹඞཁͳ͕ؔἧ͍ͬͯΔ
ҙ: ϑΝΠϧαΠζ͕େ͖ͯ͘ɺͦͷ··ͩ ͱLambdaʹΒͳ͍ • LambdaͷσϓϩΠύοέʔδඇѹॖ࣌ʹ250MBҎԼʹ͢Δඞཁ͕͋Δ ✓ AWS Data WranglerΛී௨ʹpipΠϯετʔϧ͢Δͱɺ250MB͑Δ •
GitHubͷReleaseϖʔδʹ͋ΔɺLambda Layer༻ͷzipϑΝΠϧΛར༻͠Α͏
·ͱΊ • αʔόʔϨεαʔϏεΛۦͯ͠ɺσʔλੳج൫ΛߏஙՄೳ • αʔόʔϨεͷΑ͋͘ΔΞʔΩςΫνϟύλʔϯΛ͏·͍͘͜ͳ͠ ·͠ΐ͏
͓͠·͍