Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
AWSで作る、サーバーレスデータ分析基盤構築 / jawsug-niigata-11
Search
kasacchiful
January 15, 2022
Programming
1
370
AWSで作る、サーバーレスデータ分析基盤構築 / jawsug-niigata-11
JAWS-UG新潟#11で発表した資料です。
kasacchiful
January 15, 2022
Tweet
Share
More Decks by kasacchiful
See All by kasacchiful
Amazon S3 TablesとAmazon S3 Metadataを触ってみた / 20250201-jawsug-tochigi-s3tables-s3metadata
kasacchiful
0
180
Amazon S3 TablesとAmazon S3 Metadataを動かしてみた / 20250125-niigata-5min-tech-lt
kasacchiful
0
19
dbt coreとFargateでデータ変換 / 20240928-jawsug-toyama-hokuriku-shinkansen
kasacchiful
1
96
What we keep in mind when migrating from Serverless Framework to AWS CDK and AWS SAM
kasacchiful
1
350
AWSでIcebergを使ってデータウェアハウスを構築してみる / 20240810-jawsug-akita
kasacchiful
0
42
サーバーレスパターンを元にAWS CDKでデータ基盤を構築する / 20240731_classmethod_odyssey_online_build_a_data_infrastructures_using_aws_cdk_based_on_serverless_patterns
kasacchiful
0
500
AWS IoT 1-clickがサービス終了するので、SORACOMに移行した話 / 20240518-jawsug-niigata-iotlt-niigata
kasacchiful
0
270
AWS Application Composerで始める、 サーバーレスなデータ基盤構築 / 20240406-jawsug-hokuriku-shinkansen
kasacchiful
1
580
AWSの各種サービス紹介と活用方法 − AI・ML活用デモを交えて − / 20231208aws-aiml-seminar
kasacchiful
0
540
Other Decks in Programming
See All in Programming
『GO』アプリ バックエンドサーバのコスト削減
mot_techtalk
0
150
2025.2.14_Developers Summit 2025_登壇資料
0101unite
0
130
Introduction to kotlinx.rpc
arawn
0
750
Serverless Rust: Your Low-Risk Entry Point to Rust in Production (and the benefits are huge)
lmammino
1
140
CloudNativePGを布教したい
nnaka2992
0
100
Amazon Q Developer Proで効率化するAPI開発入門
seike460
PRO
0
120
Rubyで始める関数型ドメインモデリング
shogo_tksk
0
130
プログラミング言語学習のススメ / why-do-i-learn-programming-language
yashi8484
0
150
Datadog DBMでなにができる? JDDUG Meetup#7
nealle
0
120
昭和の職場からアジャイルの世界へ
kumagoro95
1
410
Formの複雑さに立ち向かう
bmthd
1
900
密集、ドキュメントのコロケーション with AWS Lambda
satoshi256kbyte
1
210
Featured
See All Featured
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
47
5.2k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
160
15k
A better future with KSS
kneath
238
17k
Done Done
chrislema
182
16k
A Tale of Four Properties
chriscoyier
158
23k
Bootstrapping a Software Product
garrettdimon
PRO
306
110k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
45
9.4k
jQuery: Nuts, Bolts and Bling
dougneiner
63
7.7k
Keith and Marios Guide to Fast Websites
keithpitt
411
22k
Designing Experiences People Love
moore
140
23k
Build your cross-platform service in a week with App Engine
jlugia
229
18k
Six Lessons from altMBA
skipperchong
27
3.6k
Transcript
AWSͰ࡞ΔɺαʔόʔϨε σʔλੳج൫ߏங JAWS-UG৽ׁ#11 2022-01-15 @kasacchiful
Classmethod, Inc. Solutions Architect / Software Develper Favorite: Community: •
JAWS-UG Niigata • Python ML in Niigata • JaSST Niigata • ASTER • SWANII • etc. Hiroshi Kasahara @kasacchiful @kasacchiful 2
αʔόʔϨεͷੳج൫
σʔλੳʹ͓͚Δ֤छAWSαʔϏε
σʔλͷՃʗੳʹ AWS Lambda Մೳ
ෳࡶɾେنͳΒ AWS Step Functions Λ׆༻
αʔόʔϨεύλʔϯ IUUQTBXTBNB[PODPNKQTFSWFSMFTTQBUUFSOTTFSWFSMFTTQBUUFSO
Ϣʔεέʔεผʹύλʔϯ͕͋Δ IUUQTBXTBNB[PODPNKQTFSWFSMFTTQBUUFSOTTFSWFSMFTTQBUUFSO
ύλʔϯͷৄࡉBlack BeltͷࢿྉΛࢀߟʹ IUUQTEBXTTUBUJDDPNXFCJOBSTKQQEGTFSWJDFT@"84@#MBDL#FU@4FSWFSMFTT@6TFDBTF@1BUUFSOTQEG :PV5VCFͰͷղઆಈըIUUQTZPVUVCF)*M8ESC@Z.
S3ʹೖΕͯ͠·͑ɺͳΜͱ͔ͳΔ
αʔόʔϨεͰσʔλ࿈ܞ͢Δࡍʹ ϋϚͬͨͱ͜Ζ
Step FunctionsͷεςʔτϚγϯͰLambdaͷ ϫʔΫϑϩʔΛ੍ޚͯ͠ɺσʔλΛՃ
Step FunctionsͷεςʔτϚγϯͰLambdaͷ ϫʔΫϑϩʔΛ੍ޚͯ͠ɺσʔλΛՃ σʔλൃੜݩ͔ΒɺσʔλΛऔ ಘͯ͠4ʹอଘ ֤ϑΝΠϧຖʹɺ࠷ݶͷσʔ λՃΛͯ͠ɺ4ʹอଘ 2VJDL4JHIU #* ༻ʹ
ෳϑΝΠϧͷσʔλΛ·ͱΊ ͯదʹܗ͢Δ
͍Ζ͍ΖϋϚͬͨͱ͜Ζ 4ͭհ
1. ಛఆͷσʔλϑΝΠϧଟ͗͢
Έ: ͋ΔಛఆͷσʔλϑΝΠϧ͚ͩҟৗʹଟ͍ • 5ؒͷσʔλ͕1ϑΝΠϧʹ͋Δ • தϛϦඵ୯ҐͷϨίʔυ • ಛఆͷॲཧ͚͕͔͔ͩ࣌ؒΔ
• ݅ଟ͍σʔλɺBIʹग़ྗ͠ͳ͍߲ͩͬͨ • ࣍ॲཧ͔ΒΓͯ͠ɺຖ࣌ॲཧʹมߋ • ࣍ॲཧͷϘτϧωοΫΛআ͍ͨ ରॲ๏: ͋ΔಛఆͷσʔλϑΝΠϧ͚ͩɺຖ࣌ॲ ཧʹมߋ
2. AthenaͷΫΥʔλ
• σʔλҠߦ࣌ʹɺ࣍ॲཧͷ࠷ޙͷLambdaͰΤϥʔʹͳΔ • લஈͰॲཧͨ͠ෳσʔλΛAthenaͬͯSQLΫΤϦͰऔಘ͢Δͱ͜ΖͰ ্ݶʹҾ͔͔ͬΔ • Lambdaؔ1ͭʹ͖ͭɺɹstart-query-executionɹAPIΛ5ճίʔϧ • Ұ࣌తʹόʔετͰ্ݶ80·Ͱ૿͑Δ͚ͲɺσʔλҠߦ࣌ʹ20Ͱ಄ଧͪ •
্ݶ؇ਃ͢Ε্ݶ͋͛ΒΕΔ Έ: AthenaͷΫΤϦಉ࣮࣌ߦͷΫΥʔλʹ Ҿ͔͔ͬΔ
IUUQTEPDTBXTBNB[PODPNKB@KQTUFQGVODUJPOTMBUFTUEHMJNJUTPWFSWJFXIUNM
ରॲ๏: Step Functions ͷMapεςʔτͷ࠷େಉ ࣮࣌ߦΛઃఆ • Mapεςʔτ (ྻ͢ͱɺಉ࣮࣌ߦͰྻཁૉΛॲཧ͢ΔΠϝʔδ) ͷ࠷େಉ࣮࣌ߦΛઃఆ͠ɺAthenaͷ start-query-execution
APIίʔ ϧΛ࠷େ20·Ͱʹ͓͑͞Δ
Mapεςʔτʹ͍ͭͯɺҎԼͷهࣄΛࢀߟʹ IUUQTEFWDMBTTNFUIPEKQBSUJDMFTTUFQGVODUJPOTVQEBUFNBQTUBUF IUUQTEPDTBXTBNB[PODPNKB@KQTUFQGVODUJPOTMBUFTUEHBNB[POTUBUFTMBOHVBHFNBQTUBUFIUNM
3. Step FunctionsͷΫΥʔλ
Έ: Step FunctionsͷΠϕϯτཤྺ͕ΫΥʔ λʹҾ͔͔ͬΔ • ͋Δಛఆͷ͚ͩɺຖ࣌ॲཧͷϑΝΠϧ͕ҟৗʹଟ͍ • 1࣌ؒܦͬͯҟৗऴྃɻStep FunctionsͷΠϕϯτཤྺͷ্ݶ౸ୡ (25,000Πϕϯτ)
• ্ݶ؇ෆՄͷ߲ { "error": "States.Runtime", "cause": "The execution reached the maximum number of history events (25000)." }
IUUQTEPDTBXTBNB[PODPNKB@KQTUFQGVODUJPOTMBUFTUEHMJNJUTPWFSWJFXIUNM
ରॲ๏: Step Functions ͷεςʔτϚγϯΛೖΕ ࢠʹ • εςʔτϚγϯΛೖΕࢠʹ͢Δ͜ͱͰɺΠϕϯτཤྺ্ݶʹҾ͔͔ͬ Βͳ͍Α͏ʹͨ͠ • Lambdaͷಉ࣮࣌ߦ͕͔ͳΓ૿͑ΔͷͰɺҎԼͷରԠΛՃ
✓ Lambdaͷಉ࣮࣌ߦͷ্ݶ؇ਃ ✓ Step FunctionsͷMapεςʔτͷ࠷େಉ࣮࣌ߦΛઃఆ
มߋલ มߋޙ
มߋલ มߋޙ
4. Lambdaͷεέʔϧ͕͍͔ͭͳ͍
Έ: 1ճ͚ͩLambdaͷRateLimitΤϥʔʹૺ۰ • ಉ࣮࣌ߦͷΤϥʔͷΑ͏͚ͩͲ… • ͢Ͱʹಉ࣮࣌ߦͷ্ݶΛҾ্͖͍͛ͯΔͷͷɺ֤ؔͷϞχλϦ ϯάݟΔݶΓɺಉ࣮࣌ߦʹ౸ୡ͍ͯ͠ͳ͍ { "error": "Lambda.TooManyRequestsException",
"cause": "Rate Exceeded. (Service: Lambda, Status Code: 429, Request ID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, Extended Request ID: null)" }
IUUQTEPDTBXTBNB[PODPNKB@KQMBNCEBMBUFTUEHJOWPDBUJPOTDBMJOHIUNM
ରॲ๏: LambdaؔͷRetryઃఆΛݟ͠ • Step Functions ͷ Mapεςʔτͷ࠷େಉ࣮࣌ߦΛݟ͠ • Step Functions
Ͱఆٛ͢Δ Lambda ͷ Retry ઃఆΛݟ͠
Retry ͷִؒʹ͍ͭͯҎԼͷهࣄ͕ৄ͍͠ $ node -e '((i,m,b)=>{for(let w=i,c=0;c<m;c++){console.log(w+=(c==0?0:b**c))}})(2,7,1.85)' 2 3.85 7.272500000000001
13.604125000000002 25.317631250000005 46.987617812500005 87.07709295312502 IUUQTEFWDMBTTNFUIPEKQBSUJDMFTXBJU@UJNF@BOE@QBSBNT@JO@TUFQ@GVODUJPO@SFUSZ
Lambda ͷ Provisioned Concurrency ઃఆࠓճ ࣮ࢪͯ͠ͳ͍ IUUQTEFWDMBTTNFUIPEKQBSUJDMFTMBNCEBQSPWJTJPOFEDPODVSSFODZDPMETUBSU
σʔλͷՃʹ AWS Glueͱ͍͏αʔϏε͋ΔΑʁ
σʔλͷՃͳΒGlue͕͋Δ GlueΘͣʹɺΘ͟Θ͟Step Functions + LambdaͰΉඞཁ͋Δͷ͔ʁ • Step Functions + Lambdaͷ߹ɺΑ͘ΘΕΔ։ൃϑϨʔϜϫʔΫ͕͑ΔͷͰɺෳਓ
Ͱͷ։ൃ͕͍͢͠ɻ ✓ ࠓճ Serverless Framework ͬͨɻ • σʔλϑΝΠϧ͕ଟͯ͘ɺσʔλ1݅͋ͨΓͷ༰ྔ͕ͦ͜·Ͱେ͖͘ͳ͚Εɺ࣍ ୈͰLambdaͰॲཧ͕Ͱ͖Δɻ • LambdaͰΓΕͳ͍σʔλ༰ྔ࣮ߦ࣌ؒΛѻ͏߹ɺGlueͬͨํ͕͍͍ɻ ✓ ࠷େϝϞϦׂ: 10240MBɺ࠷େ࣮ߦ࣌ؒ: 15ɺ /tmp σΟϨΫτϦαΠζ: 512MB
͓·͚
͓·͚: AWS Data Wrangler͕ศར IUUQTHJUIVCDPNBXTMBCTBXTEBUBXSBOHMFS
͓·͚: AWS Data Wrangler͕ศར PandasͷػೳΛAWSʹ֦ு͢ΔɺΦʔϓϯιʔεͷPythonϥΠϒϥϦ • PandasσʔλϑϨʔϜͱAWSͷσʔλؔ࿈ͷαʔϏεͱΛ͏·͘ଓͯ͘͠Ε Δ ✓ Redshift
/ Glue / Athena / EMR ͳͲ • ௨ৗͷETLλεΫʹඞཁͳ͕ؔἧ͍ͬͯΔ
ҙ: ϑΝΠϧαΠζ͕େ͖ͯ͘ɺͦͷ··ͩ ͱLambdaʹΒͳ͍ • LambdaͷσϓϩΠύοέʔδඇѹॖ࣌ʹ250MBҎԼʹ͢Δඞཁ͕͋Δ ✓ AWS Data WranglerΛී௨ʹpipΠϯετʔϧ͢Δͱɺ250MB͑Δ •
GitHubͷReleaseϖʔδʹ͋ΔɺLambda Layer༻ͷzipϑΝΠϧΛར༻͠Α͏
·ͱΊ • αʔόʔϨεαʔϏεΛۦͯ͠ɺσʔλੳج൫ΛߏஙՄೳ • αʔόʔϨεͷΑ͋͘ΔΞʔΩςΫνϟύλʔϯΛ͏·͍͘͜ͳ͠ ·͠ΐ͏
͓͠·͍