AWSにおけるデータ分析入門 / Introduction To Data Analytics In AWS
hedgehog051
October 06, 2021
Transcript
Introduction to Data Analytics on AWS
Relic Inc., Kumada
Self-introduction
• Kumada (Kan Kumada)
• Infrastructure engineer since autumn [year missing]
• Joined Relic Inc. in [month missing]
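I want to do data analytics...
I want to bring intelligence to the business...
Momentum for data analytics is growing day by day
But before that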
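What does data analytics actually involve?
• Collect data such as results and track records (for example, how much was sold and how much traffic there was)
• Derive some kind of insight from the collected data (for example, what is selling, when, and to whom)
• Act on the insights obtained (for example, adjust prices, adjust what is displayed, or expand the target audience)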
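As a minimal sketch of the middle step (turning collected data into insight), the snippet below aggregates a handful of sales records to answer "what is selling, when, and to whom". The records, field names, and the `summarize` helper are hypothetical and exist only for illustration; they are not from the slides.

```python
from collections import Counter

# Hypothetical sales records standing in for the "collected data".
SALES = [
    {"item": "mug", "buyer": "student", "hour": 9, "amount": 800},
    {"item": "mug", "buyer": "office worker", "hour": 12, "amount": 800},
    {"item": "tote bag", "buyer": "student", "hour": 12, "amount": 1500},
    {"item": "mug", "buyer": "student", "hour": 20, "amount": 800},
]


def summarize(records):
    """Derive simple insights: what sells, when, and to whom."""
    units_by_item = Counter(r["item"] for r in records)
    units_by_hour = Counter(r["hour"] for r in records)
    units_by_buyer = Counter(r["buyer"] for r in records)
    return {
        "total_sales": sum(r["amount"] for r in records),
        "best_item": units_by_item.most_common(1)[0][0],
        "busiest_hour": units_by_hour.most_common(1)[0][0],
        "top_buyer": units_by_buyer.most_common(1)[0][0],
    }


insight = summarize(SALES)
print(insight)
```

The "action" step would then use a result like `best_item` or `busiest_hour` to decide, for example, which price or display to adjust.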
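It is important to be clear about what you want to achieve through data analytics
Data analytics services on AWS
I see. I don't get it.
When building a data analytics platform
• Collect data such as results and track records → how to collect it and where to collect it
• Derive some kind of insight from the collected data → process it for easier analysis, analyze it, and visualize it
A rough classification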
Collection: Amazon Kinesis, Amazon Kinesis Video Streams, Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, Amazon Managed Streaming for Apache Kafka, AWS Data Pipeline, AWS Data Exchange
Storage: Amazon Redshift, AWS Lake Formation, Amazon S3
Processing: Amazon EMR, AWS Glue, AWS Glue Elastic Views, AWS Glue DataBrew, Amazon Kinesis Data Analytics
Analysis: Amazon EMR, Amazon Athena, Amazon Kinesis Data Analytics, Amazon Redshift, Amazon QuickSight, Amazon OpenSearch Service
Visualization: Amazon Elasticsearch Service, Amazon QuickSight, Amazon OpenSearch Service
There are a lot of them
It feels like the direction is becoming a little clearer
A rough look at each category
Collection
Real-time streaming
• Amazon Kinesis: umbrella name for the Kinesis services
• Amazon Kinesis Video Streams: capture, process, and store streaming video
• Amazon Kinesis Data Streams: capture, process, and store streaming data
• Amazon Kinesis Data Firehose: load streaming data into AWS data stores
• Amazon Managed Streaming for Apache Kafka: managed Apache Kafka for sending and receiving streaming data
Other
• AWS Data Pipeline: scheduled data movement and transformation
• AWS Data Exchange: subscriptions to third-party data, such as article data provided by Reuters
Storage
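• Amazon Redshift: data warehouse; a warehouse that gathers and organizes huge volumes of "structured data" from systems
• AWS Lake Formation: builds a data lake; stores raw data whose use has not yet been decided
• Amazon S3: object storage; stores "structured data", "unstructured data", and so on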
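Processing and analysis
• Amazon EMR: big data framework; combines related OSS to run ETL, stream processing, and analysis on large volumes of data
• AWS Glue: serverless ETL (extract/transform/load)
• AWS Glue Elastic Views (preview): builds materialized views; accesses multiple data stores to combine and copy data
• AWS Glue DataBrew: no-code data cleanup and normalization
• Amazon Athena: runs ad hoc queries against S3
• Amazon Kinesis Data Analytics: transforms and analyzes streaming data
• Amazon Redshift: data warehouse; runs complex SQL queries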
Visualization
• Amazon QuickSight: serverless BI tool and visualization
• Amazon OpenSearch Service & Kibana: real-time data search and visualization
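When do you use which? (the ones that seem most important)
Amazon Kinesis Video Streams
• You have devices that generate video data, or applications that need to handle it
• You want to stream live or recorded video to browsers and smartphones over HLS
• You want real-time two-way media streaming or web browser streaming
• You want to feed video data into Amazon Rekognition Video (video recognition) or SageMaker (ML)
Amazon Kinesis Data Streams
• You want to collect logs and event data generated by servers and devices at high speed in real time
• You want to collect data with a latency of one second or less
• You want to process streaming data with Lambda
• You want to forward streaming data to EC2
• You want to forward streaming data to Kinesis Data Analytics for real-time analysis
Amazon Kinesis Data Firehose
• You want to deliver streaming data to S3, Redshift, or OpenSearch Service
• You want to deliver data to those data stores in near real time (within 60 seconds)
• You want to deliver data to service providers such as Datadog, New Relic, and MongoDB
• You want to convert data to Apache Parquet or Apache ORC before delivering it to a data store
• You want to deliver to data stores without developing an application or managing infrastructure
Amazon Kinesis Data Analytics
• You want to query streaming data in real time with standard SQL
• You want to analyze streaming data in real time with sub-second latency
• You want to use Apache Flink, integrated with various AWS services, for streaming ETL
• You want to build analytics applications in SQL, Java, Scala, or Python
AWS Data Pipeline
• Non-real-time
• You want to periodically move data between AWS storage and compute services and on-premises systems
• You want to apply simple transformations while moving data
• You want to move data, for example from RDS to DynamoDB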
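The collection-side criteria above can be sketched as a small helper. This is only an illustration of the rules of thumb on the slides (one second or less for Kinesis Data Streams, within 60 seconds for Kinesis Data Firehose, scheduled movement for AWS Data Pipeline); the function name and parameters are hypothetical, and a real selection would weigh cost, destinations, and operations as well.

```python
def suggest_collection_service(tolerated_delay_seconds, video=False, scheduled_batch=False):
    """Suggest a collection service from the slide criteria (illustrative only)."""
    if video:
        # Video data from devices goes through Kinesis Video Streams.
        return "Amazon Kinesis Video Streams"
    if scheduled_batch:
        # Non-real-time, periodic data movement.
        return "AWS Data Pipeline"
    if tolerated_delay_seconds < 60:
        # Firehose delivers within roughly 60 seconds, so tighter
        # requirements point to Data Streams (one second or less).
        return "Amazon Kinesis Data Streams"
    # Near real time is acceptable: managed delivery to data stores.
    return "Amazon Kinesis Data Firehose"


print(suggest_collection_service(0.5))
print(suggest_collection_service(60))
print(suggest_collection_service(3600, scheduled_batch=True))
```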
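Amazon Redshift
• You want to analyze structured and semi-structured data
• You want to run complex SQL queries against large-scale (petabyte) data
• There are no continuous writes or updates, and you want to analyze large-scale data in bulk
• You want to run SQL queries against data in S3 using Redshift Spectrum
• You want to save query results to S3 for use by other AWS services
Amazon Athena
• Your data is in S3 and you want to run simple ad hoc queries
• You want to query files in formats such as CSV, JSON, ORC, and Parquet
• You want to run queries serverlessly
• No ETL required
• You want to output query results as CSV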
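As a rough sketch of the Athena case (an ad hoc SQL query over files in S3, with results written to S3 as CSV), the helper below builds the keyword arguments accepted by boto3's Athena `start_query_execution` call. The `build_athena_request` helper, the `sales` table, the `analytics_demo` database, and the bucket name are all hypothetical. The snippet does not contact AWS; submitting the request assumes boto3 is installed and credentials are configured.

```python
def build_athena_request(sql, database, output_s3_uri):
    """Return keyword arguments for boto3's Athena start_query_execution call."""
    if not output_s3_uri.startswith("s3://"):
        raise ValueError("output location must be an s3:// URI")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes the query result file under this S3 location.
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


request = build_athena_request(
    "SELECT item, COUNT(*) AS units FROM sales "
    "GROUP BY item ORDER BY units DESC LIMIT 10",
    database="analytics_demo",                            # hypothetical database
    output_s3_uri="s3://example-bucket/athena-results/",  # hypothetical bucket
)
print(request)

# With AWS credentials configured, the request could then be submitted with:
#   import boto3
#   boto3.client("athena").start_query_execution(**request)
```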
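AWS Lake Formation
• You want to build a data lake easily
• You want to keep raw data of any size in one place for future data analytics
• You want to retain the raw data even after processing it
• Various departments in the organization each want to use the data for their own analysis
Amazon EMR
• You want to flexibly customize OSS for data processing
• You want to run ETL (extract/transform/load) on large datasets
• You want to do ML with Apache Spark MLlib, TensorFlow, or Apache MXNet
• You want to analyze clickstream data in S3 with Apache Spark or Apache Hive
• You want real-time streaming with Apache Flink and Apache Spark Streaming
AWS Glue
• You want serverless, medium-scale ETL (extract/transform/load)
• You want to run ETL on data in Redshift, S3, RDS, DynamoDB, and so on
• You want to crawl data sources periodically, update the Data Catalog, and transform data automatically
Amazon OpenSearch Service & Kibana
• You want to build an OpenSearch cluster easily and analyze application log data
• You want to make applications, websites, and data lake catalogs searchable
• You want to collect infrastructure logs and metrics and visualize them in real time
• You want to visualize streaming data in real time
Amazon QuickSight
• You want to use a serverless BI tool
• You want to visualize data from various data sources (S3, RDS, Athena, Redshift, OpenSearch, CSV, JSON, and so on)
• You want periodic reports of graphs and data rather than real-time views
• You want to analyze data using a variety of graphs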
QuickSight visualization examples: https://aws.amazon.com/jp/quicksight/gallery/
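Points to consider when selecting services
Summary
The purpose affects the granularity of collection, analysis, and visualization, so be clear about what the analysis is for
Thank you very much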