
AWSにおけるデータ分析入門 / Introduction To Data Analytics In AWS

kumada15
October 06, 2021


Transcript

  1. Introduction to Data Analytics on AWS
    Relic Inc., Kan Kumada

  2. About me
    • Kan Kumada
    • Infrastructure engineer since the autumn of …
    • Joined Relic Inc. in …

  3. I want to do data analysis…

  4. I want to do business intelligence…

  5. Momentum for data analysis keeps building day by day

  6. But before that

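  7. What do you actually do in data analysis?
    • Collect data such as results and track records
      (e.g. how much was sold, how much traffic there was)
    • Derive some kind of insight from the collected data
      (e.g. what is selling, when, and to whom)
    • Take action on the insight
      (e.g. adjust prices, adjust what is displayed, widen the target audience)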
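
The collect, gain insight, act cycle on this slide can be sketched in a few lines of Python. The sales records, the field names, and the "promote the best seller" rule are all made-up assumptions for illustration; none of them come from the talk.

```python
from collections import Counter

# Step 1: collect. Hypothetical sales records (not data from the talk).
sales = [
    {"product": "tea", "units": 3, "hour": 9},
    {"product": "coffee", "units": 5, "hour": 9},
    {"product": "coffee", "units": 4, "hour": 15},
    {"product": "cake", "units": 1, "hour": 15},
]


def units_by_product(records):
    """Step 2: insight. Total units sold per product."""
    totals = Counter()
    for record in records:
        totals[record["product"]] += record["units"]
    return totals


def suggest_action(totals):
    """Step 3: action. A toy rule: promote whatever sells best."""
    best, units = totals.most_common(1)[0]
    return f"feature {best} more prominently ({units} units sold)"


totals = units_by_product(sales)
action = suggest_action(totals)
print(action)
```

In a real platform each step would be handled by a dedicated service, but the shape of the loop stays the same.
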
  8. What matters is being clear about what you want to achieve by analyzing data

  9. Data analytics services on AWS

  10. I see. I have no idea.

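  11. When building a data analytics platform
    • Collect data such as results and track records
      → How to collect it, and where to collect it
    • Derive some kind of insight from the collected data
      → Process it into an easily analyzed form, analyze it, visualize it

  12. A rough classification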

  13. Collection
    Amazon Kinesis, Amazon Kinesis Video Streams, Amazon Kinesis Data Streams,
    Amazon Kinesis Data Firehose, Amazon Managed Streaming for Apache Kafka,
    AWS Data Pipeline, AWS Data Exchange

  14. Storage
    Amazon Redshift, AWS Lake Formation, Amazon S3

  15. Processing
    Amazon EMR, AWS Glue, AWS Glue Elastic Views, AWS Glue DataBrew,
    Amazon Kinesis Data Analytics

  16. Analysis
    Amazon EMR, Amazon Athena, Amazon Kinesis Data Analytics, Amazon Redshift,
    Amazon QuickSight, Amazon OpenSearch Service

  17. Visualization
    Amazon Elasticsearch Service, Amazon QuickSight, Amazon OpenSearch Service
    Slide note: "a lot has happened" (Amazon Elasticsearch Service is now Amazon OpenSearch Service)
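
The rough classification on the preceding slides can also be written down as a small lookup table, which makes it easy to see which services span more than one stage. This is only a sketch: the stage keys and the `stages_for` helper are names chosen here for illustration, and service names are normalized to their official spellings.

```python
# Stage -> services, copied from the classification slides.
# The dictionary keys and the helper name are illustrative choices.
STAGES = {
    "collection": [
        "Amazon Kinesis Video Streams",
        "Amazon Kinesis Data Streams",
        "Amazon Kinesis Data Firehose",
        "Amazon Managed Streaming for Apache Kafka",
        "AWS Data Pipeline",
        "AWS Data Exchange",
    ],
    "storage": ["Amazon Redshift", "AWS Lake Formation", "Amazon S3"],
    "processing": [
        "Amazon EMR",
        "AWS Glue",
        "AWS Glue Elastic Views",
        "AWS Glue DataBrew",
        "Amazon Kinesis Data Analytics",
    ],
    "analysis": [
        "Amazon EMR",
        "Amazon Athena",
        "Amazon Kinesis Data Analytics",
        "Amazon Redshift",
        "Amazon QuickSight",
        "Amazon OpenSearch Service",
    ],
    "visualization": ["Amazon QuickSight", "Amazon OpenSearch Service"],
}


def stages_for(service):
    """Return every stage, in pipeline order, in which a service is listed."""
    return [stage for stage, services in STAGES.items() if service in services]


print(stages_for("Amazon Redshift"))
print(stages_for("Amazon Kinesis Data Analytics"))
```

Services that show up under two stages (Redshift, EMR, Kinesis Data Analytics, QuickSight, OpenSearch Service) are the ones where the stage boundaries blur.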

  18. I feel like a direction is starting to come into view, just a little

  19. A rough look at each one

  20. Collection

  21. Real-time streaming
    Amazon Kinesis, Amazon Kinesis Video Streams, Amazon Kinesis Data Streams,
    Amazon Kinesis Data Firehose, Amazon Managed Streaming for Apache Kafka,
    AWS Data Pipeline, AWS Data Exchange

  22. Real-time streaming services
    • Amazon Kinesis: umbrella name for the Kinesis services
    • Amazon Kinesis Video Streams: capture, process, and store streaming video
    • Amazon Kinesis Data Streams: capture, process, and store stream data
    • Amazon Kinesis Data Firehose: load stream data into AWS data stores
    • Amazon Managed Streaming for Apache Kafka: managed Apache Kafka for sending and receiving stream data

  23. Others
    Amazon Kinesis, Amazon Kinesis Video Streams, Amazon Kinesis Data Streams,
    Amazon Kinesis Data Firehose, Amazon Managed Streaming for Apache Kafka,
    AWS Data Pipeline, AWS Data Exchange

  24. Other collection services
    • AWS Data Pipeline: scheduled data movement and transformation
    • AWS Data Exchange: subscriptions to third-party data, such as article data provided by Reuters

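  25. Storage

  26. Storage services
    • Amazon Redshift: data warehouse. A warehouse that gathers and organizes huge volumes of "structured data" from systems
    • AWS Lake Formation: builds a data lake. Keeps raw data whose use has not yet been decided
    • Amazon S3: object storage. Stores "structured data", "unstructured data", and so on

  27. Processing and analysis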

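  28. Processing services
    • Amazon EMR: big data framework. Combines related OSS to run ETL, stream processing, and analysis on large volumes of data
    • AWS Glue: serverless ETL (extract/transform/load)
    • AWS Glue Elastic Views (preview): builds materialized views. Accesses multiple data stores to combine and copy data
    • AWS Glue DataBrew: no-code data cleanup and normalization

  29. Analysis services
    • Amazon Athena: run ad hoc queries against S3
    • Amazon Kinesis Data Analytics: transform and analyze streaming data
    • Amazon Redshift: data warehouse. Run complex SQL queries

  30. Visualization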

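  31. Visualization services
    • Amazon QuickSight: serverless BI tool / visualization
    • Amazon OpenSearch Service & Kibana: real-time data search / visualization

  32. When to use what (the services that seem most important)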

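  33. Amazon Kinesis Video Streams
    • You have devices or applications that generate video data
    • You want to stream live video or recorded media to browsers and smartphones over HLS
    • You want real-time two-way media streaming or web browser streaming
    • You want to use video data with Rekognition Video (video recognition) or SageMaker (ML)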

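  34. Amazon Kinesis Data Streams
    • You want to collect logs and event data generated by servers and devices quickly and in real time
    • You want to collect data with a latency of one second or less
    • You want to process streaming data with Lambda
    • You want to forward streaming data to EC2
    • You want to forward streaming data to Kinesis Data Analytics for real-time analysis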
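
As a rough illustration of what "collecting event data" into Kinesis Data Streams looks like from a producer, the sketch below builds the arguments for a PutRecord request. It assumes the boto3 Kinesis client's `put_record(StreamName=..., Data=..., PartitionKey=...)` call; the stream name `example-stream` and the event fields are hypothetical. The block only builds the request and does not contact AWS.

```python
import json


def build_put_record_kwargs(stream_name, event, partition_key):
    """Build keyword arguments for a Kinesis Data Streams PutRecord call.

    Data must be bytes; PartitionKey decides which shard receives the record.
    """
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event).encode("utf-8"),
        "PartitionKey": partition_key,
    }


# Hypothetical device event and stream name, for illustration only.
event = {"device_id": "sensor-1", "temperature": 21.5}
kwargs = build_put_record_kwargs("example-stream", event, event["device_id"])

# With AWS credentials and an existing stream, the record could be sent with:
#   import boto3
#   boto3.client("kinesis").put_record(**kwargs)
print(kwargs["StreamName"], kwargs["PartitionKey"])
```

Using the device ID as the partition key keeps each device's events in order within a shard, which is one common choice, not the only one.
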
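  35. Amazon Kinesis Data Firehose
    • You want to deliver stream data directly to S3, Redshift, or OpenSearch Service
    • You want to deliver data to those data stores in near real time (within 60 seconds)
    • You want to deliver data directly to service providers such as Datadog, New Relic, or MongoDB
    • You want to convert data to Apache Parquet or Apache ORC before delivering it to a data store
    • You want to deliver to data stores without developing an application or managing infrastructure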

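  36. Amazon Kinesis Data Analytics
    • You want to query streaming data in real time with standard SQL
    • You want to analyze streaming data in real time with sub-second latency
    • You want to do streaming ETL with Apache Flink, integrated with various AWS services
    • You want to build analysis applications in SQL, Java, Scala, or Python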

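  37. AWS Data Pipeline
    • Non-real-time workloads
    • You want to move data periodically between AWS storage, AWS compute, and on-premises systems
    • You want to apply simple transformations while moving data
    • You want to move data between stores, for example RDS → DynamoDB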

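  38. Amazon Redshift
    • You want to analyze structured and semi-structured data
    • You want to run complex SQL queries against large-scale (petabyte) data
    • There are no continuous writes or updates, and you want to analyze large-scale data in bulk
    • You want to run SQL queries directly against data in S3 using Redshift Spectrum
    • You want to save query results to S3 and use them from other AWS services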

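  39. Amazon Athena
    • Your data is in S3 and you want to run simple ad hoc queries
    • You want to query files in formats such as CSV, JSON, ORC, and Parquet
    • You want to run queries serverlessly
    • No ETL is needed
    • You want to output query results as CSV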
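
To make the ad hoc query case concrete, here is a minimal sketch of the arguments for an Athena query. It assumes the boto3 Athena client's `start_query_execution(QueryString=..., QueryExecutionContext=..., ResultConfiguration=...)` call; the database `example_db`, the table `access_logs`, and the bucket `example-bucket` are hypothetical. The block only builds the request and does not contact AWS.

```python
def build_athena_query_kwargs(sql, database, output_location):
    """Build keyword arguments for an Athena StartQueryExecution call.

    output_location is the S3 prefix where Athena writes the result file.
    """
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Hypothetical table, database, and bucket names, for illustration only.
sql = "SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status"
kwargs = build_athena_query_kwargs(
    sql, "example_db", "s3://example-bucket/athena-results/"
)

# With AWS credentials and a table defined over files in S3:
#   import boto3
#   boto3.client("athena").start_query_execution(**kwargs)
print(kwargs["ResultConfiguration"]["OutputLocation"])
```

For a SELECT like this one, Athena writes the results as a CSV file under the output location, which matches the last bullet above.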

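  40. AWS Lake Formation
    • You want to build a data lake easily
    • You want to keep raw data of any size in one place for future data analysis
    • You want to retain the raw data even after it has been processed
    • Various departments in the organization each want to analyze the data themselves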

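  41. Amazon EMR
    • You want to customize OSS flexibly for your data processing
    • You want to run ETL (extract/transform/load) on large data sets
    • You want to do ML with Apache Spark MLlib, TensorFlow, or Apache MXNet
    • You want to analyze clickstream data in S3 with Apache Spark or Apache Hive
    • You want to do real-time streaming with Apache Flink and Apache Spark Streaming

  42. AWS Glue
    • You want to run medium-scale ETL (extract/transform/load) serverlessly
    • You want to run ETL on data in Redshift, S3, RDS, DynamoDB, and so on
    • You want to crawl data sources periodically, update the Data Catalog, and transform data automatically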

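  43. Amazon OpenSearch Service & Kibana
    • You want to set up an OpenSearch cluster easily and analyze application log data
    • You want to make applications, websites, or data lake catalogs searchable
    • You want to collect infrastructure logs and metrics and visualize them in real time
    • You want to visualize stream data in real time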

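  44. Amazon QuickSight
    • You want to use a serverless BI tool
    • You want to visualize data from a variety of data sources
      (e.g. S3, RDS, Athena, Redshift, OpenSearch, CSV or JSON files)
    • You want periodic reports such as chart data rather than real-time views
    • You want to analyze data using a variety of charts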

  45. QuickSight visualization examples
    https://aws.amazon.com/jp/quicksight/gallery

  46. None
  47. None
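  48. Points to consider when selecting services

  49. Summary

  50. Summary
    The purpose affects the granularity of collection, analysis, and visualization, so be clear about what the analysis is for

  51. Thank you very much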