Slide 1

Slide 1 text

AWS Serverless ͱϑΝΠϧ Step Functions ͱ Lambda Function Ͱ૊Ή Athena ʹΑΔϑΝΠϧॲཧ #serverlessjp

Slide 2

Slide 2 text

ɹαʔόʔϨεͰϑΝΠϧ to ϑΝΠϧͷETL ࠓ೔ͷ࿩

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

αʔόʔϨεͰETLͱ͍͑͹ AWS Glue ?

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

Amazon Athena Ͱσʔλநग़ʁ

Slide 7

Slide 7 text

Θͨ͠ͷ"84(MVF΍"NB[PO"UIFOBͷୈҰҹ৅ "84ίϯιʔϧ͔Β࢖ͬͨ͜ͱ͋Δ໊͠લ΋஌ͬͯΔ ͚ͲɺΞϓϦέʔγϣϯʹ૊ΈࠐΉΠϝʔδ͕͋·ΓΘ ͔ͳ͍

Slide 8

Slide 8 text

Θͨ͠ͷ"84(MVF΍"NB[PO"UIFOBͷୈҰҹ৅ αʔϏεͷ࣮ମ͕Α͘Θ͔ΒΜ 4ʢΩʔόϦϡʔͷʣετϨʔδ %ZOBNP%#ɿ/P42- Ћ ͷϑϧϚωʔδυσʔλϕʔε "84"QQ4ZODɿ(SBQI2-"1*ͷήʔτ΢ΣΠ 4UFQ'VODUJPOTɿλεΫΛఆٛ͠ঢ়ଶભҠͱͯ͠δϣϒΛ૊Έཱͯ "84(MVFɿͳΜ͔ม׵͢Δਓʁ "NB[PO"UIFOBɿ4ͷσʔλʹରͯ͠ΫΤϦൃߦͯ͘͠ΕΔਓʁ

Slide 9

Slide 9 text

ຊηογϣϯͷ໨త "84(MVFͱ"NB[PO"UIFOBͷ͜ͱΛ஌ΓɺαʔόʔϨ εΞϓϦέʔγϣϯʹ૊ΈࠐΉΠϝʔδΛ࣋ͬͯؼͬͯ΋Β͏ Ξϓϩʔνɿ "84(MVFͱ"NB[PO"UIFOBͷ͜ͱΛ஌Δ ϑΝΠϧUPϑΝΠϧͷ&5-ϢʔεέʔεΛߟ͑Δ "84$%,ͰϦιʔεΛ࡞੒ͯ͠ΈΔ

Slide 10

Slide 10 text

Ϋϥϝιͷ࿨ాͰ͢αʔόʔϨε޷͖Ͱ͢ ࿨ా༞հ $9ࣄۀຊ෦ αʔόʔϨεΤϯδχΞ XBEEZ@V ༗ޮظݶ੾Ε

Slide 11

Slide 11 text

$POUFOUT "84(MVFͱ"NB[PO"UIFOBͷؔ܎ αϯϓϧϢʔεέʔεɿΠϕϯτ؅ཧαʔϏε "84$%,ʹΑΔ&5-δϣϒ૊Έཱͯ

Slide 12

Slide 12 text

"84(MVF ߏ੒ཁૉ͸ɺσʔλΧλϩάͱαʔόʔϨε&5-ɺΦʔ έετϨʔγϣϯͷ̏ͭ ߏ੒ཁૉ Ϧιʔε໊௨শ ʢ$MPVE'PSNBUJPOΑΓʣ σʔλΧλϩά "84(MVF5BCMF αʔόʔϨε&5- "84(MVF+PC ΦʔέετϨʔγϣϯ "84(MVF8PSLqPX ࢀߟɿAWS CloudFormation ςϯϓϨʔτΛ࢖༻ͨ͠σʔλΧλϩάͷࣄલઃఆ - https://docs.aws.amazon.com/ja_jp/glue/latest/dg/populate-with-cloudformation-templates.html

Slide 13

Slide 13 text

"84(MVF ߏ੒ཁૉ͸ɺσʔλΧλϩάͱαʔόʔϨε&5-ɺΦʔ έετϨʔγϣϯͷ̏ͭ ߏ੒ཁૉ Ϧιʔε໊௨শ ʢ$MPVE'PSNBUJPOΑΓʣ σʔλΧλϩά "84(MVF5BCMF αʔόʔϨε&5- "84(MVF+PC ΦʔέετϨʔγϣϯ "84(MVF8PSLqPX ࢀߟɿAWS CloudFormation ςϯϓϨʔτΛ࢖༻ͨ͠σʔλΧλϩάͷࣄલઃఆ - https://docs.aws.amazon.com/ja_jp/glue/latest/dg/populate-with-cloudformation-templates.html

Slide 14

Slide 14 text

Glue Table

Slide 15

Slide 15 text

"84(MVF5BCMF "QBDIF)JWFޓ׵ͷϝλετΞ ΋͏গ͠Θ͔Γ΍͘͢ɿεΩʔϚఆ͚ٛͩΛ(MVFʹ΋ͨͤΒΕΔ σʔλͷ࣮ମ͸ผʹ࣋ͨͤɺ(MVFͱͯ͠͸ఆ͚͓ٛͩ࣋ͬͯ͘ (MVF5BCMF͸͔͋ͨ΋εΩʔϚఆٛΛ΋ͬͨςʔϒϧͷΑ͏ʹѻ͑ Δ ྫɿ"NB[PO"UIFOBͰΫΤϦͨ͠Γɺ(MVF+PCͰΧϥϜ໊͔Β σʔλΛͱ͖ͬͯͨΓͰ͖Δ

Slide 16

Slide 16 text

"84(MVF5BCMF σʔλͷ࣮ମ 4ͳͲ εΩʔϚఆٛͳͲͷϝλσʔ λ 20190806 AWS Black Belt Online Seminar AWS Glue https://www.slideshare.net/AmazonWebServicesJapan/20190806-aws-black-belt-online-seminar-aws-glue?ref=https://aws.amazon.com/jp/blogs/news/webinar-bb-aws-glue-2019/

Slide 17

Slide 17 text

Glue Job

Slide 18

Slide 18 text

"84(MVF+PC &5-ʢσʔλม׵ॲཧʣͷຊମ σʔλΧλϩάʢ(MVF5BCMF ͔ΒσʔλΛநग़͠ɺϩδο ΫΛద༻ͯ͠ɺผͷσʔλετΞʢλʔήοτʣʹग़ྗ͢Δ

Slide 19

Slide 19 text

"84(MVF+PC εΫϦϓτ͸1ZUIPO4IFMMͱ4QBSLͰબ୒ "84(MVF+PC 1ZUIPO4IFMM "84(MVF+PC 4QBSL ʢ$MPVE'PSNBUJPOΑΓʣ ϝϞϦ (#PS(# (# ࣮ߦ࣌ؒ σϑΥϧτ࣌ؒ σϑΥϧτ࣌ؒ ಉ࣮࣌ߦ਺ ΞΧ΢ϯτ಺Ͱ ΞΧ΢ϯτ಺Ͱ ՝ֹۚ ੑೳͱ࣮ߦ࣌ؒϕʔεͷैྔ՝ۚ ͨͩ͠গͳ͘ͱ΋෼ʹ੾Γ্͛ ੑೳͱ࣮ߦ࣌ؒϕʔεͷैྔ՝ۚ ͨͩ͠গͳ͘ͱ΋෼ʹ੾Γ্͛ ىಈ࣌ؒ ਺ेඵ਺෼ ਺ेඵ਺෼ ݴޠ 1ZUIPO 1ZUIPOPS4DBMB AWS GlueΛ࢖ͬͨ Serverless ETL ͷ࣮૷ύλʔϯ - Speaker Deck https://speakerdeck.com/seiichi1101/aws-gluewoshi-tuta-serverless-etl-falseshi-zhuang-patan-ac5553a5-f8e9-45fb-b093-4bcf0b113a48?slide=41

Slide 20

Slide 20 text

Glue Workflow

Slide 21

Slide 21 text

"84(MVF8PSLqPX ΫϩʔϥͱδϣϒΛ࢖ͬͯҰ࿈ͷॲཧΛఆٛͰ͖Δ wਐߦঢ়گΛ֬ೝͰ͖Δ wෳ਺ͷδϣϒΛ૊Έ߹Θ ͤΒΕΔ w4UFQ'VODUJPOTͱࣅͯ ͍Δ͕ɺͪ͜Β͸(MVF ͷΫϩʔϥ (MVF+PCͷ ΈࢦఆՄೳ

Slide 22

Slide 22 text

Amazon Athena ͱͷؔ܎

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

ެࣜτοϓʹ΋͏ॻ͍ͯΔ

Slide 26

Slide 26 text

"84(MVF5BCMFͱ"NB[PO"UIFOB 4σʔλΧλϩάʹରͯ͠1SFTUPΫΤϦΛൃߦ

Slide 27

Slide 27 text

"84(MVF5BCMFͱ"NB[PO"UIFOB 4σʔλΧλϩάʹରͯ͠1SFTUPΫΤϦΛൃߦ

Slide 28

Slide 28 text

ͳΜͱͳ͘Πϝʔδ͕༙͍ͨͰ͠ΐ͏͔ αʔϏεͷ࣮ମʹ࠶νϟϨϯδ "84(MVFɿ(MVF5BCMF (MVF+PC (MVF8PSLqPXʹ෼ղ Ͱ͖ɺͦΕͧΕࣗ༝ʹ૊Έ߹ΘͤΒΕΔ&5-αʔϏε "NB[PO"UIFOBɿ4Λσʔλιʔεͱ͢Δ(MVF5BCMFʹ ରͯ͠1SFTUPΫΤϦΛൃߦ͠ɺ݁ՌΛ4ʹग़ྗ͢ΔαʔϏ ε

Slide 29

Slide 29 text

$POUFOUT "84(MVFͱ"NB[PO"UIFOBͷؔ܎ αϯϓϧϢʔεέʔεɿΠϕϯτ؅ཧαʔϏε "84$%,ʹΑΔ&5-δϣϒ૊Έཱͯ

Slide 30

Slide 30 text

ϑΝΠϧΛѻ͍ͦ͏ͳγνϡΤʔγϣϯ ՍۭͷΠϕϯτ؅ཧαʔϏε (PPHMF'PSNͳͲͰηογϣϯΛެื͢Δʢ*%ϝΞυʣ ಉ࣌ʹɺผͷ࢓૊ΈͰࢀՃνέοτΛൃߦ͢Δʢ*%ϝΞυʣ ͜ΕΒ͸$47ग़ྗՄೳ νέοτ*%Λࢦఆ͢Δͱɺηογϣϯ৘ใΛऔಘͰ͖ΔΑ͏ͳ "1*Λ࡞Γ͍ͨ

Slide 31

Slide 31 text

ϑΝΠϧΛѻ͍ͦ͏ͳγνϡΤʔγϣϯ ՍۭͷΠϕϯτ؅ཧαʔϏε (PPHMF'PSNͳͲͰηογϣϯΛެื͢Δʢ*%ϝΞυʣ ಉ࣌ʹɺผͷ࢓૊ΈͰࢀՃνέοτΛൃߦ͢Δʢ*%ϝΞυʣ ͜ΕΒ͸$47ग़ྗՄೳ νέοτ*%Λࢦఆ͢Δͱɺηογϣϯ৘ใΛऔಘͰ͖ΔΑ͏ͳ "1*Λ࡞Γ͍ͨ

Slide 32

Slide 32 text

ͳͥʁ ෳ਺ͷαʔϏεͷ৘ใ͔ΒͻͱͭͷαʔϏεͷόοΫΤϯυ ʹݟ͔͚ͤΔ࢓૊Έ͸ࠓޙ૿͑ͦ͏ ศརͳαʔϏε͕ͨ͘͞Μ ͲΕ͔ҰͭͷαʔϏεΛ࢖͍ͬͯΕ͹ਓੜ͕׬݁͢Δͱ͍͏͜ͱ͸ গͳ͘ͳ͖ͬͯͦ͏ େ఍ͷαʔϏεͰ͸ɺσʔλΛ$47ͱͯ͠ग़ྗՄೳ 8FCIPPL &WFOU#SJEHFʜ͍Ζ͍Ζͳ࿈ܞํ๏͕͋Δ͕ɺϑΝΠϧ Λೖྗͱͯ͠&5-͢Δधཁ΋ɺࠓޙ૿͑ͦ͏

Slide 33

Slide 33 text

ηογϣϯ৘ใͷެืͱ$47ग़ྗ .csv

Slide 34

Slide 34 text

ࢀՃऀνέοτͷ$47ग़ྗ λΠϜελϯϓ ໊͓લ ϝʔϧΞυϨε ॴଐ૊৫ ࢀՃ೔ ޕޙ(.5 ࿨ా༞հ XBEBZVTVLF!DMBTTNFUIPEKQ Ϋϥεϝιουגࣜձࣾ ೔ ޕޙ(.5 ৽Ҫ੣Ұ XBEBZVTVLF!DMBTTNFUIPEKQ Ϋϥεϝιουגࣜձࣾ ೔ ࢀՃ೔Λ͓બͼ͍ͩ͘͞ɻ νέοτ*% Ϣʔβʔ*% ೔໨೔໨೔໨ BFBCFCBEFCEFG GEEBBDBFCBC ೔໨೔໨೔໨ FGFBBCFBBCF ECED

Slide 35

Slide 35 text

νέοτ*%͔Βηογϣϯ৘ใΛҾ͘ (&5IUUQTBQJNFFUVQQFSDPNTFTTJPO FWFOU*E UJDLFU*EBFBCFCBEFCEFG { "name": "࿨ా ༞հ", "title": "αʔόʔϨε x ϑΝΠϧɿAmazon Athena ͱ Step Functions Λ૊Έ߹ΘͤͯϑΝΠϧΛ݁߹͢Δ", "description": "αʔόʔϨεͰϑΝΠϧΛѻ͏γʔϯ͕૿͖͑ͯ·ͨ͠ɻAWSͰอଘ͓ͯ͘͠৔ॴͱͯ͠S3͸༗ྗͰ͕͢ɺΞ ϓϦέʔγϣϯͰར༻Ͱ͖Δܗࣜʹม׵͢Δํ๏ͱͯ͠ɺଟ͘ͷબ୒ࢶ͕͋Γ·͢ɻ\n͜ͷηογϣϯͰ͸ɺʮผʑͷαʔϏε Ͱൃߦ͞ΕͨCSVϑΝΠϧΛJOIN͢Δʯͱ͍͏Ϣʔεέʔεʹର͠ɺAmazonAthenaΛ࢖ͬͨΞϓϩʔνΛ঺հ͠·͢ɻ͞Β ʹɺҰ࿈ͷॲཧΛStepFunctionsʹ·ͱΊΔ͜ͱͰɺδϣϒͱͯ͠ఆٛ͢Δํ๏Λࣔ͠·͢ɻ" }

Slide 36

Slide 36 text

ཁ݅Λ֬ೝ ࠷ऴతʹ͸νέοτ*%͔Βɺηογϣϯσʔλ͕Ҿͬ ுͬͯ͜ΒΕΕ͹ྑ͍ ืूظؒऴྃޙ͸$47ϑΝΠϧͷมߋ͸΄΅ͳ͍ͷͰɺϦΞϧ λΠϜʹมߋΛ൓ө͢Δඞཁ͸ͳ͍ ͨͩ͠؅ཧը໘ͳͲ͔Β$47ϑΝΠϧΛΞοϓϩʔυ͢Δ͜ͱ ͰมߋΛ൓өͰ͖ΔΑ͏ʹͯ͠΄͍͠ɻ͍͍ͩͨ෼͘Β͍Ͱ Πϕϯτ͝ͱʹืू͢ΔͷͰɺΠϕϯτऴྃޙ͸ΞʔΧΠϒՄೳ

Slide 37

Slide 37 text

Ͳ͏࣮ݱ͢Δ͔ 3%#Ͱ͋Ε͹ͦΕͧΕͷςʔϒϧʹσʔλΛೖΕͯϦ ΫΤετ࣌ʹ+0*/͢Ε͹ྑ͍͕ʜ "84αʔόʔϨεͩͱҎԼύλʔϯ͍ͣΕ͔

Slide 38

Slide 38 text

Ͳ͏࣮ݱ͢Δ͔ 3%#Ͱ͋Ε͹ͦΕͧΕͷςʔϒϧʹσʔλΛೖΕͯϦ ΫΤετ࣌ʹ+0*/͢Ε͹ྑ͍͕ʜ "84αʔόʔϨεͩͱҎԼύλʔϯ͍ͣΕ͔ ͜ΕΛ࢖͏ ɾϦΞϧλΠϜੑ͸ཁٻ͞Εͳ͍ ɾݹ͍ΠϕϯτͷΦϒδΣΫτ͸(MBDJFSߦ͖ ɾಡΈग़࣌͠ͷϩδοΫ͸࠷খݶʹ͍ͨ͠

Slide 39

Slide 39 text

ΰʔϧɿJOINࡁΈͷS3 ΦϒδΣΫτΛ࡞Δ S3 to S3 ͷϑΝΠϧ ETL

Slide 40

Slide 40 text

"84(MVFʹΑΔ&5-Λࢥ͍ग़͢

Slide 41

Slide 41 text

"84(MVFʹΑΔ&5-Λࢥ͍ग़͢

Slide 42

Slide 42 text

"84(MVFʹΑΔ&5-Λࢥ͍ग़͢ ɾσʔλΧλϩάΛ"NB[PO"UIFOBͰΫΤϦ͢Δ ɾJOTFSUJOUPͰ+0*/͢Δ ɾҰ࿈ͷྲྀΕΛ4UFQ'VODUJPOTͰ૊Ή ʜͱ͍͏ϧʔτ΋͋Δ

Slide 43

Slide 43 text

ͳͥผϧʔτΛݕ౼͢Δͷ͔ʁ ͜ͷن໛ͩͱ(MVF+PCͷىಈ࣌ؒͱ՝ֹ͕ۚωοΫ "84(MVF+PC 1ZUIPO4IFMM "84(MVF+PC 4QBSL ʢ$MPVE'PSNBUJPOΑΓʣ ϝϞϦ (#PS(# (# ࣮ߦ࣌ؒ σϑΥϧτ࣌ؒ σϑΥϧτ࣌ؒ ಉ࣮࣌ߦ਺ ΞΧ΢ϯτ಺Ͱ ΞΧ΢ϯτ಺Ͱ ՝ֹۚ ੑೳͱ࣮ߦ࣌ؒϕʔεͷैྔ՝ۚ ͨͩ͠গͳ͘ͱ΋෼ʹ੾Γ্͛ ੑೳͱ࣮ߦ࣌ؒϕʔεͷैྔ՝ۚ ͨͩ͠গͳ͘ͱ΋෼ʹ੾Γ্͛ ىಈ࣌ؒ ਺ेඵ਺෼ ਺ेඵ਺෼ ݴޠ 1ZUIPO 1ZUIPOPS4DBMB AWS GlueΛ࢖ͬͨ Serverless ETL ͷ࣮૷ύλʔϯ - Speaker Deck https://speakerdeck.com/seiichi1101/aws-gluewoshi-tuta-serverless-etl-falseshi-zhuang-patan-ac5553a5-f8e9-45fb-b093-4bcf0b113a48?slide=41

Slide 44

Slide 44 text

ͳͥผϧʔτΛݕ౼͢Δͷ͔ʁ ͜ͷن໛ͩͱ(MVF+PCͷىಈ࣌ؒͱ՝ֹ͕ۚωοΫ "84(MVF+PC 1ZUIPO4IFMM "84(MVF+PC 4QBSL ʢ$MPVE'PSNBUJPOΑΓʣ ϝϞϦ (#PS(# (# ࣮ߦ࣌ؒ σϑΥϧτ࣌ؒ σϑΥϧτ࣌ؒ ಉ࣮࣌ߦ਺ ΞΧ΢ϯτ಺Ͱ ΞΧ΢ϯτ಺Ͱ ՝ֹۚ ੑೳͱ࣮ߦ࣌ؒϕʔεͷैྔ՝ۚ ͨͩ͠গͳ͘ͱ΋෼ʹ੾Γ্͛ ੑೳͱ࣮ߦ࣌ؒϕʔεͷैྔ՝ۚ ͨͩ͠গͳ͘ͱ΋෼ʹ੾Γ্͛ ىಈ࣌ؒ ਺ेඵ਺෼ ਺ेඵ਺෼ ݴޠ 1ZUIPO 1ZUIPOPS4DBMB AWS GlueΛ࢖ͬͨ Serverless ETL ͷ࣮૷ύλʔϯ - Speaker Deck https://speakerdeck.com/seiichi1101/aws-gluewoshi-tuta-serverless-etl-falseshi-zhuang-patan-ac5553a5-f8e9-45fb-b093-4bcf0b113a48?slide=41

Slide 45

Slide 45 text

ͳͥผϧʔτΛݕ౼͢Δͷ͔ʁ (MVF5BCMFͷΫΤϦ͸"UIFOBʹ೚࣮ͤͯߦ؀ڥΛ -BNCEB'VODUJPOʹ͢Δͱ͍͏ख "84(MVF+PC 1ZUIPO4IFMM "84(MVF+PC 4QBSL ʢ$MPVE'PSNBUJPOΑΓʣ "84-BNCEB ϝϞϦ (#PS(# (# .# ࣮ߦ࣌ؒ σϑΥϧτ࣌ؒ σϑΥϧτ࣌ؒ ࠷େ෼ ಉ࣮࣌ߦ਺ ΞΧ΢ϯτ಺Ͱ ΞΧ΢ϯτ಺Ͱ Ϧʔδϣϯຖʹ ՝ֹۚ ෼ʹ੾Γ্͛ ෼ʹ੾Γ্͛ ϛϦඵʹ੾Γ্͛ ىಈ࣌ؒ ਺ेඵ਺෼ ਺ेඵ਺෼ ਺ϛϦඵ਺ඵ ݴޠ 1ZUIPO 1ZUIPOPS4DBMB αϙʔτ͍ͯ͠ΔϥϯλΠϜ ΧελϜϥϯλΠϜ AWS GlueΛ࢖ͬͨ Serverless ETL ͷ࣮૷ύλʔϯ - Speaker Deck https://speakerdeck.com/seiichi1101/aws-gluewoshi-tuta-serverless-etl-falseshi-zhuang-patan-ac5553a5-f8e9-45fb-b093-4bcf0b113a48?slide=41

Slide 46

Slide 46 text

ΞʔΩςΫνϟ֓؍ ※࣮ࡍʹ͸ S3 ΦϒδΣΫτͷURLͳͲΛอଘ͓ͯ͘͠ DynamoDB ͕ඞཁ

Slide 47

Slide 47 text

$POUFOUT "84(MVFͱ"NB[PO"UIFOBͷؔ܎ αϯϓϧϢʔεέʔεɿΠϕϯτ؅ཧαʔϏε "84$%,ʹΑΔ&5-δϣϒ૊Έཱͯ

Slide 48

Slide 48 text

"84$%,Λ࢖ͬͯ૊ΈཱͯΔྲྀΕͷҰ෦Λ঺հ $%,ʹ͍ͭͯ͸ΫϥϝιͷΠϕϯτͰ΋ൃදͯ͠·͢ ʲొஃࢿྉʳAWS CDK Λ࢖ͬͨ αʔόʔϨεΞϓϦέʔγϣϯͷσϓϩΠํ๏ͱ࣮૷ྫΛ঺հ͠·ͨ͠ – Developes.IO 2020 CONNECT #devio2020 | Developers.IO https://dev.classmethod.jp/articles/production-deploy-serverless-aws-cdk/

Slide 49

Slide 49 text

"84$%,*OGSBTUSVDUVSFBT$PEF 5ZQF4DSJQUͳͲͷίʔυ͔Β$MPVE'PSNBUJPOς ϯϓϨʔτΛग़ྗɺσϓϩΠՄೳ What is the AWS CDK? - AWS Cloud Development Kit (AWS CDK) https://docs.aws.amazon.com/cdk/latest/guide/home.html

Slide 50

Slide 50 text

$%,αϯϓϧɿ4-BNCEB'VODUJPO

Slide 51

Slide 51 text

"UIFOBͬͯ$47Ͱ΋ΫΤϦͰ͖ΔΜ͡Όʁ ࠓճͷηογϣϯ֓ཁͷΑ͏ͳ$47σʔλ಺ʹվߦΛؚΉ σʔλΛ͏·͘ύʔεͰ͖ͳ͔ͬͨ ଞʹ΋೔ຊޠͷΧϥϜ໊ͩͱύʔςΟγϣϯ͕੾Εͳ͍ࣄ৅Λ֬ೝ $47ͷจࣈίʔυ͕"UIFOBͰѻ͑ͳ͍4+*4ͩͬͨΓ͢Δ $47ੜσʔλΛͦͷ··+0*/͢ΔͷͰ͸ͳ͘ɺҰ౓+40/-JOFT ʹม׵͢ΔͳͲͨ͠΄͏͕Αͦ͞͏ ͜͏͍ͬͨࡉ͔͍ॲཧ͕Ͱ͖Δͷ΋-BNCEB'VODUJPOͷڧΈͷͻ ͱͭ

Slide 52

Slide 52 text

$%,αϯϓϧɿ4-BNCEB'VODUJPO export async function sessionConverter( stack: cdk.Construct, global: GlobalProps, ): Promise { const rawCsvBucket = new s3.Bucket(stack, 'Raw', { bucketName: global.getBucketName('raw', cdk.Stack.of(stack).account), cors: [ { allowedHeaders: ['*'], allowedMethods: [HttpMethods.PUT], allowedOrigins: ['*'], }, ], }); const jsonLinesBucket = new s3.Bucket(stack, 'JsonLines', { bucketName: global.getBucketName( 'json-lines', cdk.Stack.of(stack).account, ), }); const convertSessionCsvToJsonLinesFn = new lambda.Function( stack, 'ConvertSessionCsvToJsonLines', { functionName: global.getFunctionName('ConvertSessionCsvToJsonLines'), code: lambda.Code.fromAsset(NODE_LAMBDA_SRC_DIR), handler: 'lambda/handlers/sf/csv-converter/convert-session-csv-to-json-handler.handler', runtime: lambda.Runtime.NODEJS_12_X, layers: [nodeModulesLayer], timeout: Duration.minutes(15), memorySize: 256, environment: { REGION: global.pm.region, CSV_BUCKET_NAME: rawCsvBucket.bucketName, JSON_LINES_BUCKET_NAME: jsonLinesBucket.bucketName, }, }, ); rawCsvBucket.grantRead(convertSessionCsvToJsonLinesFn); jsonLinesBucket.grantReadWrite(convertSessionCsvToJsonLinesFn);

Slide 53

Slide 53 text

$%,αϯϓϧɿ4-BNCEB'VODUJPO export async function sessionConverter( stack: cdk.Construct, global: GlobalProps, ): Promise { const rawCsvBucket = new s3.Bucket(stack, 'Raw', { bucketName: global.getBucketName('raw', cdk.Stack.of(stack).account), cors: [ { allowedHeaders: ['*'], allowedMethods: [HttpMethods.PUT], allowedOrigins: ['*'], }, ], }); const jsonLinesBucket = new s3.Bucket(stack, 'JsonLines', { bucketName: global.getBucketName( 'json-lines', cdk.Stack.of(stack).account, ), }); const convertSessionCsvToJsonLinesFn = new lambda.Function( stack, 'ConvertSessionCsvToJsonLines', { functionName: global.getFunctionName('ConvertSessionCsvToJsonLines'), code: lambda.Code.fromAsset(NODE_LAMBDA_SRC_DIR), handler: 'lambda/handlers/sf/csv-converter/convert-session-csv-to-json-handler.handler', runtime: lambda.Runtime.NODEJS_12_X, layers: [nodeModulesLayer], timeout: Duration.minutes(15), memorySize: 256, environment: { REGION: global.pm.region, CSV_BUCKET_NAME: rawCsvBucket.bucketName, JSON_LINES_BUCKET_NAME: jsonLinesBucket.bucketName, }, }, ); rawCsvBucket.grantRead(convertSessionCsvToJsonLinesFn); jsonLinesBucket.grantReadWrite(convertSessionCsvToJsonLinesFn); w $47ੜσʔλΛ౤ೖ͢Δ4 όέοτ w QSFTJHOFEVSM͔ΒΞοϓ ϩʔυͯ͠΋Β͏ͷͰ$034 ઃఆ

Slide 54

Slide 54 text

$%,αϯϓϧɿ4-BNCEB'VODUJPO export async function sessionConverter( stack: cdk.Construct, global: GlobalProps, ): Promise { const rawCsvBucket = new s3.Bucket(stack, 'Raw', { bucketName: global.getBucketName('raw', cdk.Stack.of(stack).account), cors: [ { allowedHeaders: ['*'], allowedMethods: [HttpMethods.PUT], allowedOrigins: ['*'], }, ], }); const jsonLinesBucket = new s3.Bucket(stack, 'JsonLines', { bucketName: global.getBucketName( 'json-lines', cdk.Stack.of(stack).account, ), }); const convertSessionCsvToJsonLinesFn = new lambda.Function( stack, 'ConvertSessionCsvToJsonLines', { functionName: global.getFunctionName('ConvertSessionCsvToJsonLines'), code: lambda.Code.fromAsset(NODE_LAMBDA_SRC_DIR), handler: 'lambda/handlers/sf/csv-converter/convert-session-csv-to-json-handler.handler', runtime: lambda.Runtime.NODEJS_12_X, layers: [nodeModulesLayer], timeout: Duration.minutes(15), memorySize: 256, environment: { REGION: global.pm.region, CSV_BUCKET_NAME: rawCsvBucket.bucketName, JSON_LINES_BUCKET_NAME: jsonLinesBucket.bucketName, }, }, ); rawCsvBucket.grantRead(convertSessionCsvToJsonLinesFn); jsonLinesBucket.grantReadWrite(convertSessionCsvToJsonLinesFn); w +40/-JOFTΛอଘ͓ͯ͘͠ όέοτ

Slide 55

Slide 55 text

$%,αϯϓϧɿ4-BNCEB'VODUJPO export async function sessionConverter( stack: cdk.Construct, global: GlobalProps, ): Promise { const rawCsvBucket = new s3.Bucket(stack, 'Raw', { bucketName: global.getBucketName('raw', cdk.Stack.of(stack).account), cors: [ { allowedHeaders: ['*'], allowedMethods: [HttpMethods.PUT], allowedOrigins: ['*'], }, ], }); const jsonLinesBucket = new s3.Bucket(stack, 'JsonLines', { bucketName: global.getBucketName( 'json-lines', cdk.Stack.of(stack).account, ), }); const convertSessionCsvToJsonLinesFn = new lambda.Function( stack, 'ConvertSessionCsvToJsonLines', { functionName: global.getFunctionName('ConvertSessionCsvToJsonLines'), code: lambda.Code.fromAsset(NODE_LAMBDA_SRC_DIR), handler: 'lambda/handlers/sf/csv-converter/convert-session-csv-to-json-handler.handler', runtime: lambda.Runtime.NODEJS_12_X, layers: [nodeModulesLayer], timeout: Duration.minutes(15), memorySize: 256, environment: { REGION: global.pm.region, CSV_BUCKET_NAME: rawCsvBucket.bucketName, JSON_LINES_BUCKET_NAME: jsonLinesBucket.bucketName, }, }, ); rawCsvBucket.grantRead(convertSessionCsvToJsonLinesFn); jsonLinesBucket.grantReadWrite(convertSessionCsvToJsonLinesFn); w ੜσʔλόέοτ͔Βϩʔυ ͠ɺม׵͠ɺ+40/-JOFTό έοτʹ౤ೖ͢Δ-BNCEB 'VODUJPO

Slide 56

Slide 56 text

$%,αϯϓϧɿ(MVF5BCMF"NB[PO"UIFOB

Slide 57

Slide 57 text

$%,αϯϓϧɿ(MVF5BCMF"NB[PO"UIFOB ɾ4σʔλιʔε(MVF5BCMF ɾ-BNCEB'VODUJPO"NB[PO"UIFOB

Slide 58

Slide 58 text

$%,αϯϓϧɿ4(MVF5BCMF export async function eventGlueTableResource( stack: cdk.Construct, global: GlobalProps, s3Resource: SlsGlueTableResourceInput, ): Promise { const slsVirtualDataBase = new glue.Database(stack, 'SlsVirtualDatabase', { databaseName: global.getGlueDatabaseName('slsVirtual_application'), }); const sessionGlueTable = new glue.Table(stack, 'SessionTable', { database: slsVirtualDataBase, tableName: 'session', columns: [ { name: 'sessionId', type: glue.Schema.STRING, }, { name: 'email', type: glue.Schema.STRING, }, { name: 'title', type: glue.Schema.STRING, }, { name: 'description', type: glue.Schema.STRING, }, { name: 'organizationName', type: glue.Schema.STRING, }, { name: 'name', type: glue.Schema.STRING, }, ], dataFormat: glue.DataFormat.JSON, bucket: s3Resource.jsonLinesBucket, s3Prefix: 'session/', partitionKeys: [ { name: 'eventId', type: glue.Schema.STRING, }, { name: 'versionId', type: glue.Schema.STRING, }, ], });

Slide 59

Slide 59 text

$%,αϯϓϧɿ4(MVF5BCMF export async function eventGlueTableResource( stack: cdk.Construct, global: GlobalProps, s3Resource: SlsGlueTableResourceInput, ): Promise { const slsVirtualDataBase = new glue.Database(stack, 'SlsVirtualDatabase', { databaseName: global.getGlueDatabaseName('slsVirtual_application'), }); const sessionGlueTable = new glue.Table(stack, 'SessionTable', { database: slsVirtualDataBase, tableName: 'session', columns: [ { name: 'sessionId', type: glue.Schema.STRING, }, { name: 'email', type: glue.Schema.STRING, }, { name: 'title', type: glue.Schema.STRING, }, { name: 'description', type: glue.Schema.STRING, }, { name: 'organizationName', type: glue.Schema.STRING, }, { name: 'name', type: glue.Schema.STRING, }, ], dataFormat: glue.DataFormat.JSON, bucket: s3Resource.jsonLinesBucket, s3Prefix: 'session/', partitionKeys: [ { name: 'eventId', type: glue.Schema.STRING, }, { name: 'versionId', type: glue.Schema.STRING, }, ], }); w (MVF%BUBCBTFͱ(MVF 5BCMFΛ࡞੒

Slide 60

Slide 60 text

$%,αϯϓϧɿ4(MVF5BCMF export async function eventGlueTableResource( stack: cdk.Construct, global: GlobalProps, s3Resource: SlsGlueTableResourceInput, ): Promise { const slsVirtualDataBase = new glue.Database(stack, 'SlsVirtualDatabase', { databaseName: global.getGlueDatabaseName('slsVirtual_application'), }); const sessionGlueTable = new glue.Table(stack, 'SessionTable', { database: slsVirtualDataBase, tableName: 'session', columns: [ { name: 'sessionId', type: glue.Schema.STRING, }, { name: 'email', type: glue.Schema.STRING, }, { name: 'title', type: glue.Schema.STRING, }, { name: 'description', type: glue.Schema.STRING, }, { name: 'organizationName', type: glue.Schema.STRING, }, { name: 'name', type: glue.Schema.STRING, }, ], dataFormat: glue.DataFormat.JSON, bucket: s3Resource.jsonLinesBucket, s3Prefix: 'session/', partitionKeys: [ { name: 'eventId', type: glue.Schema.STRING, }, { name: 'versionId', type: glue.Schema.STRING, }, ], }); w σʔλιʔεͱͳΔ4ό έοτΛࢦఆ

Slide 61

Slide 61 text

$%,αϯϓϧɿ4(MVF5BCMF const ticketGlueTable = new glue.Table(stack, 'TicketTable', { database: slsVirtualDataBase, tableName: 'ticket', columns: [ { name: 'email', type: glue.Schema.STRING, }, { name: 'ticketId', type: glue.Schema.STRING, }, ], dataFormat: glue.DataFormat.JSON, bucket: s3Resource.jsonLinesBucket, s3Prefix: 'ticket/', partitionKeys: [ { name: 'eventId', type: glue.Schema.STRING, }, { name: 'versionId', type: glue.Schema.STRING, }, ], }); w νέοτ༻ͷ(MVF5BCMF΋ ಉ͡Α͏ʹ࡞੒

Slide 62

Slide 62 text

$%,αϯϓϧɿ4(MVF5BCMF const entryTable = new glue.Table(stack, 'EntryTable', { database: slsVirtualDataBase, tableName: 'entry', columns: [ { name: 'sessionId', type: glue.Schema.STRING, }, { name: 'name', type: glue.Schema.STRING, }, { name: 'title', type: glue.Schema.STRING, }, { name: 'description', type: glue.Schema.STRING, }, { name: 'ticketId', type: glue.Schema.STRING, }, ], dataFormat: new glue.DataFormat({ inputFormat: glue.InputFormat.TEXT, outputFormat: glue.OutputFormat.HIVE_IGNORE_KEY_TEXT, serializationLibrary: glue.SerializationLibrary.HIVE_JSON, }), bucket: s3Resource.entryBucket, s3Prefix: 'entry/', partitionKeys: [ { name: 'eventId', type: glue.Schema.STRING, }, ], w ࠷ޙʹɺ"UIFOBͰ+0*/͠ ͨσʔλΛ౤ೖ͢Δ༻ͷ (MVF5BCMF΋༻ҙ͓ͯ͘͠

Slide 63

Slide 63 text

+0*/͢Δ-BNCEB'VODUJPOͷίʔυ͸͜Μͳײ͡ export async function handler( event: JoinRequest, context?: LambdaContext, ): Promise { const param: DocumentClient.GetItemInput = { TableName: EventDynamodbTableName, Key: { id: event.eventId, }, }; const res = await dynamo.get(param).promise(); const sessionVersionId = res.Item!.sessionJsonLinesInfo.versionId; const ticketVersionId = res.Item!.ticketJsonLinesInfo.versionId; try { const param = { ResultConfiguration: { OutputLocation: `s3://${AthenaOutputBucketName}/`, }, QueryExecutionContext: { Database: SlsVirtualGlueDataBaseName, }, QueryString: [ `insert into ${EntryGlueTableName}`, 'select session.sessionId,', 'ticket.ticketId,', 'session.name,', 'session.organizationName,', 'session.title,', 'session.description,', 'ticket.email', // lower case for partition key `from ${SessionGlueTableName} session inner join ${TicketGlueTableName} ticket on session.email = ticket.email`, `where session.versionId = '${sessionVersionId}' and ticket.versionId = '${ticketVersionId}'`, ].join(' '), }; console.log('insert into param', param); const execution = await athena.startQueryExecution(param).promise(); console.log(execution); return { executionId: execution.QueryExecutionId!, }; } catch (e) { console.log(e); throw e; } }

Slide 64

Slide 64 text

export async function handler( event: JoinRequest, context?: LambdaContext, ): Promise { const param: DocumentClient.GetItemInput = { TableName: EventDynamodbTableName, Key: { id: event.eventId, }, }; const res = await dynamo.get(param).promise(); const sessionVersionId = res.Item!.sessionJsonLinesInfo.versionId; const ticketVersionId = res.Item!.ticketJsonLinesInfo.versionId; try { const param = { ResultConfiguration: { OutputLocation: `s3://${AthenaOutputBucketName}/`, }, QueryExecutionContext: { Database: SlsVirtualGlueDataBaseName, }, QueryString: [ `insert into ${EntryGlueTableName}`, 'select session.sessionId,', 'ticket.ticketId,', 'session.name,', 'session.organizationName,', 'session.title,', 'session.description,', 'ticket.email', // lower case for partition key `from ${SessionGlueTableName} session inner join ${TicketGlueTableName} ticket on session.email = ticket.email`, `where session.versionId = '${sessionVersionId}' and ticket.versionId = '${ticketVersionId}'`, ].join(' '), }; console.log('insert into param', param); const execution = await athena.startQueryExecution(param).promise(); console.log(execution); return { executionId: execution.QueryExecutionId!, }; } catch (e) { console.log(e); throw e; } } -BNCEB'VODUJPOͷίʔυ͸͜Μͳײ͡ w ϝΞυͰ+0*/ w +0*/ޙͷσʔλΛ&OUSZ ςʔϒϧʹ౤ೖ

Slide 65

Slide 65 text

export async function handler( event: JoinRequest, context?: LambdaContext, ): Promise { const param: DocumentClient.GetItemInput = { TableName: EventDynamodbTableName, Key: { id: event.eventId, }, }; const res = await dynamo.get(param).promise(); const sessionVersionId = res.Item!.sessionJsonLinesInfo.versionId; const ticketVersionId = res.Item!.ticketJsonLinesInfo.versionId; try { const param = { ResultConfiguration: { OutputLocation: `s3://${AthenaOutputBucketName}/`, }, QueryExecutionContext: { Database: SlsVirtualGlueDataBaseName, }, QueryString: [ `insert into ${EntryGlueTableName}`, 'select session.sessionId,', 'ticket.ticketId,', 'session.name,', 'session.organizationName,', 'session.title,', 'session.description,', 'ticket.email', // lower case for partition key `from ${SessionGlueTableName} session inner join ${TicketGlueTableName} ticket on session.email = ticket.email`, `where session.versionId = '${sessionVersionId}' and ticket.versionId = '${ticketVersionId}'`, ].join(' '), }; console.log('insert into param', param); const execution = await athena.startQueryExecution(param).promise(); console.log(execution); return { executionId: execution.QueryExecutionId!, }; } catch (e) { console.log(e); throw e; } } -BNCEB'VODUJPOͷίʔυ͸͜Μͳײ͡ w "UIFOBΫΤϦΛ࣮ߦ

Slide 66

Slide 66 text

ίʔυαϯϓϧɿ44FMFDU

Slide 67

Slide 67 text

async function createS3SelectStream( request: EntryRequest, expression: string, ): Promise> { const source$ = new Subject(); const content = await s3 .selectObjectContent({ Bucket: request.bucket, Key: request.key, InputSerialization: { JSON: { Type: 'LINES' }, CompressionType: 'GZIP', }, OutputSerialization: { JSON: { RecordDelimiter: '\n' } }, Expression: expression, ExpressionType: 'SQL', }) .promise(); const contentStream: any = content.Payload!; const filterRecordsOnly = new stream.Transform({ objectMode: true, transform( chunk: any, encoding: string, done: stream.TransformCallback, ): void { if (chunk.Records) { this.push(chunk.Records.Payload); } else if (chunk.Stats) { console.log( `Processed ${chunk.Stats.Details.BytesProcessed} bytes`, ); } else if (chunk.End) { console.log('SelectObjectContent completed'); } done(); }, }); const readLine = readline.createInterface( contentStream.pipe(filterRecordsOnly), ); readLine .on('line', (line: any) => { // console.log('line', line); source$.next(JSON.parse(line) as T); }) .on('close', () => { console.log('end'); source$.complete(); 44FMFDUͷ-BNCEB'VODUJPOίʔυ

Slide 68

Slide 68 text

async function createS3SelectStream( request: EntryRequest, expression: string, ): Promise> { const source$ = new Subject(); const content = await s3 .selectObjectContent({ Bucket: request.bucket, Key: request.key, InputSerialization: { JSON: { Type: 'LINES' }, CompressionType: 'GZIP', }, OutputSerialization: { JSON: { RecordDelimiter: '\n' } }, Expression: expression, ExpressionType: 'SQL', }) .promise(); const contentStream: any = content.Payload!; const filterRecordsOnly = new stream.Transform({ objectMode: true, transform( chunk: any, encoding: string, done: stream.TransformCallback, ): void { if (chunk.Records) { this.push(chunk.Records.Payload); } else if (chunk.Stats) { console.log( `Processed ${chunk.Stats.Details.BytesProcessed} bytes`, ); } else if (chunk.End) { console.log('SelectObjectContent completed'); } done(); }, }); const readLine = readline.createInterface( contentStream.pipe(filterRecordsOnly), ); readLine .on('line', (line: any) => { // console.log('line', line); source$.next(JSON.parse(line) as T); }) .on('close', () => { console.log('end'); source$.complete(); 44FMFDUͷ-BNCEB'VODUJPOίʔυ w TFMFDUGSPNTPCKFDUPCKXIFSFPCKUJDLFUJE b`

Slide 69

Slide 69 text

async function createS3SelectStream( request: EntryRequest, expression: string, ): Promise> { const source$ = new Subject(); const content = await s3 .selectObjectContent({ Bucket: request.bucket, Key: request.key, InputSerialization: { JSON: { Type: 'LINES' }, CompressionType: 'GZIP', }, OutputSerialization: { JSON: { RecordDelimiter: '\n' } }, Expression: expression, ExpressionType: 'SQL', }) .promise(); const contentStream: any = content.Payload!; const filterRecordsOnly = new stream.Transform({ objectMode: true, transform( chunk: any, encoding: string, done: stream.TransformCallback, ): void { if (chunk.Records) { this.push(chunk.Records.Payload); } else if (chunk.Stats) { console.log( `Processed ${chunk.Stats.Details.BytesProcessed} bytes`, ); } else if (chunk.End) { console.log('SelectObjectContent completed'); } done(); }, }); const readLine = readline.createInterface( contentStream.pipe(filterRecordsOnly), ); readLine .on('line', (line: any) => { // console.log('line', line); source$.next(JSON.parse(line) as T); }) .on('close', () => { console.log('end'); source$.complete(); 44FMFDUͷ-BNCEB'VODUJPOίʔυ w 44FMFDU͸݁ՌΛಠࣗͷ/PEFKTετϦʔϜͰฦ͢ͷͰ ͕Μ͹ͬͯऔಘ͍ͯ͠Δ༷ࢠ

Slide 70

Slide 70 text

ίʔυαϯϓϧͷ·ͱΊ -BNCEB'VODUJPOΛ࢖ͬͨ&5-Ͱ"84$%,Λ࢖ ͍૊ΈཱͯΔαϯϓϧΛࣔͨ͠ "84ߏ੒ਤ͔Β$%,ίʔυʹམͱͨ͢Ίʹ͸ɺ֤αʔϏεͷ ࣮ମΛ஌Δ͜ͱ͕ۙಓ ཁ݅ʹΑͬͯϚον͢ΔαʔϏε͸ҟͳΔɻ߹க͢ΔαʔϏε ͕ͳʹ͔ɺࣾ಺ͰɺαʔόʔϨείϛϡχςΟͰٞ࿦͠·͠ΐ ͏

Slide 71

Slide 71 text

·ͱΊ ϑΝΠϧΛ+0*/͢ΔϢʔεέʔεʹରͯ͠ɺαʔόʔϨ εͰϥΠτͳ&5-Λߦͬͨ ෳ਺ͷαʔϏεͷσʔλΛ૊Έ߹Θͤͯɺͻͱͭͷ݁ՌΛಘ Δͱ͍͏γνϡΤʔγϣϯʹૺ۰ͨ͠ͱ͖ʹɺࠓճͷΑ͏ʹ ͍ͬͨΜ4ϑΝΠϧ΁ग़ྗɺͦͷޙ"UIFOBͰ+0*/ͯ͠4 4FMFDUͰΫΤϦ͢Δͱ͍͏ྲྀΕ͕࢖͑Δ͜ͱΛࢥ͍ग़ͯ͠΋Β ͑Ε͹޾͍Ͱ͢

Slide 72

Slide 72 text

ࢀߟॻ੶ɾར༻ͨ͠πʔϧ w DODGXHTFSWFSMFTT$/$'4FSWFSMFTT8(IUUQTHJUIVCDPNDODG XHTFSWFSMFTT w "84$%,IUUQTHJUIVCDPNBXTBXTDEL w DNXBEBZVTVLFUTBTDELIUUQTHJUIVCDPNDNXBEBZVTVLF UTBTDEL