Slide 1

Slide 1 text

Supporting Every Step Mothers Take
NLP Behind a Community Service for Mothers
Connehito Inc., Takanobu Nozawa (野澤哲照)
ML@Loft NLP

Slide 2

Slide 2 text

This is a bit sudden, but…

Slide 3

Slide 3 text

Doesn't Japanese natural language processing involve a lot of "things to do"?

Slide 4

Slide 4 text

Doesn't Japanese natural language processing involve a lot of "things to do"?

Slide 5

Slide 5 text

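■ What's "things to do"?
• Morphological analysis
• Preprocessing (part-of-speech filtering, normalization, stop word removal)
• Dictionary management
• Stop word management
etc.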

Slide 6

Slide 6 text

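■ What's "things to do"?
• Morphological analysis
• Preprocessing (part-of-speech filtering, normalization, stop word removal)
• Dictionary management
• Stop word management
etc.

What can happen:
• Is the MeCab on my local machine the same as the one in production?
• Which notebook built the model that is running in production?
• I found a notebook that looks right, but I kept tweaking it while running it, so is the preprocessing really fine as it stands?
etc.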

Slide 7

Slide 7 text

■ What's "things to do"?
• Morphological analysis
• Preprocessing (part-of-speech filtering, normalization, stop word removal)
• Dictionary management
• Stop word management
etc.

＞ Overwhelmingly low psychological safety ＜

Slide 8

Slide 8 text

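■ What's "things to do"?
• Morphological analysis
• Preprocessing (part-of-speech filtering, normalization, stop word removal)
• Dictionary management
• Stop word management
etc.

What I'll talk about today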

Slide 9

Slide 9 text

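■ What's "things to do"?
• Morphological analysis
• Preprocessing (part-of-speech filtering, normalization, stop word removal)
• Dictionary management
• Stop word management
etc.

I'll talk about how we were able to build a smooth ML workflow by making use of AWS services.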

Slide 10

Slide 10 text

On to the main topic
(* The slides will be published later!)

Slide 11

Slide 11 text

■ Agenda
• Self-introduction
• About Mamari
• Challenges at Mamari and an NLP use case
• Architecture
• Operating the model

Slide 12

Slide 12 text

Self-introduction

Slide 13

Slide 13 text

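■ Self-introduction
Name: Takanobu Nozawa (野澤哲照)
Affiliation: Connehito Inc.
Handle: @takapy
• Joined Connehito as an ML engineer
• Mainly works on machine learning, while also studying infrastructure
• Does Kaggle, writes a blog (https://www.takapy.work), and plays baseball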

Slide 14

Slide 14 text

About Mamari

Slide 15

Slide 15 text

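■ About Mamari
Built around "worries" and "empathy", Mamari stays close to mothers and offers its service from multiple angles: app, web, and social media.

Mamari (app / web): growing its user base around a Q&A community where mothers consult each other about their worries.
Social media (Instagram, LINE, Facebook): users are steadily connecting with each other through "Mamari".
Articles: delivering articles that are useful in mothers' daily lives, across a wide range of genres.

• Number of articles: more than 6,000
• Cumulative fans: about 850,000 ※
• Monthly views: about 150 million ※
• Monthly users: about 6.5 million ※
• Selected as the No. 1 app for mothers ※: among "apps currently in use" chosen by […] mothers, ranked first in […] categories (would recommend to other mothers, awareness, usage rate, convenience, favorability)

※ "Views" and "users" are combined totals for media and app (average over […]).
※ "No. 1 app for mothers" is according to an Intage survey ([…]). Survey population: a sample (n = […]) of women who are pregnant or have a child aged up to 2 years 0 months.
※ Combined total of Instagram followers, Facebook likes, and LINE friends (as of […]).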

Slide 16

Slide 16 text

■ About Mamari
[Chart: monthly series from 2014/4 to 2018/8, vertical axis 0 to 1,800,000, annotated with "TV commercial aired" and total app downloads]
• Monthly posts: about 1.5 million
• Active users who launch the app at least […] days a week ※: about 50 […]
• Of mothers who gave birth in […], "1 in […]" uses Mamari ※
Mamari is growing into a brand of the largest scale in Japan.

※ Calculated from the number of users who set an expected delivery date in Mamari and the number of births in the Ministry of Health, Labour and Welfare's "Vital Statistics".
※ Users who launch the app at least […] times a week.

Slide 17

Slide 17 text

Challenges at Mamari and an NLP use case

Slide 18

Slide 18 text

■ Challenges at Mamari and an NLP use case
Challenge:

Slide 19

Slide 19 text

■ Challenges at Mamari and an NLP use case
Challenge:
[Diagram: questioner, answerer]

Slide 20

Slide 20 text

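■ Challenges at Mamari and an NLP use case
Challenge: posting of inappropriate content (e.g. "I'll teach you an easy way to make money")
[Diagram: questioner, answerer]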

Slide 21

Slide 21 text

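■ Challenges at Mamari and an NLP use case
Challenge: posting of inappropriate content (e.g. "I'll teach you an easy way to make money")
[Diagram: questioner, answerer, moderation filter]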

Slide 22

Slide 22 text

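■ Challenges at Mamari and an NLP use case
Challenge: posting of inappropriate content (e.g. "I'll teach you an easy way to make money")
[Diagram: questioner, answerer, moderation filter]
Use machine learning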

Slide 23

Slide 23 text

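■ Challenges at Mamari and an NLP use case
Challenge: posting of inappropriate content (e.g. "I'll teach you an easy way to make money")
[Diagram: questioner, answerer, moderation filter]
Use machine learning: detect violating posts in real time with NLP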

Slide 24

Slide 24 text

Architecture

Slide 25

Slide 25 text

■ Architecture: overview
[Diagram labels: Fargate, S3, Training, RDS, Glue, Question Data, Prediction, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]

Slide 26

Slide 26 text

■ Architecture: overview
[Diagram labels: Fargate, S3, Training, RDS, Glue, Question Data, Prediction, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]
Highlighted: training

Slide 27

Slide 27 text

■ Architecture: overview
[Diagram labels: Fargate, S3, Training, RDS, Glue, Question Data, Prediction, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]
Highlighted: inference

Slide 28

Slide 28 text

■ Architecture: overview
[Diagram labels: Fargate, S3, Training, RDS, Glue, Question Data, Prediction, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]
In detail

Slide 29

Slide 29 text

■ Architecture: overview
[Diagram labels: Fargate, S3, Training, RDS, Glue, Question Data, Prediction, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]
About pre-training

Slide 30

Slide 30 text

■ Architecture: pre-training
• Build a language model with gensim (word2vec) and save it to S3
• The corpus is data from within Mamari
[Diagram labels: S3, w2v model]

Slide 31

Slide 31 text

■ Architecture: overview
[Diagram labels: Fargate, S3, Training, RDS, Glue, Question Data, Prediction, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]

Slide 32

Slide 32 text

■ Architecture: overview
[Diagram labels: Fargate, S3, Training, RDS, Glue, Question Data, Prediction, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]
About ETL and preprocessing

Slide 33

Slide 33 text

■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task]

Slide 34

Slide 34 text

■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; file: train.tsv]
• Use Glue to extract the training data from RDS

Slide 35

Slide 35 text

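■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model]
• In a batch job on Fargate, use the language model and the training data to preprocess and vectorize the text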

Slide 36

Slide 36 text

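■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model]
Preprocessing steps:
• Tokenization (MeCab)
• Part-of-speech filtering [nouns, verbs, adjectives]
• Normalization
• Stop word computation and removal
• Dictionary creation
• Embedding Matrix creation
• Converting text data into sequences
• Splitting the data into train / test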

Slide 37

Slide 37 text

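■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model]
Preprocessing steps:
• Tokenization (MeCab)
• Part-of-speech filtering [nouns, verbs, adjectives]
• Normalization
• Stop word computation and removal
• Dictionary creation
• Embedding Matrix creation
• Converting text data into sequences
• Splitting the data into train / test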

Slide 38

Slide 38 text

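■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model]
Preprocessing steps:
• Tokenization (MeCab)
• Part-of-speech filtering [nouns, verbs, adjectives]
• Normalization
• Stop word computation and removal
• Dictionary creation
• Embedding Matrix creation
• Converting text data into sequences
• Splitting the data into train / test

Example input: "今日はAWS Loftで自然言語処理の登壇をします！" ("Today I'm giving a talk on natural language processing at AWS Loft!")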

Slide 39

Slide 39 text

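■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model]
Preprocessing steps:
• Tokenization (MeCab)
• Part-of-speech filtering [nouns, verbs, adjectives]
• Normalization
• Stop word computation and removal
• Dictionary creation
• Embedding Matrix creation
• Converting text data into sequences
• Splitting the data into train / test

Example input: "今日はAWS Loftで自然言語処理の登壇をします！" ("Today I'm giving a talk on natural language processing at AWS Loft!")
→ [今日, AWS, Loft, 自然言語処理, 登壇, する]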

Slide 40

Slide 40 text

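■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model]
Preprocessing steps:
• Tokenization (MeCab)
• Part-of-speech filtering [nouns, verbs, adjectives]
• Normalization
• Stop word computation and removal
• Dictionary creation
• Embedding Matrix creation
• Converting text data into sequences
• Splitting the data into train / test

Example input: "今日はAWS Loftで自然言語処理の登壇をします！" ("Today I'm giving a talk on natural language processing at AWS Loft!")
→ [今日, AWS, Loft, 自然言語処理, 登壇, する]
→ [今日, aws, loft, 自然言語処理, 登壇, する]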
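
The first three steps (tokenization, part-of-speech filtering, normalization) might look roughly like the sketch below. It is an illustration, not the speaker's code: the `(base form, part of speech)` pairs are hard-coded stand-ins for what a MeCab-based tokenizer could return, NFKC folding is an assumed normalization on top of the lowercasing shown on the slide, and `KEEP_POS`, `normalize`, and `filter_and_normalize` are hypothetical names.

```python
import unicodedata

# Hypothetical tokenizer output for the slide's example sentence:
# (base form, part of speech) pairs such as MeCab could produce.
TOKENS = [
    ("今日", "名詞"), ("は", "助詞"), ("AWS", "名詞"), ("Loft", "名詞"),
    ("で", "助詞"), ("自然言語処理", "名詞"), ("の", "助詞"), ("登壇", "名詞"),
    ("を", "助詞"), ("する", "動詞"), ("ます", "助動詞"), ("！", "記号"),
]

# POS filter from the slide: nouns, verbs, adjectives.
KEEP_POS = {"名詞", "動詞", "形容詞"}


def normalize(word):
    """Fold full-width/half-width variants (NFKC, an assumption) and lowercase."""
    return unicodedata.normalize("NFKC", word).lower()


def filter_and_normalize(tokens):
    """Keep only the allowed parts of speech, then normalize each word."""
    return [normalize(word) for word, pos in tokens if pos in KEEP_POS]


words = filter_and_normalize(TOKENS)
print(words)
```

Whether a compound such as 自然言語処理 comes out as a single token depends on the dictionary MeCab is using, which is one reason the local and production environments need to match.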

Slide 41

Slide 41 text

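■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model]
Preprocessing steps:
• Tokenization (MeCab)
• Part-of-speech filtering [nouns, verbs, adjectives]
• Normalization
• Stop word computation and removal
• Dictionary creation
• Embedding Matrix creation
• Converting text data into sequences
• Splitting the data into train / test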

Slide 42

Slide 42 text

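■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model]
Preprocessing steps:
• Tokenization (MeCab)
• Part-of-speech filtering [nouns, verbs, adjectives]
• Normalization
• Stop word computation and removal
• Dictionary creation
• Embedding Matrix creation
• Converting text data into sequences
• Splitting the data into train / test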

Slide 43

Slide 43 text

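■ Stop words and dictionary
• Stop words: unnecessary words
  e.g. general words (あれ, する, 思う), low-frequency words, etc.
• Dictionary (tokenizer): words with an index assigned to each
  e.g. "今日はAWS Loftで自然言語処理の登壇をします！"
       → [今日] [自然言語処理] …

    # Create and save the tokenizer with Keras
    from keras.preprocessing.text import Tokenizer

    # train and test are DataFrames with a 'content' column of preprocessed text
    tokenizer = Tokenizer(lower=False)
    all_content = train['content'].values.tolist() + test['content'].values.tolist()
    tokenizer.fit_on_texts(all_content)
    # save_text_tokenizer is a helper that is not shown on the slide
    save_text_tokenizer(tokenizer, TOKENIZER_FILE_NAME)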
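
To make the two definitions concrete, here is a small standard-library sketch of computing stop words and building a word-to-index dictionary. It is not the speaker's implementation: the corpus, `MIN_COUNT`, and the function names are made up for illustration. The numbering mirrors what the Keras `Tokenizer` does (indices start at 1 and follow descending frequency); the alphabetical tie-break is only there to keep the result deterministic.

```python
from collections import Counter

# Tiny hypothetical corpus of already-tokenized posts.
DOCS = [
    ["今日", "aws", "loft", "登壇", "する"],
    ["今日", "離乳食", "作る", "する"],
    ["aws", "勉強", "する", "思う"],
]

GENERAL_WORDS = {"あれ", "する", "思う"}  # general words listed on the slide
MIN_COUNT = 2  # assumed cutoff below which a word counts as low-frequency


def compute_stopwords(docs, general_words, min_count):
    """Stop words = general words plus words seen fewer than min_count times."""
    counts = Counter(word for doc in docs for word in doc)
    rare = {word for word, n in counts.items() if n < min_count}
    return set(general_words) | rare


def build_word_index(docs, stopwords):
    """Assign indices from 1 in descending frequency order (0 is left for padding)."""
    counts = Counter(w for doc in docs for w in doc if w not in stopwords)
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: i for i, word in enumerate(ordered, start=1)}


stopwords = compute_stopwords(DOCS, GENERAL_WORDS, MIN_COUNT)
word_index = build_word_index(DOCS, stopwords)
print(word_index)
```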

Slide 44

Slide 44 text

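■ Embedding Matrix
• A matrix of distributed word representations, used to initialize the embedding layer (Embedding Layer) of a neural network
  e.g. shape: (number of words, number of dimensions used when training the w2v model)
[Illustration: one row vector per word: 今日, aws, loft, 自然言語処理, 登壇, …]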

Slide 45

Slide 45 text

■ Embedding Matrix
• A matrix of distributed word representations, used to initialize the embedding layer (Embedding Layer) of a neural network
e.g.
    import numpy as np

    # Set up embedding_matrix
    vocab_size = len(tokenizer.word_index) + 1  # largest word index + 1 (index 0 is unused)
    embedding_vector_size = model_w2v.wv.vector_size  # dimensionality of the pre-trained model
    embedding_matrix = np.zeros((vocab_size, embedding_vector_size))
    for word, i in tokenizer.word_index.items():
        try:
            # If the w2v model knows the word, take its distributed representation
            embedding_vector = model_w2v.wv[word]
        except KeyError:
            # Unknown word: leave its row as zeros instead of reusing the previous vector
            continue
        # Store the vector in the row given by the word's index
        embedding_matrix[i] = embedding_vector

    # Save
    np.save(EMBEDDING_FILE_NAME, embedding_matrix)

Slide 46

Slide 46 text

■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model]

Slide 47

Slide 47 text

■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model]
Outputs written to S3:
• Training data (train.csv)
• Test data (test.csv)
• Dictionary (tokenizer.pkl)
• Stop words (stopwords.pkl)
• Embedding Matrix (emb.npy)

Slide 48

Slide 48 text

■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model]
Outputs written to S3:
• Training data (train.csv)
• Test data (test.csv)
• Dictionary (tokenizer.pkl)
• Stop words (stopwords.pkl)
• Embedding Matrix (emb.npy)

Managing these by date makes it clear which data each model used
→ guarantees that the model is reproducible
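
One simple way to realize date-based management is to write every artifact of a run under a date-stamped S3 prefix. The sketch below only builds the object keys; the prefix `questions/preprocessed`, the date format, and the name `artifact_keys` are assumptions for illustration, since the talk does not show the actual bucket layout.

```python
from datetime import date

# Artifact file names listed on the slide.
ARTIFACTS = ["train.csv", "test.csv", "tokenizer.pkl", "stopwords.pkl", "emb.npy"]


def artifact_keys(run_date, prefix="questions/preprocessed"):
    """Return one S3 object key per artifact under a date-stamped prefix."""
    day = run_date.strftime("%Y-%m-%d")
    return {name: f"{prefix}/{day}/{name}" for name in ARTIFACTS}


keys = artifact_keys(date(2019, 4, 1))
print(keys["emb.npy"])
```

Because the tokenizer, stop words, and embedding matrix sit next to the exact training data they were derived from, a model can be traced back to, and retrained from, one consistent snapshot.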

Slide 49

Slide 49 text

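■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model, train.csv, emb.npy, test.csv, stopwords.pkl, tokenizer.pkl]
• The whole sequence is automated with Step Functions, so when updating the model the only manual work is training and deployment on SageMaker (described later)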

Slide 50

Slide 50 text

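■ Architecture: ETL and preprocessing
[Diagram labels: Fargate, S3, RDS, Glue, StepFunctions, Preprocessing Task; files: train.tsv, w2v model, train.csv, emb.npy, test.csv, stopwords.pkl, tokenizer.pkl]
• The whole sequence is automated with Step Functions, so when updating the model the only manual work is training and deployment on SageMaker (described later)
→ We can focus on building the model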
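
The talk does not show the state machine itself, so the following is only a guess at its general shape: a two-step Amazon States Language definition, built as a Python dict, that runs a Glue job and then a Fargate task, waiting for each to finish. The state names, job name, cluster, and task definition are placeholders, and the network configuration that a real Fargate task needs is omitted for brevity.

```python
import json

# All names below (states, Glue job, ECS cluster, task definition) are placeholders.
definition = {
    "Comment": "Extract training data with Glue, then preprocess it on Fargate",
    "StartAt": "ExtractTrainingData",
    "States": {
        "ExtractTrainingData": {
            "Type": "Task",
            # Service integration that starts a Glue job and waits for completion
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": "extract-questions"},
            "Next": "PreprocessText",
        },
        "PreprocessText": {
            "Type": "Task",
            # Service integration that runs an ECS task and waits for completion
            "Resource": "arn:aws:states:::ecs:runTask.sync",
            "Parameters": {
                "LaunchType": "FARGATE",
                "Cluster": "preprocessing-cluster",
                "TaskDefinition": "preprocessing-task",
            },
            "End": True,
        },
    },
}

definition_json = json.dumps(definition, indent=2)
print(definition_json)
```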

Slide 51

Slide 51 text

■ Architecture: overview
[Diagram labels: Fargate, S3, Training, RDS, Glue, Question Data, Prediction, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]

Slide 52

Slide 52 text

■ Architecture: overview
[Diagram labels: Fargate, S3, Training, RDS, Glue, Question Data, Prediction, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]
About training and deployment

Slide 53

Slide 53 text

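■ Architecture: training and deployment
[Diagram labels: Training, EndPoint, SageMaker; files: train.csv, emb.npy, test.csv]
• Train the model and deploy the endpoint with SageMaker
• Training uses the prebuilt TensorFlow container in script mode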

Slide 54

Slide 54 text

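■ Architecture: training and deployment
[Diagram labels: Training, EndPoint, SageMaker; files: train.csv, emb.npy, test.csv]
• Train the model and deploy the endpoint with SageMaker
• Training uses the prebuilt TensorFlow container in script mode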

Slide 55

Slide 55 text

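■ Script mode
• You don't have to pay much attention to SageMaker-specific coding conventions
  → Script files written locally or elsewhere can (to some extent) be reused as they are
• Running it is simple: just pass the script file as an argument when the training container is launched
  → When updating the model, you only need to modify the script file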

Slide 56

Slide 56 text

■ Script mode (notebook)
• Just pass the script file name as an argument to the estimator
e.g.
    from sagemaker.tensorflow import TensorFlow

    # role, hyper_param and s3_bucket are assumed to be defined in earlier notebook cells
    estimator = TensorFlow(
        entry_point='clf_keras_lstm.py',  # the training script to run
        role=role,
        framework_version='1.12.0',
        hyperparameters=hyper_param,
        train_instance_count=1,
        train_instance_type='ml.p3.2xlarge',
        script_mode=True,
        output_path='s3://' + s3_bucket + '/questions/model',
        code_location='s3://' + s3_bucket + '/questions/model',
        py_version='py3'
    )

Slide 58

Slide 58 text

■ Script mode (.py script)
• Set up the arguments under if __name__ == '__main__':
e.g.
    import argparse
    import os

    if __name__ == '__main__':
        parser = argparse.ArgumentParser()

        # Receive the hyperparameters
        parser.add_argument('--batch-size', type=int, default=512)
        parser.add_argument('--epochs', type=int, default=5)

        # SageMaker-specific arguments; the environment variables already hold default values
        # Where to save output files other than the model
        parser.add_argument('--output-data-dir', type=str, default=os.environ['SM_OUTPUT_DATA_DIR'])
        # Where to save the trained model
        parser.add_argument('--model-dir', type=str, default=os.environ['SM_MODEL_DIR'])
        # Path to the training data
        parser.add_argument('--train', type=str, default=os.environ['SM_CHANNEL_TRAIN'])
        # Path to the test data
        parser.add_argument('--test', type=str, default=os.environ['SM_CHANNEL_TEST'])
        # Path to the embedding data
        parser.add_argument('--embedding', type=str, default=os.environ['SM_CHANNEL_EMBEDDING'])

        args, _ = parser.parse_known_args()

        # Train
        train(args)
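
Since the script reads its inputs from command-line flags and `SM_*` environment variables, the same argument handling can be exercised outside SageMaker. The sketch below is a trimmed-down variant, not the speaker's script: it uses `os.environ.get` with hypothetical local fallback paths so it also runs on a laptop, and it simulates the container by setting `SM_CHANNEL_TRAIN` by hand.

```python
import argparse
import os


def parse_args(argv=None):
    """Argument layout modeled on the slide, with local fallbacks for SM_* variables."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch-size", type=int, default=512)
    parser.add_argument("--epochs", type=int, default=5)
    # The fallback paths are hypothetical local directories.
    parser.add_argument("--model-dir", type=str,
                        default=os.environ.get("SM_MODEL_DIR", "./model"))
    parser.add_argument("--train", type=str,
                        default=os.environ.get("SM_CHANNEL_TRAIN", "./data/train"))
    # Ignore any extra flags instead of failing on them
    args, _ = parser.parse_known_args(argv)
    return args


# Simulate what the training container provides: an environment variable for
# the "train" channel and hyperparameters passed as command-line flags.
os.environ["SM_CHANNEL_TRAIN"] = "/opt/ml/input/data/train"
args = parse_args(["--epochs", "10"])
print(args.batch_size, args.epochs, args.train)
```

Each input channel passed to the training job shows up in the container as an `SM_CHANNEL_<NAME>` variable, which is why the slide's script can read `SM_CHANNEL_TRAIN`, `SM_CHANNEL_TEST`, and `SM_CHANNEL_EMBEDDING`.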

Slide 60

Slide 60 text

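■ Script mode (.py script)
• Put the training script you wrote inside train()
e.g.
    from sklearn.model_selection import train_test_split
    from tensorflow.keras.callbacks import EarlyStopping  # or keras.callbacks, depending on the setup

    def train(args):
        # Loading X_train, y_train and embedding_matrix (from args.train and
        # args.embedding) and the definition of build_model() are not shown on the slide.
        X_train, X_valid, y_train, y_valid = train_test_split(
            X_train, y_train, test_size=0.3, random_state=0, stratify=y_train)

        model = build_model(emb_matrix=embedding_matrix, input_length=X_train.shape[1])
        model.compile(loss="binary_crossentropy", optimizer='adam', metrics=['accuracy'])

        # Stop when the validation loss has not improved for 3 epochs
        es_cb = EarlyStopping(monitor='val_loss', patience=3, verbose=2, mode='auto')
        model.fit(
            x=X_train,
            y=y_train,
            validation_data=(X_valid, y_valid),
            batch_size=args.batch_size,
            epochs=args.epochs,
            verbose=2,
            callbacks=[es_cb]
        )

        # Save the model
        save(model, args.model_dir)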

Slide 62

Slide 62 text

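■ Script mode (.py script)
• save() stores the model you built in S3
e.g.
    import os
    import tensorflow as tf
    from tensorflow.keras import backend as K  # assumption: the tf.keras backend (TF 1.x)

    def save(model, model_dir):
        # Export the Keras model from the current session in SavedModel format
        sess = K.get_session()
        tf.saved_model.simple_save(
            sess,
            os.path.join(model_dir, 'model/1'),
            inputs={'inputs': model.input},
            outputs={t.name: t for t in model.outputs})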

Slide 63

Slide 63 text

■ Script mode (.py script)
• save() stores the model you built in S3
[Code: the same save() function as on the previous slide]
The only things you need to be aware of are "setting up the arguments" and "saving the model"

Slide 64

Slide 64 text

■ Script mode (.py script)
• save() stores the model you built in S3
[Code: the same save() function as on the previous slide]
Easy!

Slide 65

Slide 65 text

■ Architecture: overview
[Diagram labels: Fargate, S3, Training, RDS, Glue, Question Data, Prediction, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]

Slide 66

Slide 66 text

■ Architecture: overview
[Diagram labels: Fargate, S3, Training, RDS, Glue, Question Data, Prediction, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]
About real-time inference

Slide 67

Slide 67 text

■ Architecture: inference
[Diagram labels: S3, Fargate, Flask API; files: stopwords.pkl, tokenizer.pkl]

Slide 68

Slide 68 text

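■ Architecture: inference
[Diagram labels: S3, Fargate, Flask API; files: stopwords.pkl, tokenizer.pkl]
• Receives raw data from the app, vectorizes it, and passes it to the SageMaker inference endpoint

Example: "今日はAWS Loftで自然言語処理の登壇をします！" → 0, 13, 542, 9, 4723, 65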

Slide 69

Slide 69 text

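■ Architecture: inference
[Diagram labels: S3, Fargate, Flask API; files: stopwords.pkl, tokenizer.pkl]
• Receives raw data from the app, vectorizes it, and passes it to the SageMaker inference endpoint
• The inference endpoint returns the "probability that the post is a violation"

Example: "今日はAWS Loftで自然言語処理の登壇をします！" → 0, 13, 542, 9, 4723, 65 → 0.189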

Slide 70

Slide 70 text

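■ Architecture: inference
[Diagram labels: S3, Fargate, Flask API; files: stopwords.pkl, tokenizer.pkl]
• Receives raw data from the app, vectorizes it, and passes it to the SageMaker inference endpoint
• The inference endpoint returns the "probability that the post is a violation"
• If the probability is below the specified threshold the post is returned to the app as a normal post, and if it is above, as a violating post

Example: "今日はAWS Loftで自然言語処理の登壇をします！" → 0, 13, 542, 9, 4723, 65 → 0.189 → 0 or 1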
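
The request path through the Flask API can be sketched end to end with plain Python. Everything concrete here is an assumption made for illustration: the word-to-index mapping, the input length of 6, the threshold of 0.5, the convention that 1 means "violating", and `fake_endpoint`, which stands in for the real call to the SageMaker endpoint.

```python
# Hypothetical dictionary; in the talk it is loaded from tokenizer.pkl.
WORD_INDEX = {"今日": 13, "aws": 542, "loft": 9, "自然言語処理": 4723, "登壇": 65}
STOPWORDS = {"する"}  # stands in for stopwords.pkl
MAX_LEN = 6           # assumed fixed input length of the model
THRESHOLD = 0.5       # assumed; the talk only says "a specified threshold"


def to_sequence(words, word_index, stopwords, max_len):
    """Map words to indices, drop stop words and unknown words, left-pad with 0."""
    ids = [word_index[w] for w in words if w not in stopwords and w in word_index]
    ids = ids[-max_len:]
    return [0] * (max_len - len(ids)) + ids


def classify(sequence, predict, threshold):
    """Return 1 (violating) if the predicted probability exceeds the threshold, else 0."""
    return 1 if predict(sequence) > threshold else 0


def fake_endpoint(sequence):
    """Stand-in for the endpoint call; always returns the probability shown on the slide."""
    return 0.189


seq = to_sequence(["今日", "aws", "loft", "自然言語処理", "登壇", "する"],
                  WORD_INDEX, STOPWORDS, MAX_LEN)
label = classify(seq, fake_endpoint, THRESHOLD)
print(seq, label)
```

The API has to apply exactly the same dictionary and stop words as training did, which is why tokenizer.pkl and stopwords.pkl are loaded from S3 rather than rebuilt at inference time.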

Slide 71

Slide 71 text

■ Architecture
[Diagram labels: Fargate, S3, Training, RDS, Glue, Question Data, Prediction, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]

Slide 72

Slide 72 text

■ Architecture
[Diagram labels: Fargate, S3, Training, RDS, Glue, EndPoint, SageMaker, StepFunctions, Preprocessing Task, Fargate, Flask API]
[Files: train.tsv, w2v model, train.csv, emb.npy, test.csv, stopwords.pkl, tokenizer.pkl]
[Example flow: "今日はAWS Loftで自然言語処理の登壇をします！" → 0, 13, 542, 9, 4723, 65 … → 0.189 → 0 or 1]

Slide 73

Slide 73 text

Operating the model

Slide 74

Slide 74 text

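■ Operating the model
As we keep updating the model on a regular basis, even if it reaches decent accuracy on the training data, what matters in the end is its accuracy in production.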

Slide 75

Slide 75 text

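■ Operating the model
As we keep updating the model on a regular basis, even if it reaches decent accuracy on the training data, what matters in the end is its accuracy in production.
→ We need to monitor how the model is behaving.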

Slide 76

Slide 76 text

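■ Operating the model
As we keep updating the model on a regular basis, even if it reaches decent accuracy on the training data, what matters in the end is its accuracy in production.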

Slide 77

Slide 77 text

■ Operating the model
Monitor daily, using the inference result logs from the Flask API and the data stored in RDS.
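
The talk does not say which metrics are tracked or which RDS columns are used, so the following is only one plausible shape for such a daily job: it joins logged predictions with final labels and computes precision and recall for the violating class per day. The row formats, the sample data, and the name `daily_metrics` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rows from the Flask API log: (date, post_id, predicted_label)
PREDICTIONS = [
    ("2019-04-01", 1, 1), ("2019-04-01", 2, 0), ("2019-04-01", 3, 1),
    ("2019-04-02", 4, 0), ("2019-04-02", 5, 1),
]
# Hypothetical final labels stored in RDS: post_id -> label (1 = violating)
ACTUAL = {1: 1, 2: 1, 3: 0, 4: 0, 5: 1}


def daily_metrics(predictions, actual):
    """Compute precision and recall for the violating class, per day."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for day, post_id, predicted in predictions:
        c = counts[day]  # make sure every day appears, even with no positives
        truth = actual[post_id]
        if predicted == 1 and truth == 1:
            c["tp"] += 1
        elif predicted == 1 and truth == 0:
            c["fp"] += 1
        elif predicted == 0 and truth == 1:
            c["fn"] += 1
    metrics = {}
    for day, c in counts.items():
        flagged = c["tp"] + c["fp"]
        positives = c["tp"] + c["fn"]
        metrics[day] = {
            "precision": c["tp"] / flagged if flagged else None,
            "recall": c["tp"] / positives if positives else None,
        }
    return metrics


metrics = daily_metrics(PREDICTIONS, ACTUAL)
print(metrics)
```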

Slide 78

Slide 78 text

■ Operating the model

Slide 79

Slide 79 text

■ Operating the model

Slide 80

Slide 80 text

Summary

Slide 81

Slide 81 text

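■ Summary
Japanese natural language processing involves a lot of things to do, but by making effective use of the various AWS services you can build a smooth ML workflow.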

Slide 82

Slide 82 text

Thank you for your attention!