Slide 1

Slide 1 text

How to use AWS Lambda in Document Processing Pipeline @suzu_v VOYAGE GROUP 2016/04/22 at AWS Tokyo Office

Slide 2

Slide 2 text

About me
• Suzuken, https://github.com/suzuken, @suzu_v
• I'm a Gopher, but today I'll be talking about Java
• Software engineer working on ad delivery and analytics infrastructure at http://fluct.jp

Slide 3

Slide 3 text

Agenda
• How Lambda is used in the document analysis platform of an ad delivery system
• This is a Kinesis Stream + Lambda case study, not API Gateway + Lambda
Feel free to ask questions during the talk!

Slide 4

Slide 4 text

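Use case, requirements, and background
• For ad delivery, we want to analyze and classify the text on each page and use the result when serving ads
• Pages are fetched by crawling, and we want to classify them as soon as possible after the crawl
• About 1 million pages per day to fetch and analyze
• About 3 months from the start of development to release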
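As a rough sanity check on the stated volume, about 1 million pages per day averages out to roughly 11.6 pages per second if the load were perfectly even. The sketch below only does that arithmetic; the class name is hypothetical, and real crawl traffic is bursty, so peak rates will be higher than this average.

```java
// Rough sizing from the stated requirement of about 1,000,000 pages per day.
// CrawlRate is a hypothetical name used only for this illustration.
final class CrawlRate {
    static final long PAGES_PER_DAY = 1_000_000L;        // figure from the slide
    static final long SECONDS_PER_DAY = 24L * 60 * 60;   // 86,400 seconds

    // Average pages per second if the load were spread evenly over the day.
    static double averagePagesPerSecond() {
        return (double) PAGES_PER_DAY / SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        System.out.println("average pages/s: " + averagePagesPerSecond());
    }
}
```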

Slide 5

Slide 5 text

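Approach
• Use managed services wherever possible, because we don't want to spend effort on operations
• Build the analysis, classification, and document search parts so that different methods can be tried later
• Each component runs independently, so that one component going down does not affect the whole system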

Slide 6

Slide 6 text

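Components
• Web crawler (EC2 / Go): a daemon that fetches content for a given URL
• Main-text extractor (Lambda / Java8): extracts the part of the page estimated to be the main text
  • Chosen because we wanted to use Lucene / Kuromoji
• Classifier (EC2 / Go): categorizes documents based on the main text and other information obtained from the page
• Document store (EC2 / Elasticsearch): stores crawled and classified content and makes it searchable
• API (EC2 + ELB / Go): an internal HTTP API that returns classification results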

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

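Architecture: we rely heavily on Kinesis Stream
• The crawler fetches content at up to ~100MB/s at peak
• The content is inserted directly into a Kinesis Stream with PutRecords
• The crawler is written in Go (with aws-sdk-go) and also handles write retries and buffering
• Idempotency is ensured by Elasticsearch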
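The buffering mentioned above can be sketched as splitting pending payloads into batches before each PutRecords call. This is a minimal illustration in Java, not the crawler's actual Go code: RecordBatcher and its parameters are hypothetical. PutRecords caps a request at 500 records; the byte budget used in main is an illustrative value that should be checked against the current service limits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the talk: groups payloads into batches that
// respect a maximum record count and a maximum total size per request.
final class RecordBatcher {
    private final int maxRecords;
    private final long maxBytes;

    RecordBatcher(int maxRecords, long maxBytes) {
        if (maxRecords <= 0 || maxBytes <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.maxRecords = maxRecords;
        this.maxBytes = maxBytes;
    }

    // Splits payloads into consecutive batches, preserving order.
    // A payload larger than maxBytes ends up in a batch of its own,
    // so the caller can decide how to handle it.
    List<List<byte[]>> split(List<byte[]> payloads) {
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long currentBytes = 0;
        for (byte[] p : payloads) {
            boolean full = current.size() >= maxRecords
                    || (!current.isEmpty() && currentBytes + p.length > maxBytes);
            if (full) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(p);
            currentBytes += p.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<byte[]> payloads = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            payloads.add(new byte[10]);
        }
        // 500 records per request; the 5,000,000-byte budget is an assumed value.
        RecordBatcher batcher = new RecordBatcher(500, 5_000_000L);
        for (List<byte[]> batch : batcher.split(payloads)) {
            System.out.println("batch of " + batch.size() + " records");
        }
    }
}
```

Each resulting batch would then become one PutRecords request.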

Slide 9

Slide 9 text

Why Kinesis Stream
• PutRecords / GetRecords are stable
• Very useful as a data buffer for analyzing crawl results in real time
• Combined with Lambda, stream processing applications are also easy to write

Slide 10

Slide 10 text

Why Lambda
• Easy integration with Kinesis Stream
• Easy to experiment: just click to create a new Lambda Function
• The data stays in the Kinesis Stream shards, so it is also easy to test with the same data
• Testing in Production (Data)

Slide 11

Slide 11 text

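What's good about Lambda
• Writing a Kinesis Application yourself means managing shards, which is tedious
  • The wrapper on the Lambda side takes care of that nicely
• Deployment is easy
  • Build, put the executable binary on S3, and Lambda can use it
• No need to think about daemon management and the like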

Slide 12

Slide 12 text

Implementation example in Java

Slide 13

Slide 13 text

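Implementation example in Java
Create a POJO that mirrors the format of the Kinesis records

    import java.io.Serializable;

    // Plain object matching the JSON payload of each Kinesis record.
    public class KinesisMessageModel implements Serializable {
        public String id;
        public String url;
        public String body;
        public String title;
        public String description;
        // ...
    }

see: Example: Using POJOs for Handler Input/Output (Java) - AWS Lambda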

Slide 14

Slide 14 text

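Process the data and pass it to the next Kinesis Stream

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.List;
    import com.amazonaws.services.kinesis.model.*;
    import com.amazonaws.services.lambda.runtime.events.KinesisEvent;
    import com.amazonaws.services.lambda.runtime.events.KinesisEvent.KinesisEventRecord;
    import com.fasterxml.jackson.databind.ObjectMapper;

    public class Boiler {
        // The kinesis client, kinesisOutputStreamName, getPutRecordsRequest,
        // toClass and getSomeKey are defined elsewhere (not shown on the slide).

        // Handler that receives data from the Kinesis Stream
        public void recordHandler(KinesisEvent event) throws IOException {
            PutRecordsRequest putRecordsRequest = getPutRecordsRequest(this.kinesisOutputStreamName);
            List<PutRecordsRequestEntry> putRecordsRequestEntryList = new ArrayList<>();
            ObjectMapper mapper = new ObjectMapper();
            // One event contains multiple records; the count is set by the batch size.
            for (KinesisEventRecord rec : event.getRecords()) {
                KinesisMessageModel record = toClass(rec);
                PutRecordsRequestEntry putRecordsRequestEntry = new PutRecordsRequestEntry();
                // Process the record (in practice, main-text extraction happens here)
                byte[] json = mapper.writeValueAsString(record).getBytes(StandardCharsets.UTF_8);
                putRecordsRequestEntry.setData(ByteBuffer.wrap(json));
                putRecordsRequestEntry.setPartitionKey(record.getSomeKey());
                putRecordsRequestEntryList.add(putRecordsRequestEntry);
            }
            // Assemble the PutRecords request for the next Kinesis Stream and send it
            putRecordsRequest.setRecords(putRecordsRequestEntryList);
            PutRecordsResult putRecordsResult = this.kinesis.putRecords(putRecordsRequest);
        }
    }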
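The handler above does not inspect putRecordsResult. PutRecords can succeed for some entries and fail for others in the same call, and in the AWS SDK for Java the result exposes getFailedRecordCount() plus a per-entry list, in request order, where failed entries carry an error code. The following is a minimal sketch of resending only the failed entries. It is written against a plain function instead of the SDK so that it stays self-contained; PartialRetry, sendWithRetry, and the backoff values are hypothetical and not part of the talk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not from the talk: resends only the entries that failed.
final class PartialRetry {
    // `send` delivers a batch and returns one success flag per entry, in the
    // same order as the batch. Failed entries are resent up to maxAttempts
    // times in total, with exponential backoff between attempts.
    // Returns the entries that were still undelivered when it gave up.
    static <T> List<T> sendWithRetry(List<T> entries,
                                     Function<List<T>, List<Boolean>> send,
                                     int maxAttempts,
                                     long baseDelayMillis) {
        List<T> pending = new ArrayList<>(entries);
        for (int attempt = 1; attempt <= maxAttempts && !pending.isEmpty(); attempt++) {
            List<Boolean> ok = send.apply(pending);
            List<T> failed = new ArrayList<>();
            for (int i = 0; i < pending.size(); i++) {
                if (!ok.get(i)) {
                    failed.add(pending.get(i));
                }
            }
            pending = failed;
            if (!pending.isEmpty() && attempt < maxAttempts) {
                try {
                    Thread.sleep(baseDelayMillis << (attempt - 1));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return pending; // stop retrying if the thread is interrupted
                }
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        // Fake sender for the demo: rejects every entry that starts with "bad".
        List<String> left = sendWithRetry(
                Arrays.asList("a", "bad-1", "c"),
                batch -> {
                    List<Boolean> ok = new ArrayList<>();
                    for (String s : batch) {
                        ok.add(!s.startsWith("bad"));
                    }
                    return ok;
                },
                3, 10L);
        System.out.println("undelivered: " + left);
    }
}
```

With the real SDK, `send` would wrap the putRecords call and map each result entry to true when its error code is null.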

Slide 15

Slide 15 text

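Impressions of implementing in Java
• For writing something quickly, node.js is easier
• For Java there is no blueprint, and you can't quickly try things out from the Lambda Console
• Packaging is done with Maven; we build an uber jar with the Maven Shade Plugin
  • uber jar: a jar that bundles all dependency libraries and so on into a single jar
  • The dictionary for morphological analysis is also bundled in the jar
see:
Lambda Function Handler (Java) - AWS Lambda
Apache Maven Shade Plugin – Introduction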

Slide 16

Slide 16 text

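Things to watch out for when implementing
• Error handling
  • If even one malformed record arrives, Lambda keeps retrying until the record expires on the Kinesis Stream side
  • If the function fails, processing stalls, so implement it to skip such records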
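The skip behavior described above can be sketched as a per-record try/catch, so that one malformed record is logged and set aside while the handler still returns normally. This is an illustration under assumptions, not the author's code: SkipBadRecords and processAll are hypothetical names, and a generic transform function stands in for the real record parsing and main-text extraction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not from the talk: processes a batch record by record
// and skips the ones that cannot be handled.
final class SkipBadRecords {
    // Outcome of one batch: the transformed outputs plus the skipped inputs.
    static final class Result<I, O> {
        final List<O> outputs = new ArrayList<>();
        final List<I> skipped = new ArrayList<>();
    }

    // Applies `transform` to each record. A record that throws is logged and
    // skipped, so a single malformed record does not fail the whole batch.
    static <I, O> Result<I, O> processAll(List<I> records, Function<I, O> transform) {
        Result<I, O> result = new Result<>();
        for (I rec : records) {
            try {
                result.outputs.add(transform.apply(rec));
            } catch (RuntimeException e) {
                System.err.println("skipping record: " + e);
                result.skipped.add(rec);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "x" is not a number, so parsing it throws and the record is skipped.
        Result<String, Integer> r = processAll(Arrays.asList("1", "x", "3"), Integer::parseInt);
        System.out.println(r.outputs + " skipped=" + r.skipped);
    }
}
```

Keeping the skipped inputs (or at least logging them) makes it possible to investigate them later instead of losing them silently.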

Slide 17

Slide 17 text

Creating the Lambda function: aws-cli

    aws lambda create-function \
        --region ap-northeast-1 \
        --function-name my-lambda-function \
        --code S3Bucket=mybucket,S3Key=path/to/my.jar \
        --role arn:aws:iam::999999999999:role/lambda_kinesis_rw \
        --runtime java8 \
        --handler com.your.app.Handler::recordHandler \
        --description "my kinesis stream!" \
        --timeout 15 \
        --memory-size 512

    aws lambda create-event-source-mapping \
        --event-source-arn arn:aws:kinesis:ap-northeast-1:999999999999:stream/your-stream \
        --function-name my-lambda-function \
        --enabled \
        --batch-size 100 \
        --starting-position TRIM_HORIZON

Slide 18

Slide 18 text

Deployment: aws-cli
• Pull Request -> merge -> build (on Travis CI) -> S3
• The uber jar is built on Travis CI
• Then apply it with update-function-code

    aws lambda update-function-code \
        --function-name my-lambda-function \
        --s3-bucket mybucket \
        --s3-key path/to/my.jar

Slide 19

Slide 19 text

Logging in Lambda
• We use Log4j
• Error logs and the like can be viewed in CloudWatch Logs
• For bugs that can't be reproduced locally, check CloudWatch Logs
see:
Accessing Amazon CloudWatch Logs for AWS Lambda - AWS Lambda
Logging (Java) - AWS Lambda

Slide 20

Slide 20 text

Summary
• With Lambda + Kinesis Stream, we are now able to classify documents in real time
• Lambda is quick to get started with, and we recommend it