Is Lambda Architecture is a new normal for cloud native apps? by Anton Kranga

Riga Dev Day
March 13, 2016

Transcript

  1. :~ whoami: Antons Kranga. Full stack developer (~15 years), Cloud Architect, DevOps evangelist, Innovation Labs of Accenture Cloud Platform, speaker, marathon runner.
  2. What is Streaming? We often want to deploy data models based on new data that continuously arrives from multiple sources.
  3. Challenges
    • Users expect data to appear immediately after it arrives
    • Fault tolerance
    • Distributed data consistency
    • Scalability (how not to lose data when scaling down)
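  4. What is “λ”? (Diagram) New data flows into both the Batch Layer and the Speed Layer. The Batch Layer keeps the master data and computes views with map-red; the Serving Layer exposes those views; the Speed Layer maintains realtime views. Queries read the Serving Layer views and the realtime views.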
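
The query path on this slide (a batch view combined with a realtime view) can be illustrated with a small in-memory sketch. Everything here is hypothetical and not taken from the talk: the class name `LambdaViewsSketch`, the per-key counter as the "view", and the simplifying assumption that each published batch view covers every event the speed layer has seen so far.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory illustration of a Lambda Architecture query:
// the answer is the precomputed batch view plus the realtime increment.
final class LambdaViewsSketch {
    // Batch view: counts computed by the batch layer up to the last batch run.
    private final Map<String, Long> batchView = new HashMap<>();
    // Realtime view: counts seen by the speed layer since that batch run.
    private final Map<String, Long> realtimeView = new HashMap<>();

    // Speed layer: apply each new event immediately.
    void onNewEvent(String key) {
        realtimeView.merge(key, 1L, Long::sum);
    }

    // Batch layer: publish a recomputed view and drop the realtime increments,
    // assuming (for this sketch) that the new batch view already covers them.
    void publishBatchView(Map<String, Long> recomputed) {
        batchView.clear();
        batchView.putAll(recomputed);
        realtimeView.clear();
    }

    // Serving side: merge both views at query time.
    long query(String key) {
        return batchView.getOrDefault(key, 0L) + realtimeView.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        LambdaViewsSketch views = new LambdaViewsSketch();
        Map<String, Long> batch = new HashMap<>();
        batch.put("home", 100L);
        views.publishBatchView(batch);
        views.onNewEvent("home");
        views.onNewEvent("home");
        System.out.println(views.query("home")); // prints 102
    }
}
```

In a real deployment the two views would live in separate stores and the hand-over between batch runs needs more care than a simple `clear()`.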
  5. AWS Blueprint for Lambda Architectures: https://d0.awsstatic.com/whitepapers/lambda-architecure-on-for-batch-aws.pdf (published July 2015). Diagram labels: Amazon Kinesis, Amazon Kinesis-enabled app, S3 buckets, Amazon EMR; speed layer, batch layer, EMR on serving and merging layer.
  6. Kinesis (diagram): producers; Kinesis in an AWS region across az1, az2, az3; consumers: Lambda, S3 storage, Redshift, EC2 instance, EMR.
  7. Kinesis (diagram as on slide 6). Producer:
      AmazonKinesis kinesis = ...
      ...
      PutRecordRequest putRecord = new PutRecordRequest();
      putRecord.setStreamName(streamName);
      putRecord.setData(ByteBuffer.wrap(bytes));
      putRecord.setSequenceNumberForOrdering(null);
      ...
      kinesis.putRecord(putRecord);
  8. Kinesis (diagram as on slide 6; Producer code as on slide 7). Consumer:
      AmazonKinesisClient kinesisClient = ...
      GetShardIteratorRequest req = ...
      req.setStreamName("my-kinesis");
      req.setShardIteratorType("TRIM_HORIZON");
      ...
      // getRecords expects a GetRecordsRequest carrying a shard iterator,
      // so resolve the iterator first
      String shardIterator = kinesisClient.getShardIterator(req).getShardIterator();
      GetRecordsRequest getRecords = new GetRecordsRequest();
      getRecords.setShardIterator(shardIterator);
      GetRecordsResult result = kinesisClient.getRecords(getRecords);
      List<Record> records = result.getRecords();
      for (Record record : records) {
          ... = record.getData();
      }
  9. Kinesis streams
    What: Enables building near-real-time data processing applications
    Use cases:
    • Real-time analytics
    • Log file processing
    • Reporting
    Durability: data streams are replicated across 3 AZs
  10. Kinesis streams
    Cost Model:
    Shard Hour:
    • 5 read transactions per second
    • 2 MB data read per second
    • 100 write transactions per second
    • 1 MB data write per second
    approx. 12.5 USD/Mo
    Extended data retention:
    • Up to 7 days
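
The per-shard limits above suggest a simple sizing rule: a stream needs enough shards to satisfy whichever limit is tightest. The sketch below is a hypothetical helper (the class `ShardSizingSketch` and its method names are not from the talk). It takes the limits and the approximate monthly shard price exactly as quoted on the slide, so they should be checked against current AWS documentation before being relied on, and it ignores the read-transaction limit and extended-retention charges.

```java
// Hypothetical sizing helper; all constants are the figures quoted on the
// slide, not values verified against current AWS limits or prices.
final class ShardSizingSketch {
    static final double WRITE_MB_PER_SEC_PER_SHARD = 1.0;
    static final double WRITE_TX_PER_SEC_PER_SHARD = 100.0;
    static final double READ_MB_PER_SEC_PER_SHARD = 2.0;
    static final double USD_PER_SHARD_MONTH = 12.5;

    // The shard count is driven by the tightest of the three limits.
    static int shardsNeeded(double writeMbPerSec, double writeTxPerSec, double readMbPerSec) {
        double byWriteVolume = writeMbPerSec / WRITE_MB_PER_SEC_PER_SHARD;
        double byWriteCount = writeTxPerSec / WRITE_TX_PER_SEC_PER_SHARD;
        double byReadVolume = readMbPerSec / READ_MB_PER_SEC_PER_SHARD;
        double tightest = Math.max(byWriteVolume, Math.max(byWriteCount, byReadVolume));
        // A stream always has at least one shard.
        return (int) Math.max(1, Math.ceil(tightest));
    }

    static double monthlyShardCostUsd(int shards) {
        return shards * USD_PER_SHARD_MONTH;
    }

    public static void main(String[] args) {
        // Example workload (made up): 3.5 MB/s written in 250 writes/s, 6 MB/s read.
        int shards = shardsNeeded(3.5, 250, 6.0);
        System.out.println(shards + " shards, approx. "
                + monthlyShardCostUsd(shards) + " USD/Mo");
        // prints "4 shards, approx. 50.0 USD/Mo"
    }
}
```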
  11. Kinesis streams
    Not good when:
    • Small-scale throughput, less than 200 KB/sec
    • Long-term data storage (more than 24 h)
  12. Lambda
    What: Lambda lets you write functions without managing an actual server
    Use cases:
    • Real-time stream processing
    • Tiny ETL
    • In a few cases can replace EC2
    • Processing IaaS events
    Runtimes: Java8, NodeJS, Python
    Backed by: provides /tmp for ephemeral storage
    Durability: no maintenance windows, 3 retries before failure
  13. Lambda
    Cost Model (requests per function):
    • Compute is billed in GB-seconds
    • Billing step: 100 ms
    • 0.20 USD per million requests; $0.00001667 per GB-second
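
To make this cost model concrete, the following sketch estimates a monthly bill from the figures on the slide. It is a rough illustration only: the class `LambdaCostSketch`, its method names, and the example workload are hypothetical, and any free tier is deliberately ignored.

```java
import java.util.Locale;

// Hypothetical estimator built from the prices quoted on the slide.
final class LambdaCostSketch {
    static final double USD_PER_MILLION_REQUESTS = 0.20;
    static final double USD_PER_GB_SECOND = 0.00001667;
    static final long BILLING_STEP_MS = 100;

    // Duration is rounded up to the next 100 ms step.
    static long billedMillis(long durationMs) {
        return ((durationMs + BILLING_STEP_MS - 1) / BILLING_STEP_MS) * BILLING_STEP_MS;
    }

    // Request charge plus compute charge (GB-seconds), without any free tier.
    static double monthlyCostUsd(long requests, long durationMs, int memoryMb) {
        double gbSeconds = requests * (billedMillis(durationMs) / 1000.0) * (memoryMb / 1024.0);
        double requestCost = requests / 1_000_000.0 * USD_PER_MILLION_REQUESTS;
        return requestCost + gbSeconds * USD_PER_GB_SECOND;
    }

    public static void main(String[] args) {
        // Example workload (made up): 3 million requests of 120 ms at 512 MB.
        // 120 ms is billed as 200 ms, giving 300,000 GB-seconds.
        double cost = monthlyCostUsd(3_000_000L, 120, 512);
        System.out.println(String.format(Locale.ROOT, "%.2f USD", cost)); // prints "5.60 USD"
    }
}
```

Note how the 100 ms step matters: a function that runs for 120 ms is billed the same as one that runs for 200 ms.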
  14. Lambda
    Not good when:
    • Execution needs more than the 300 sec timeout (the limit cannot be changed)
    • Forces the developer to think stateless
    • Highly dynamic web sites
    • Competes with t2.nano ($4.75/month)
  15. Lambda (diagram labels): S3 storage, SNS, Kinesis, … on one side of Lambda; S3 storage, SNS, Kinesis, … and consumers on the other; function package myApp.ZIP (Java8, Python, NodeJS).
  16. EMR
    What: Managed Apache Hadoop service
    Use cases:
    • MapRed data processing
    • Large data ETL jobs
    • Data movement
    • Log processing and analytics
    Backed by: a single EC2 instance or a cluster of EC2 instances
    Durability: provided at the storage level by S3
    See more: https://media.amazonwebservices.com/AWS_Amazon_EMR_Best_Practices.pdf
  17. EMR
    Cost Model:
    • Charges follow the EC2 instance size model
    • S3 storage charges apply ($0.03 per GB/Mo)
  18. EMR
    Not good when:
    • Small to medium data sets
    • ACID (atomicity, consistency, isolation, durability) is required
    • Competes with RDS: DynamoDB, Aurora DB
  19. S3
    What: Fully managed persistent storage
    • Static content web sites
    • File storage (primarily for reading)
    • Archive storage
    Backed by: covered by the AWS S3 SLA
    Durability: storage: 99.999999999%; availability: 99.99%
  20. S3
    Cost Model: GB/Mo
    • Standard Storage: $0.03 GB/Mo
    • Infrequent Access Storage: $0.0125 GB/Mo
    • Glacier Storage: $0.007 GB/Mo
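
As a quick worked example of the per-tier prices above, the sketch below compares the monthly storage charge for the same volume in each class. The class `S3CostSketch` and the 1000 GB volume are hypothetical; the prices are the ones quoted on the slide, and request, retrieval, and transfer charges are left out.

```java
import java.util.Locale;

// Hypothetical comparison of storage-only charges, using the slide's prices.
final class S3CostSketch {
    enum Tier {
        STANDARD(0.03),
        INFREQUENT_ACCESS(0.0125),
        GLACIER(0.007);

        final double usdPerGbMonth;

        Tier(double usdPerGbMonth) {
            this.usdPerGbMonth = usdPerGbMonth;
        }
    }

    // Storage charge only; requests, retrieval, and transfer are not modelled.
    static double monthlyStorageUsd(Tier tier, double gigabytes) {
        return tier.usdPerGbMonth * gigabytes;
    }

    public static void main(String[] args) {
        double gigabytes = 1000.0; // made-up volume for illustration
        for (Tier tier : Tier.values()) {
            System.out.println(String.format(Locale.ROOT, "%s: %.2f USD/Mo",
                    tier, monthlyStorageUsd(tier, gigabytes)));
        }
        // prints:
        // STANDARD: 30.00 USD/Mo
        // INFREQUENT_ACCESS: 12.50 USD/Mo
        // GLACIER: 7.00 USD/Mo
    }
}
```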
  21. S3
    Not good when:
    • S3 writes can be slow
    • Glacier can restore up to 5% of storage per month
  22. Redshift
    What: Petabyte-scale data warehouse as a managed service
    • Data warehouse (OLAP)
    • BI and ETL
    • Storing large historical data
    Backed by: AWS provides automatic data backup
    Durability: provided at the storage level by S3
    Scaling: start with a 160 GB node and scale from there
  23. Redshift
    Cost Model:
    • Charges follow the EC2 instance size model
    • S3 storage charges apply ($0.03 per GB/Mo)
  24. (Diagram labels) producer; Kinesis with three shards; batch layer; speed layer; ec2; S3 Bucket; Map Red; Process Stream; serving layer; View; DynamoDB; Primer Lambda (every hour).
  25. (Diagram labels) producer; Kinesis with three shards; batch layer; speed layer; ec2; S3 Bucket; Map Red; Process Stream; serving layer; View; DynamoDB; Primer Lambda (every hour); streams execution Lambda (every hour).
  26. (Diagram labels) producer; Kinesis with three shards; batch layer; speed layer; ec2; S3 Bucket; Map Red; Process Stream; serving layer; View; DynamoDB; Primer Lambda (every hour); Lambda (every hour); Presentation Layer: JS app, Lambda.