Analysing_Streaming_Data_AWS_Summit_Madrid_2019_with_Infinia.pdf

Real-time data analysis.

Analysing Streaming Data. AWS Kinesis Data Analytics (with SQL) and Apache Flink (Java). Amazon Managed Streaming for Kafka.

Frank Munz

May 07, 2019

Transcript

  1. AWS Summit: Analysing Streaming Data

    Ruben Hernando, Technical Director, Infinia
    Dr Frank Munz, Senior Technical Evangelist, Amazon Web Services

    © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  2. Introductory - 200

    "These sessions provide an overview of AWS services and features, and they assume that attendees are new to the topic. These sessions highlight basic use cases, features, functions, and benefits."
  3. Agenda

    - Streaming Architectures
    - Amazon Kinesis
    - Serverless Stream Processing
    - Amazon Managed Streaming for Kafka (MSK)
    - Ruben Hernando from Infinia
  4. Streaming Data

    - Web Clickstream
    - Application Logs
    - IoT Sensors

    [Wed Oct 11 14:32:52 2018] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test

    Continuously generated, small size events, low latency requirements
  5. Transform and Process Continuously Streaming

    - Ingest video & data as it's generated
    - Process data on the fly
    - Real-time analytics/ML, alerts, actions
  7. Amazon Kinesis: Real-time data streaming and analytics

    Easily collect, process, and analyze streams in real time

    - Kinesis Video Streams: Capture, process, and store video streams for analytics
    - Kinesis Data Streams: Build custom applications that analyze data streams
    - Kinesis Data Firehose: Load data streams into AWS data stores
    - Kinesis Data Analytics: Analyze data streams with SQL or Java

    NEW!
  8. Amazon Kinesis Data Streams Overview
  9. Data Ingestion from a Variety of Sources

    Transactions, ERP, Web logs/cookies, Connected devices: all feed Kinesis Data Streams

    AWS SDKs
    • Publish directly from application code via APIs
    • AWS Mobile SDK
    • Managed AWS sources: CloudWatch Logs, AWS IoT, Kinesis Data Analytics and more
    • RDS Aurora via Lambda

    Kinesis Agent
    • Monitors log files and forwards lines as messages to Kinesis Data Streams

    3rd party and open source
    • Log4j appender
    • Apache Kafka
    • Flume, fluentd, and more …

    Kinesis Producer Library (KPL)
    • Background process aggregates and batches messages
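
The "publish directly from application code via APIs" path on slide 9 could look roughly like the sketch below. It assumes JSON events and a hypothetical `device_id` field used as the partition key; `client` is expected to behave like `boto3.client("kinesis")`, whose `put_record` call takes `StreamName`, `Data`, and `PartitionKey`. The stream name in the usage note is made up.

```python
import json


def build_record(event, key_field="device_id"):
    """Build the Data and PartitionKey arguments for a PutRecord call.

    `key_field` is a hypothetical field name; records with the same
    partition key are routed to the same shard.
    """
    return {
        "Data": json.dumps(event).encode("utf-8"),
        "PartitionKey": str(event[key_field]),
    }


def publish(client, stream_name, event, key_field="device_id"):
    """Send one event to a stream.

    `client` is assumed to expose put_record like boto3.client("kinesis").
    """
    return client.put_record(StreamName=stream_name, **build_record(event, key_field))
```

With boto3 installed and credentials configured, a call might be `publish(boto3.client("kinesis"), "demo-stream", {"device_id": 7, "temp": 21.5})`. Passing the client in keeps the function easy to exercise with a stand-in object.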
  10. Kinesis Data Streams: Standard consumers
  11. Amazon Kinesis Data Streams. New: Lambda supports Kinesis Data Streams Enhanced Fan-Out and HTTP/2 for faster streaming

    - Enhanced fan-out allows customers to scale the number of functions reading from a stream in parallel while maintaining performance.
    - HTTP/2 data retrieval API improves data delivery speed between data producers and Lambda functions by more than 65%.
  13. The Serverless Operational Model

    - No provisioning, no management
    - Pay for value
    - Automatic scaling
    - Highly available and secure
  14. Processing a Data Stream with AWS Lambda

    Components: data producer, Kinesis Data Streams, Lambda service, Lambda function A, Lambda function B, Amazon SNS

    - Continuously stream data
    - Continuously polls for new data, 1 poll per second
    - Automatically invokes your function(s) when data found
    - Lambda polls each shard once per second, reads records in batch
    - Lambda's maximum execution time is 15 minutes
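
The batch invocation described on slide 14 can be sketched as a minimal Lambda handler. Lambda hands the function a batch under `event["Records"]`, with each payload base64-encoded in `record["kinesis"]["data"]`. That the payloads are JSON is an assumption made for this example, and the return value is illustrative only.

```python
import base64
import json


def handler(event, context):
    """Decode one batch of Kinesis records passed in by the Lambda service."""
    decoded = []
    for record in event["Records"]:
        # Kinesis payloads arrive base64-encoded inside the event.
        payload = base64.b64decode(record["kinesis"]["data"])
        # Assumption: producers wrote JSON documents.
        decoded.append(json.loads(payload))
    # A real function would act on the events here (alert, store, forward).
    return {"processed": len(decoded), "events": decoded}
```

Because the function receives a whole batch per invocation, any per-record work has to fit inside the execution time limit mentioned on the slide.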
  15. Kinesis Streaming Data Analytics: SQL or Apache Flink (Java)
  16. Kinesis Streaming Data Analytics / SQL
  17. Kinesis Streaming Data Analytics / Apache Flink

    Framework and engine for stateful processing of data streams.

    - Simple programming: Easy to use and flexible APIs make building apps fast
    - High performance: In-memory computing provides low latency & high throughput
    - Stateful Processing: Durable application state saves
    - Strong data integrity: Exactly-once processing and consistent state
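
The idea behind "durable application state" and "consistent state" on slide 17 can be illustrated with a toy example in plain Python. This is not the Flink API, and Flink's real checkpointing is distributed and far more involved. The sketch only shows why saving the state together with the input position lets a restarted job replay events without counting any of them twice. All names here are hypothetical.

```python
import copy


class KeyedCounter:
    """Running count per key, plus the input position the counts cover."""

    def __init__(self):
        self.counts = {}
        self.offset = 0  # number of input events already reflected in counts

    def process(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        self.offset += 1

    def checkpoint(self):
        """Return a snapshot of counts and offset taken at the same moment."""
        return copy.deepcopy({"counts": self.counts, "offset": self.offset})

    def restore(self, snapshot):
        """Reset the operator to a previously taken snapshot."""
        self.counts = dict(snapshot["counts"])
        self.offset = snapshot["offset"]


def replay(op, stream):
    """Feed the operator every event after the offset recorded in its state."""
    for key in stream[op.offset:]:
        op.process(key)
```

If a job checkpoints after three events, processes a fourth, and then fails, restoring the snapshot and calling `replay` on the full stream resumes at the fourth event, so the final counts match an uninterrupted run.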
  18. Kinesis Data Firehose: Ingest Transform Load (ITL)
  19. Kinesis Data Firehose: How it Works

    Ingest, Transform, Deliver

    - Sources: AWS IoT, Amazon Kinesis Agent, Amazon Kinesis Streams, Amazon CloudWatch Logs, Amazon CloudWatch Events, Apache Kafka
    - Destinations: Amazon S3, Amazon Redshift, Amazon Elasticsearch Service
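
The "Transform" stage on slide 19 is commonly implemented as a Lambda function attached to the delivery stream. A minimal sketch follows. Firehose passes records under `event["records"]`, each with a `recordId` and base64 `data`, and expects every record back with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The transform itself (upper-casing a hypothetical `level` field of a JSON payload) is only an assumption for illustration.

```python
import base64
import json


def handler(event, context):
    """Transform a batch of Firehose records and report a status for each."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            # Hypothetical transform: normalise the "level" field.
            payload["level"] = str(payload.get("level", "info")).upper()
            line = json.dumps(payload) + "\n"  # newline-delimited JSON output
            data = base64.b64encode(line.encode("utf-8")).decode("ascii")
            output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
        except (ValueError, AttributeError):
            # Not valid base64/JSON, or not a JSON object: hand it back as failed.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

Every incoming `recordId` appears exactly once in the response, which is how Firehose matches transformed records to the originals.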
  20. Kinesis Data Firehose: Record format Conversion

    Diagram labels: Data Producer, JSON data, Kinesis Data Firehose, schema, Glue Data Catalog, convert to columnar format, Amazon S3, /failed
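
The flow on slide 20 (JSON in, a schema lookup, columnar data out, failures set aside) can be sketched in plain Python. Firehose itself writes Apache Parquet or ORC using a table schema from the Glue Data Catalog. The `SCHEMA` dictionary and field names below are hypothetical stand-ins, and the function only shows the row-to-column idea, not a real file format.

```python
import json

# Hypothetical schema; in Firehose this would come from a Glue Data Catalog table.
SCHEMA = {"device_id": int, "temp": float}


def to_columns(json_lines, schema):
    """Split JSON rows into per-column lists; return (columns, failed_lines)."""
    columns = {name: [] for name in schema}
    failed = []
    for line in json_lines:
        try:
            row = json.loads(line)
            # Cast every field first so a bad row never lands partially.
            values = [cast(row[name]) for name, cast in schema.items()]
        except (ValueError, KeyError, TypeError):
            failed.append(line)  # analogous to the /failed output on the slide
            continue
        for name, value in zip(schema, values):
            columns[name].append(value)
    return columns, failed
```

Storing each field contiguously is what lets query engines read only the columns a query touches.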
  22. Demo Architecture
  24. Challenges operating Apache Kafka

    - Difficult to set up, configure and operate
    - Hard to achieve high availability
    - Tricky to scale
    - AWS integrations = development
    - No console, no visible metrics
  26. Thank you!

    frankmunz
    @frankmunz
    https://medium.com/@frank.munz (Blog)
    https://speakerdeck.com/fmunz (Slides)