Slide 1

Slide 1 text

© 2019, Amazon Web Services, Inc. or its Affiliates.
Apache Kafka on AWS
Amazon Managed Streaming for Apache Kafka
Dr. Frank Munz, Senior Technical Evangelist, Amazon Web Services
@frankmunz

Slide 2

Slide 2 text

About me
• Software Architect / DevOps Engineer
• Technical Evangelist @ AWS
• Published an AWS book
• Containers, serverless and a sprinkle of ML & big / fast data
@frankmunz

Slide 3

Slide 3 text

Table of contents
• Streaming Data
• Modern Streaming Architectures
• Apache Kafka
• Amazon Managed Streaming for Apache Kafka (MSK)
• Apache Kafka or Amazon Kinesis?
• Q & A

Slide 4

Slide 4 text

Streaming Data

Slide 5

Slide 5 text

Streaming Data
• Web Clickstream
• Application Logs, e.g. [Wed Oct 11 14:32:52 2018] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test
• IoT Sensors
Continuously generated, small events, low-latency requirements

Slide 6

Slide 6 text

Timely Decisions
Data loses value quickly over time
[Figure: Information half-life in decision-making. Time axis: real time, seconds, minutes, hours, days, months. Value axis: value of data to decision-making. Stages along the curve: preventive/predictive, actionable, reactive, historical. Regions: time-critical decisions; traditional “batch” business intelligence.]
Source: Perishable insights, Mike Gualtieri, Forrester

Slide 7

Slide 7 text

Less Surreal, Modern Architectures

Slide 8

Slide 8 text

How Kafka Started: LinkedIn
• Reduced Complexity
• Decoupling

Slide 9

Slide 9 text

Better Decoupling: Microservices
• Event Sourcing: time-ordered, processable events
• CQRS: separates read (query) from write (command) operations. Writes are event sourced.
• Choreography [Figure labels: choreography, orchestration]
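To make the event sourcing and CQRS bullets concrete, here is a minimal Python sketch. The account domain, the event names, and the `AccountWriteModel` and `balance_view` helpers are hypothetical and not from the talk; a plain in-memory list stands in for a Kafka topic.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """An immutable fact; its position in the log gives the time order."""
    kind: str    # hypothetical event types: "deposited" or "withdrawn"
    amount: int


def balance_view(log: List[Event]) -> int:
    """Query side: a read model rebuilt by replaying the events in order."""
    balance = 0
    for event in log:
        if event.kind == "deposited":
            balance += event.amount
        else:
            balance -= event.amount
    return balance


class AccountWriteModel:
    """Command side: validates a command, then appends an event."""

    def __init__(self) -> None:
        self.log: List[Event] = []  # stand-in for a Kafka topic

    def deposit(self, amount: int) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.log.append(Event("deposited", amount))

    def withdraw(self, amount: int) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > balance_view(self.log):
            raise ValueError("insufficient funds")
        self.log.append(Event("withdrawn", amount))


account = AccountWriteModel()
account.deposit(100)
account.withdraw(30)
print(balance_view(account.log))  # 70
```

State is never updated in place: the write side only appends events, and any number of read models can be derived from the same log.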

Slide 10

Slide 10 text

Kafka as Data or Event Store
log.retention.hours = -1
https://www.confluent.io/blog/publishing-apache-kafka-new-york-times/
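The effect of a negative retention setting can be illustrated with a toy model. This is only a sketch under stated assumptions: `apply_retention` is a hypothetical helper, and it filters individual records, whereas a real broker deletes whole log segments.

```python
from typing import List, Tuple

HOUR_MS = 60 * 60 * 1000

Record = Tuple[int, str]  # (timestamp in ms, value); hypothetical record shape


def apply_retention(records: List[Record], now_ms: int, retention_hours: int) -> List[Record]:
    """Return the records that survive time-based retention.

    A negative retention (as in log.retention.hours = -1) is treated as
    "no time limit", so every record is kept.
    """
    if retention_hours < 0:
        return list(records)
    cutoff_ms = now_ms - retention_hours * HOUR_MS
    return [record for record in records if record[0] >= cutoff_ms]


records = [(0, "old"), (10 * HOUR_MS, "recent")]
now = 12 * HOUR_MS
kept_with_limit = apply_retention(records, now, retention_hours=5)
kept_forever = apply_retention(records, now, retention_hours=-1)
```

With a 5-hour limit only the recent record survives; with -1 both records stay, which is what allows a topic to act as a long-term event store.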

Slide 11

Slide 11 text

Apache Kafka

Slide 12

Slide 12 text

Commit Log
[Figure: Topic A drawn as a commit log of messages at offsets 0 1 2 3 4 … n, ordered from old to new. Labels: Message, Offset, Producer, Consumer A, Consumer B.]
https://www.quora.com/Kafka-writes-every-message-to-broker-disk-Still-performance-wise-it-is-better-than-some-of-the-in-memory-message-storing-message-queues-Why-is-that
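The commit-log picture can be sketched as a small in-memory model. The `CommitLog` class and the consumer names are hypothetical; for brevity the toy keeps each consumer's position inside the log object, which is a simplification of how Kafka tracks consumer offsets.

```python
from typing import Dict, List


class CommitLog:
    """Append-only log: messages are never modified, only appended."""

    def __init__(self) -> None:
        self._messages: List[str] = []
        self._positions: Dict[str, int] = {}  # next offset to read, per consumer

    def append(self, message: str) -> int:
        """Producer side: append a message and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, consumer: str, max_messages: int = 10) -> List[str]:
        """Consumer side: read from this consumer's own position onward."""
        start = self._positions.get(consumer, 0)
        batch = self._messages[start:start + max_messages]
        self._positions[consumer] = start + len(batch)
        return batch


log = CommitLog()
for message in ["m0", "m1", "m2"]:
    log.append(message)

a_first = log.poll("consumer-a", max_messages=2)  # ["m0", "m1"]
b_all = log.poll("consumer-b")                    # ["m0", "m1", "m2"]
a_rest = log.poll("consumer-a")                   # ["m2"]
```

Reading does not remove anything from the log, so Consumer A and Consumer B progress independently over the same messages.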

Slide 13

Slide 13 text

Partitioned, Replicated Commit Log
[Figure: a producer writes to a cluster that holds TopicA Partition1, TopicA Partition2 and TopicA Partition3, each with a replica. Three ZooKeeper nodes hold state and config.]
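A rough sketch of how keys map to partitions and partitions to brokers follows. The broker names, the replication factor of 2, and the round-robin replica placement are assumptions for illustration; CRC32 is a stand-in hash, not the hash used by Kafka's default partitioner.

```python
import zlib
from typing import List

NUM_PARTITIONS = 3        # matches the three partitions of TopicA on the slide
REPLICATION_FACTOR = 2    # assumption: one leader plus one replica
BROKERS = ["broker-1", "broker-2", "broker-3"]  # hypothetical broker names


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Pick a partition from the key, so equal keys land in the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def replica_brokers(partition: int,
                    brokers: List[str] = BROKERS,
                    replication_factor: int = REPLICATION_FACTOR) -> List[str]:
    """Place copies of a partition on consecutive brokers (simplified round-robin)."""
    return [brokers[(partition + i) % len(brokers)] for i in range(replication_factor)]


for key in ["user-1", "user-2", "user-1"]:
    p = partition_for(key)
    print(key, "-> partition", p, "on", replica_brokers(p))
```

Ordering is preserved per partition, so keying by an entity ID keeps all events for that entity in order while the topic as a whole spreads across brokers.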

Slide 14

Slide 14 text

Challenges Operating Apache Kafka
• Difficult to set up, configure and operate
• Hard to achieve high availability
• Tricky to scale
• AWS integrations
• No console, no visible metrics
• Operational experience

Slide 15

Slide 15 text

How to run Apache Kafka on AWS?
• Self-managed on EC2
• Amazon Managed Streaming for Apache Kafka (this talk!)
• On top of Kubernetes, e.g. as a K8s operator

Slide 16

Slide 16 text

Amazon Managed Streaming for Apache Kafka (MSK)

Slide 17

Slide 17 text

Apache ZooKeeper (ZK)?
• ZooKeeper runs under the hood
• ZK is set up highly available
• No additional cost

Slide 18

Slide 18 text

Getting started with Amazon MSK is easy!

Slide 19

Slide 19 text

Security
Data is always encrypted at rest and can be encrypted in transit.

Slide 20

Slide 20 text

Cluster-Wide Storage Scaling
You can increase storage after creation, but you cannot decrease it.
aws kafka update-broker-storage --cluster-arn ClusterArn --current-version Current-Cluster-Version --target-broker-ebs-volume-info '{"KafkaBrokerNodeId": "All", "VolumeSizeGB": Target-Volume-in-GiB}'
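The command above can be assembled programmatically. The sketch below only builds the argument list shown on the slide and does not call AWS; `update_broker_storage_args` is a hypothetical helper, and the ARN and version values in the usage lines are placeholders. The increase-only check is a client-side guard added for illustration.

```python
import json
from typing import List


def update_broker_storage_args(cluster_arn: str,
                               current_version: str,
                               current_size_gib: int,
                               target_size_gib: int) -> List[str]:
    """Build the argv for `aws kafka update-broker-storage` as shown on the slide."""
    if target_size_gib <= current_size_gib:
        # Storage can be increased after creation, but not decreased.
        raise ValueError("target size must be larger than the current size")
    volume_info = {"KafkaBrokerNodeId": "All", "VolumeSizeGB": target_size_gib}
    return [
        "aws", "kafka", "update-broker-storage",
        "--cluster-arn", cluster_arn,
        "--current-version", current_version,
        "--target-broker-ebs-volume-info", json.dumps(volume_info),
    ]


# Placeholder values; substitute the real cluster ARN and current cluster version.
args = update_broker_storage_args("ClusterArn", "Current-Cluster-Version", 100, 200)
print(" ".join(args))
```

Passing the list to `subprocess.run` would avoid shell quoting problems with the JSON argument, assuming the AWS CLI is installed and configured.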

Slide 21

Slide 21 text

CloudFormation Support for MSK
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-msk-cluster.html

Slide 22

Slide 22 text

CloudWatch Integration
MSK monitoring levels: DEFAULT, PER_BROKER, or PER_TOPIC_PER_BROKER
https://docs.aws.amazon.com/msk/latest/developerguide/monitoring.html

Slide 23

Slide 23 text

Custom Configuration Option
• Default configuration for brokers, topics, and Apache ZooKeeper nodes
• You can create custom configurations and use them for cluster creation

Slide 24

Slide 24 text

MSK Pricing
On-demand, hourly pricing for broker and storage, prorated to the second:
• Broker (kafka.m5.large): $0.21/hr
• Storage: $0.10 per GB-month
You don’t pay for the number of topics, replication traffic, or ZK.
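As a worked example of the pricing model, the sketch below estimates a monthly bill from the two prices quoted on the slide. The cluster size, the storage per broker, and the 730-hour month are assumptions; the prices are the 2019 slide values and may not match current pricing, and other charges such as data transfer are not modeled.

```python
BROKER_PRICE_PER_HOUR = 0.21        # kafka.m5.large, from the slide
STORAGE_PRICE_PER_GB_MONTH = 0.10   # from the slide
HOURS_PER_MONTH = 730               # assumption: average hours in a month


def monthly_cost(brokers: int, storage_gb_per_broker: int) -> float:
    """Estimate the monthly cost in USD for brokers plus provisioned storage."""
    broker_cost = brokers * BROKER_PRICE_PER_HOUR * HOURS_PER_MONTH
    storage_cost = brokers * storage_gb_per_broker * STORAGE_PRICE_PER_GB_MONTH
    return round(broker_cost + storage_cost, 2)


# Hypothetical cluster: 3 brokers with 1000 GB each.
# Brokers: 3 * 0.21 * 730 = 459.90; storage: 3 * 1000 * 0.10 = 300.00
estimate = monthly_cost(brokers=3, storage_gb_per_broker=1000)
print(estimate)  # 759.9
```

Topics, replication traffic, and ZooKeeper nodes do not appear in the formula because, per the slide, they are not billed.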

Slide 25

Slide 25 text

Amazon Kinesis or Managed Streaming for Apache Kafka?

Slide 26

Slide 26 text

Amazon Kinesis
Real-time data streaming and analytics
Easily collect, process, and analyze streams in real time
• Kinesis Video Streams: Capture, process, and store video streams for analytics
• Kinesis Data Streams: Build custom applications that analyze data streams
• Kinesis Data Firehose: Load data streams into AWS data stores
• Kinesis Data Analytics: Analyze data streams with SQL or Java
[Slide badge: NEW!]

Slide 27

Slide 27 text

Comparing Amazon Kinesis Data Streams to MSK
• Amazon Kinesis Data Streams: a stream with 3 shards (Shard 1, Shard 2, Shard 3). Producers write to the shards; records in each shard are numbered from 0, from oldest data to newest data.
• Amazon MSK: a topic with 3 partitions (Partition 1, Partition 2, Partition 3). Producers write to the partitions; records in each partition are numbered from 0, from oldest data to newest data.

Slide 28

Slide 28 text

Amazon Kinesis Data Streams
• AWS API
• Throughput Provisioning Model
• Seamless Scaling
• Deep AWS Integration
• Retention Time 1d (max 7d)
Apache Kafka
• Open-Source
• Cluster Provisioning Model
• Scaling not seamless to client
• Retention 7d (max is unlimited)
• Strong 3rd party tooling

Slide 29

Slide 29 text

Conclusion
• Streaming is about actionable data
• Apache Kafka is an open-source, versatile, and popular streaming platform
• Managed Streaming for Kafka (MSK): We run Apache Kafka for you
• Go build with MSK or Kinesis

Slide 30

Slide 30 text

Additional Resources
bit.ly/aws-kafka

Slide 31

Slide 31 text

Thank you!
frankmunz
@frankmunz
https://medium.com/@frank.munz (Blog)
https://speakerdeck.com/fmunz (Slides)