Slide 1

Slide 1 text

SUMMIT
Designing Less Surreal Architectures with Apache Kafka in AWS
BERSUM19-78
Dr Frank Munz, Senior Technical Evangelist, Amazon Web Services
@frankmunz
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Slide 2

Slide 2 text

Introductory - 200
“These sessions provide an overview of AWS services and features, and they assume that attendees are new to the topic. These sessions highlight basic use cases, features, functions, and benefits.”

Slide 3

Slide 3 text

Agenda
• Streaming Data
• Modern Streaming Architectures
• Apache Kafka
• Amazon Managed Streaming for Kafka (MSK)
• Apache Kafka or Amazon Kinesis?
• Q & A

Slide 4

Slide 4 text


Slide 5

Slide 5 text

Streaming Data
• Web Clickstream
• Application Logs
• IoT Sensors
[Wed Oct 11 14:32:52 2018] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test
Continuously generated, small events, low latency requirements

Slide 6

Slide 6 text

Transform and Process Continuously
Streaming
• Ingest video & data as it’s generated
• Process data on the fly
• Real-time analytics/ML, alerts, actions

Slide 7

Slide 7 text

Timely Decisions
Data loses value quickly over time
[Figure: Information half-life in decision-making. Vertical axis: value of data to decision-making. Horizontal axis: time (real time, seconds, minutes, hours, days, months). Labels along the curve: preventive/predictive, actionable, reactive, historical. Annotations: time critical decisions; traditional “batch” business intelligence.]
Source: Perishable insights, Mike Gualtieri, Forrester

Slide 8

Slide 8 text


Slide 9

Slide 9 text

From Batch to Streaming Analytics
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying

Slide 10

Slide 10 text

How Kafka Started: LinkedIn
• Reduced Complexity
• Decoupling

Slide 11

Slide 11 text

Better Decoupled Microservices
• Event Sourcing: time-ordered, processable events
• CQRS: separates read (query) from write (command) operations. Writes are event sourced.
• Choreography
[Figure labels: choreography, orchestration]
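To make the event sourcing and CQRS bullets concrete, here is a minimal in-memory sketch. It is not from the talk: the `Event` fields, the `EventStore` class, and the account example are hypothetical, and a real system would keep the events in a Kafka topic rather than a Python list.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    """One immutable, ordered fact (hypothetical shape for illustration)."""
    sequence: int
    account: str
    amount: int  # positive = deposit, negative = withdrawal


@dataclass
class EventStore:
    """Write side: commands only ever append events, nothing is updated in place."""
    events: List[Event] = field(default_factory=list)

    def append(self, account: str, amount: int) -> Event:
        event = Event(sequence=len(self.events), account=account, amount=amount)
        self.events.append(event)
        return event


def project_balances(events: List[Event]) -> Dict[str, int]:
    """Read side: a query model rebuilt by replaying the events in order."""
    balances: Dict[str, int] = {}
    for event in events:
        balances[event.account] = balances.get(event.account, 0) + event.amount
    return balances


store = EventStore()
store.append("alice", 100)
store.append("bob", 50)
store.append("alice", -30)

balances = project_balances(store.events)
print(balances)
```

The read model can be thrown away and rebuilt at any time by replaying the stored events, which is the property that makes a durable log attractive as the write side.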

Slide 12

Slide 12 text

Kafka as Data or Event Store
log.retention.hours = -1
https://www.confluent.io/blog/publishing-apache-kafka-new-york-times/
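A rough sketch of what a negative retention setting means for time-based cleanup. The function name and its parameters are hypothetical, and the model is simplified: Kafka decides per log segment, and size-based retention is a separate setting that this sketch ignores.

```python
def is_expired(segment_age_hours: float, retention_hours: int) -> bool:
    """Return True if a log segment is old enough for time-based deletion.

    Simplified model: a negative retention value is treated as
    "no time limit", so nothing ever expires by age.
    """
    if retention_hours < 0:
        return False
    return segment_age_hours > retention_hours


# With log.retention.hours = -1, even very old data is kept.
keep_forever = is_expired(segment_age_hours=10_000, retention_hours=-1)
# With a 168-hour (7-day) retention, a 200-hour-old segment is eligible for deletion.
one_week = is_expired(segment_age_hours=200, retention_hours=168)
print(keep_forever, one_week)
```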

Slide 13

Slide 13 text


Slide 14

Slide 14 text

Commit Log
[Figure: Topic A shown as a log of messages at offsets 0, 1, 2, 3, 4, … n, ordered from old to new. A producer writes at the new end; Consumer A and Consumer B each read from the log.]
https://www.quora.com/Kafka-writes-every-message-to-broker-disk-Still-performance-wise-it-is-better-than-some-of-the-in-memory-message-storing-message-queues-Why-is-that
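The commit log picture can be sketched in a few lines. This is an in-memory toy, not a Kafka client: the `CommitLog` class and the consumer names are hypothetical, and real Kafka tracks read positions per consumer group rather than in the log object itself.

```python
from typing import Dict, List


class CommitLog:
    """Append-only list of messages; the list index plays the role of the offset."""

    def __init__(self) -> None:
        self._messages: List[str] = []
        self._positions: Dict[str, int] = {}  # next offset to read, per consumer

    def append(self, message: str) -> int:
        """Producer side: add a message at the new end and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, consumer: str, max_messages: int = 10) -> List[str]:
        """Consumer side: read from this consumer's position and advance it."""
        start = self._positions.get(consumer, 0)
        batch = self._messages[start:start + max_messages]
        self._positions[consumer] = start + len(batch)
        return batch


log = CommitLog()
offsets = [log.append(m) for m in ["m0", "m1", "m2"]]

first_a = log.poll("consumer-a", max_messages=2)  # reads offsets 0 and 1
first_b = log.poll("consumer-b")                  # independent position, reads 0 to 2
second_a = log.poll("consumer-a")                 # continues at offset 2
print(offsets, first_a, first_b, second_a)
```

Reading does not remove anything from the log, so the two consumers progress independently and either one could be rewound to an earlier offset.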

Slide 15

Slide 15 text

Partitioned, Replicated Commit Log
[Figure: a cluster holding TopicA Partition1, TopicA Partition2, and TopicA Partition3, each with replicas. A producer writes to the partitions; three Zookeeper nodes hold state & config.]
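A small sketch of the two ideas in the diagram: a record key is hashed to choose a partition, and each partition is copied to several brokers. The broker names and the round-robin placement are assumptions for illustration, and `zlib.crc32` is only a stand-in hash; Kafka's own default partitioner uses a different hash function.

```python
import zlib
from typing import Dict, List

NUM_PARTITIONS = 3
BROKERS = ["broker-1", "broker-2", "broker-3"]  # hypothetical broker names
REPLICATION_FACTOR = 3


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition; equal keys always land in the same one."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def replica_assignment(
    num_partitions: int, brokers: List[str], replication_factor: int
) -> Dict[int, List[str]]:
    """Place each partition's replicas on consecutive brokers, round-robin.

    The first broker in each list is treated as the partition leader.
    """
    return {
        partition: [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
        for partition in range(num_partitions)
    }


assignment = replica_assignment(NUM_PARTITIONS, BROKERS, REPLICATION_FACTOR)
for partition, replicas in assignment.items():
    print(f"TopicA partition {partition}: leader={replicas[0]} replicas={replicas}")
print("sensor-42 ->", partition_for("sensor-42"))
```

Because all records with the same key go to one partition, ordering is preserved per key, not across the whole topic.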

Slide 16

Slide 16 text

Challenges Operating Apache Kafka
• Difficult to set up, configure, and operate
• Hard to achieve high availability
• Tricky to scale
• AWS integrations
• No console, no visible metrics

Slide 17

Slide 17 text

How to run Apache Kafka on AWS?
• Self-managed on EC2
• Amazon Managed Streaming for Kafka (this talk!)
• On top of Kubernetes, e.g. as K8s operator

Slide 18

Slide 18 text


Slide 19

Slide 19 text

Why Amazon MSK?

Slide 20

Slide 20 text

Apache Zookeeper (ZK)?
• Zookeeper runs under the hood
• ZK is set up highly available
• No additional cost

Slide 21

Slide 21 text

Getting started with Amazon MSK Preview is easy

Slide 22

Slide 22 text

Amazon Managed Streaming for Kafka
[Figure: an MSK VPC spanning Availability Zone 1, Availability Zone 2, and Availability Zone 3, annotated with “What you do:” and “What we do…”.]
• Control Plane: Zookeeper Instances
• Data Plane: Broker Instances
• Use Zookeeper Connect String for clients: 172.31.4.240:2181,172.31.44.125:2181,172.31.20.136:2181
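As a small illustration of the connect string format, the sketch below splits it into host and port pairs, for example to check reachability before starting a client. The function name is hypothetical, and the sketch assumes the plain `host:port,host:port` form shown on the slide with no path suffix.

```python
from typing import List, Tuple


def parse_connect_string(connect: str) -> List[Tuple[str, int]]:
    """Split a comma-separated 'host:port' list into (host, port) tuples."""
    nodes: List[Tuple[str, int]] = []
    for entry in connect.split(","):
        # rpartition splits on the last colon, leaving the host on the left.
        host, _, port = entry.strip().rpartition(":")
        nodes.append((host, int(port)))
    return nodes


nodes = parse_connect_string(
    "172.31.4.240:2181,172.31.44.125:2181,172.31.20.136:2181"
)
for host, port in nodes:
    print(f"Zookeeper node {host} on port {port}")
```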

Slide 23

Slide 23 text

Plans for MSK
Planned for MSK Global Availability
• Service level agreement (SLA)
• Version upgrades
• Scale a cluster horizontally & vertically
• Supports Apache Kafka partition reassignment tooling
• Define custom cluster configurations
• Auto scale storage
• Deeper AWS integration: Tagging, AWS CloudTrail, AWS CloudFormation
Already in MSK Preview
• Apache Kafka 2.1 (or 1.1.1)
• 3 Regions: N. Virginia, Ohio, Ireland
• Console and API provisioning
• M5 Broker with GP2 Storage
• Amazon CloudWatch, VPC, IAM and KMS
• Auto-healing
• Patches applied automatically

Slide 24

Slide 24 text

Amazon MSK Defaults
Config                                      Default Setting
offsets.topic.replication.factor            3
transaction.state.log.replication.factor    3
transaction.state.log.min.isr               2
auto.create.topics.enable                   False
default.replication.factor                  3
min.insync.replicas                         2
unclean.leader.election.enable              True
auto.leader.rebalance.enable                True
authorizer.class.name                       kafka.security.auth.SimpleAclAuthorizer
group.initial.rebalance.delay.ms            3000
log.retention.hours                         168
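One way to read the replication defaults together: with a replication factor of 3 and `min.insync.replicas` of 2, a producer that waits for all in-sync replicas (`acks=all`) can keep writing while one replica is out of sync, but not two. The sketch below encodes only that arithmetic; the function names are hypothetical.

```python
def can_accept_write(in_sync_replicas: int, min_insync_replicas: int = 2) -> bool:
    """An acks=all write succeeds only if enough replicas are currently in sync."""
    return in_sync_replicas >= min_insync_replicas


def tolerated_failures(replication_factor: int = 3, min_insync_replicas: int = 2) -> int:
    """How many replicas may fall out of sync before acks=all writes are rejected."""
    return replication_factor - min_insync_replicas


print("replicas that may fail:", tolerated_failures())
for in_sync in (3, 2, 1):
    print(f"{in_sync} in sync -> write accepted: {can_accept_write(in_sync)}")
```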

Slide 25

Slide 25 text

MSK Pricing
On-demand, hourly pricing for broker and storage, prorated to the second:
• kafka.m5.large: $0.21/hr
• Storage: $0.10 per GB-month
You don’t pay for the number of topics, replication traffic, or ZK.
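A back-of-the-envelope estimate using the two prices on the slide. The cluster size (3 brokers), the provisioned storage (300 GB in total), and the 730-hour month are assumptions chosen for illustration, not figures from the talk; current prices may differ.

```python
from decimal import Decimal

BROKER_PRICE_PER_HOUR = Decimal("0.21")       # kafka.m5.large, from the slide
STORAGE_PRICE_PER_GB_MONTH = Decimal("0.10")  # from the slide
HOURS_PER_MONTH = 730                         # assumption: average month length


def monthly_cost(brokers: int, total_storage_gb: int) -> Decimal:
    """Estimate the monthly bill: broker hours plus provisioned storage."""
    broker_cost = BROKER_PRICE_PER_HOUR * brokers * HOURS_PER_MONTH
    storage_cost = STORAGE_PRICE_PER_GB_MONTH * total_storage_gb
    return broker_cost + storage_cost


# Hypothetical cluster: 3 brokers with 300 GB of storage provisioned in total.
estimate = monthly_cost(brokers=3, total_storage_gb=300)
print(f"Estimated monthly cost: ${estimate}")
```

Topics, replication traffic, and Zookeeper do not appear in the formula because, per the slide, they are not billed.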

Slide 26

Slide 26 text

Comparing Amazon Kinesis Data Streams to MSK
[Figure: two parallel diagrams. Amazon Kinesis Data Streams: a stream with 3 shards (Shard 1, Shard 2, Shard 3), each a numbered sequence of records running from oldest data to newest data, with writes from producers. Amazon MSK: a topic with 3 partitions (Partition 1, Partition 2, Partition 3), drawn the same way.]

Slide 27

Slide 27 text

Amazon Kinesis Data Streams
• AWS API
• Throughput Provisioning Model
• Seamless Scaling
• Deep AWS Integration
• Retention Time 1d (max 7d)
Apache Kafka
• Open-Source
• Cluster Provisioning Model
• Scaling not seamless to client
• Retention 7d (max is unlimited)
• Strong 3rd Party Tooling

Slide 28

Slide 28 text

Conclusion
• Streaming is about actionable data
• Apache Kafka is an open-source, versatile, and popular streaming platform
• Managed Streaming for Kafka (MSK): We run Apache Kafka for you
• Go build with MSK or Kinesis

Slide 29

Slide 29 text

Berlin Summit 2019 in 2 Minutes https://www.youtube.com/watch?v=1rkfOGKF6wQ

Slide 30

Slide 30 text

Thank you!
@frankmunz