Product Owner: "I need an estimate of how long it would take to put Apache Kafka in the Cloud."
Software Engineer: "Sure, Mr. Product Owner. Give me a day so I can come up with something solid to include in your plan."
Product Owner: "And be as accurate as possible, because DEV needs this for coding."
"Hahaha"
World with Apache Kafka on his laptop. Easy peasy!
Handling data with producers and consumers. Done!
Enough of this child's play: loading data from PROD.
Updating my LinkedIn profile with a new skill…
Sweet: Confluent has several pre-built Docker images.
Build a Docker image from the Hello World and run it in the Cloud.
fantastic. Gonna write this down here with blood and fire, plus recording all…"
Software Engineer: "Mr. Product Owner, I can easily put Apache Kafka in the Cloud in a couple weeks…"
"Hahaha"
interesting findings…
• Zookeeper ports cannot be exposed?
• Keeping data secure while in transit?
• Keeping data secure at rest?
• Plan IP addresses ahead to avoid overlapping?
• Enabling remote SSH access?
• Should developer authentication use BASIC AUTH or OAuth 2.0?
• Partition auto-rebalancing?
• How do I prevent people from changing the infrastructure?
• Who should own the infrastructure?
• How do I manage upgrades of the Apache Kafka binaries?
• SOC-2 Type II?
• What if I have a hybrid center of gravity?
• Avoid vendor lock-in or sell my soul to XYZ?
• Scale in or scale out? And without downtime?
1. want to build infrastructure. You want to build software.
2. Distributed Systems are hard to manage. And Cloud makes it even harder.
3. Do you really know who Bob was in this story? No? Then check it out…
Fun with Coding since 1997
• Ex-Oracle, Red Hat, IONA Technologies
• Blog: https://riferrei.net
• Twitter: @riferrei
  o Geek stuff and Apache Kafka®
  o DC/Marvel, Mindhunter Book
• Brazilian, Husband and Father
Yourself 1
• You need to build pretty much everything, except of course for Apache Kafka itself.
• Pros:
  o Ability to own and control everything.
  o Lots of learning sources out there.
• Cons:
  o Extremely complex to manage and scale.
  o High TCO and infrastructure costs.
Kubernetes 2
• Makes it transparent where Apache Kafka is running, whether On-Premise or in the Cloud.
• Pros:
  o Abstracts cluster and deployment details.
  o Tooling support: Confluent Operator, Pivotal PKS, Red Hat OpenShift, Cloud Extensions.
• Cons:
  o Underlying infrastructure is still necessary.
  o There are people and infrastructure costs.
Chef 3
• Makes it transparent where Apache Kafka is running, as well as all the infrastructure details.
• Pros:
  o Abstracts cluster and deployment details.
  o Creates an immutable, repeatable, disposable infrastructure that can be managed as code.
• Cons:
  o Vendor lock-in, provisioning complexity.
  o There are people and infrastructure costs.
Service 4
• Apache Kafka is delivered to you as SaaS, leaving more time for coding and innovation.
• Pros:
  o SLAs for performance and availability.
  o Pay-as-you-Go, predictable costs up front.
  o Multiple Cloud vendor choices = No lock-in.
• Cons:
  o You don't have control over anything.
  o Something new to learn and get good at.
Meet Confluent Cloud: a scalable streaming data service based on Apache Kafka that is delivered to you 100% as a service.
• Supported by the best Kafka engineers.
• Best hybrid center of gravity support.
• Multi-Cloud support: AWS, GCP, Others.
• Apache Kafka APIs = No proprietary stuff.
• Access to the Confluent Platform Tools.
• Professional and Enterprise Plans.
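To make the "Apache Kafka APIs = No proprietary stuff" point concrete, here is a minimal sketch of the client settings a standard Kafka client typically needs to reach a hosted cluster over SASL_SSL. The function name, the endpoint placeholder and the credential values are hypothetical. The property names follow the librdkafka convention, which is an assumption about the client you use (Java clients spell the mechanism key `sasl.mechanism`).

```python
def confluent_cloud_client_config(bootstrap_servers: str,
                                  api_key: str,
                                  api_secret: str) -> dict:
    """Build a plain Kafka client configuration (librdkafka-style keys).

    Hypothetical helper: nothing here is specific to one vendor, it is
    ordinary Kafka client configuration with TLS and SASL/PLAIN enabled.
    """
    return {
        "bootstrap.servers": bootstrap_servers,  # cluster endpoint (placeholder below)
        "security.protocol": "SASL_SSL",         # encrypt data in transit
        "sasl.mechanisms": "PLAIN",              # API key/secret authentication
        "sasl.username": api_key,
        "sasl.password": api_secret,
    }


# Placeholder values; replace with the ones issued for your own cluster.
config = confluent_cloud_client_config(
    "<your-cluster-endpoint>:9092", "<api-key>", "<api-secret>"
)
print(sorted(config))
```

A dictionary like this can usually be handed to a Kafka client library as-is, so the same producer and consumer code runs against a laptop broker or a hosted one with only the configuration changed.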
Confluent Platform ecosystem in a snap with your cluster?
• Say hello to Confluent Cloud Tools:
• https://github.com/confluentinc/ccloud-tools
• Creates a highly available, Multi-AZ set of tools, fully secured within a VPC, that is automatically connected to your Confluent Cloud cluster.
[Architecture diagram: Confluent Cloud Tools in a VPC, Multi AZ for HA]
• Private subnets: Schema Registry 1, REST Proxy 1, KSQL Server 1
• Availability Zone 2, private subnets: Schema Registry 2, REST Proxy 2, KSQL Server 2
• Public subnet, Bastion Server
• Connectivity: Public Internet OR VPC Peering
• Scaling out across AZs: Schema Registry, REST Proxy, KSQL Server
Stream of Events (Motion X, Y, Z over Time)

Name    X      Y      Z
Alice   -45    0      12
Bob     1      -185   -90
John    0      -1     90
Steve   -180   0      180

Numbers Table

Number  X      Y      Z
1       1      0      0
2       1      -90    1
3       -180   0      180
4       1      90     -1

Getting the Number 3
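The idea of "getting the number 3" can be sketched as a lookup of each incoming event against the numbers table. This is only an illustration in plain Python, not the implementation used in the talk: the function name `first_match` is hypothetical, and matching by exact equality of the (X, Y, Z) values is an assumption.

```python
from typing import Dict, Iterable, Optional, Tuple

Motion = Tuple[int, int, int]

# Numbers table from the slide: number -> (x, y, z)
NUMBERS: Dict[int, Motion] = {
    1: (1, 0, 0),
    2: (1, -90, 1),
    3: (-180, 0, 180),
    4: (1, 90, -1),
}

# Stream of events from the slide, in arrival order: (name, x, y, z)
EVENTS = [
    ("Alice", -45, 0, 12),
    ("Bob", 1, -185, -90),
    ("John", 0, -1, 90),
    ("Steve", -180, 0, 180),
]


def first_match(events: Iterable[Tuple[str, int, int, int]],
                numbers: Dict[int, Motion],
                target: int) -> Optional[str]:
    """Return the name on the first event whose motion equals the target row.

    Assumption: an event matches only on exact equality of (x, y, z).
    """
    wanted = numbers[target]
    for name, x, y, z in events:
        if (x, y, z) == wanted:
            return name
    return None  # nobody has produced the target number yet


print(first_match(EVENTS, NUMBERS, 3))  # prints "Steve"
```

With the data shown on the slide, Steve's event is the only one whose values equal the row for number 3, so he is the first to get it.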