4 Different Ways of Working with Kafka on Azure @ Global Azure 2021

FOUR Different Ways of Working with Kafka on Azure @hpgrahsl
| @Azure #GlobalAzure, 16th April 2021, Linz - Austria

Diminishing Value of Data

Hans-Peter Grahsl
• based in Graz, Austria
• technical trainer at NETCONOMY
• independent engineer & consultant
• Confluent Community Catalyst
• MongoDB Champion

Stream Processing

"... data processing that is designed with infinite data sets in mind."
-- Tyler Akidau

• messaging
• integration
• processing plus storage

central nervous system

Kafka with Azure HDInsight

HDInsight Services "Family"
• large-scale parallel batch processing
• general purpose data warehousing
• stream processing for IoT
• data science & machine learning

HDInsight Apache Kafka®
• broker + zookeeper nodes
• managed disks / storage
• flexible provisioning
• 99.9% SLA uptime

Client Access?

Client Access? YES:
• when run in same VNet
• with VNet peering + IP advertising
• from on-premises with VPN gateway
• by using Kafka REST proxy

Apache Kafka® HDInsight
• main benefits:
✅ easy provisioning with flexible pricing
✅ open-source Kafka components only
✅ supported by Microsoft SLAs

Apache Kafka® HDInsight
• main drawbacks:
⛔ outdated version (Kafka 2.1.1)
⛔ only "core" Kafka components
⛔ per default no external broker access

Azure Event Hubs for Kafka

Azure Event Hubs
• fully-managed PaaS
• distributed event ingestion service
• supports auto-scaling capabilities
• well-integrated with complementary services

The Big Picture

Look-alikes
"Conceptually, Kafka and Event Hubs are very similar: they're both partitioned logs built for streaming data, whereby the client controls which part of the retained log it wants to read."

Event Hubs for Kafka
• overlay on top of Event Hubs
• protocol compatible with Kafka 1.0+
• transparent re-use (code + tools)
• migration benefits in both ways

same same but different

The Virtual Promise...
"Update the connection string in configurations to point to the Kafka endpoint exposed by your event hub instead of pointing to your Kafka cluster. Then, you can start streaming events from your applications that use the Kafka protocol into Event Hubs."
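
To make the quoted promise concrete, here is a minimal sketch of what "updating the connection string" can look like for a Kafka client. It assumes a librdkafka-style client (property names `sasl.username` / `sasl.password`); Java clients express the same credentials via `sasl.jaas.config` instead. The function name `event_hubs_kafka_config` and the sample namespace are hypothetical and not part of the talk.

```python
from urllib.parse import urlparse


def event_hubs_kafka_config(connection_string: str) -> dict:
    """Derive Kafka client properties from an Event Hubs connection string.

    Assumption: librdkafka-style property names are used.
    """
    # The connection string is a list of "Key=Value" pairs separated by ';'.
    # Split on the first '=' only, because base64 keys may end in '='.
    parts = dict(
        segment.split("=", 1)
        for segment in connection_string.strip().split(";")
        if segment
    )
    # "Endpoint" looks like sb://<namespace>.servicebus.windows.net/
    host = urlparse(parts["Endpoint"]).hostname
    return {
        # The Kafka endpoint of an Event Hubs namespace listens on port 9093.
        "bootstrap.servers": f"{host}:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The literal user name "$ConnectionString" selects connection-string auth.
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
    }
```

The resulting dictionary would be passed to the producer or consumer constructor in place of the properties that previously pointed at a Kafka cluster; the application code itself stays unchanged.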

The devil is in the details

Unsupported Kafka Features
• idempotent producers & transactions
• compression of messages
• size-based retention or log compaction
• HTTP access via Kafka REST proxy
• Kafka Streams & ksqlDB connections
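
A rough pre-migration check can help spot settings that rely on the features listed above. The sketch below maps some of them to standard Kafka configuration keys (`enable.idempotence`, `transactional.id`, `compression.type`, `cleanup.policy`, `retention.bytes`). It reflects the slide's list as of April 2021 rather than the current service state, only looks at explicitly set values, and the function name `find_unsupported_settings` is hypothetical.

```python
def find_unsupported_settings(client_config: dict, topic_config: dict = None) -> list:
    """Return the slide's unsupported features that the given configs would need.

    Assumption: only explicitly set values are inspected; client-library
    defaults (which differ between versions) are not considered.
    """
    topic_config = topic_config or {}
    findings = []

    # Producer-side features
    if str(client_config.get("enable.idempotence", "false")).lower() == "true":
        findings.append("idempotent producer (enable.idempotence)")
    if client_config.get("transactional.id"):
        findings.append("transactions (transactional.id)")
    if str(client_config.get("compression.type", "none")).lower() != "none":
        findings.append("message compression (compression.type)")

    # Topic-side features
    if "compact" in str(topic_config.get("cleanup.policy", "delete")):
        findings.append("log compaction (cleanup.policy)")
    if int(topic_config.get("retention.bytes", -1)) != -1:
        findings.append("size-based retention (retention.bytes)")

    return findings
```

An empty result does not prove compatibility; Kafka Streams / ksqlDB usage and REST proxy access, for example, cannot be detected from these settings alone.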

Customer Feedback
• 10 hubs (=topics) per namespace https://bit.ly/3dvQCA1
• 1 MB message size limit https://bit.ly/3sQDlIN
• no Kafka Streams / ksqlDB connections https://bit.ly/3mi4hyu https://bit.ly/39Huu4s
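
The two numeric limits from this slide lend themselves to a simple pre-flight check. This is only a sketch: reading "1 MB" as 10^6 bytes is an assumption, actual quotas depend on the Event Hubs tier and may have changed since the talk, and the names `MAX_EVENT_BYTES`, `MAX_HUBS_PER_NAMESPACE` and `fits_limits` are hypothetical.

```python
# Limits as quoted on the slide (April 2021).
MAX_EVENT_BYTES = 1_000_000      # assumption: "1 MB" interpreted as 10^6 bytes
MAX_HUBS_PER_NAMESPACE = 10      # hubs (= topics) per namespace


def fits_limits(payload: bytes, planned_topics: int) -> dict:
    """Check a serialized message and a planned topic count against the limits."""
    return {
        "message_size_ok": len(payload) <= MAX_EVENT_BYTES,
        "topic_count_ok": planned_topics <= MAX_HUBS_PER_NAMESPACE,
    }
```

Running such a check against representative payloads and the existing topic list gives an early hint whether these limits are "show-stoppers" for a given workload.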

Event Hubs for Kafka
• main benefits:
✅ hybrid messaging scenarios OOTB
✅ auto-inflate for elastic scaling
✅ "Azure-native & Kafka-like" experience

Event Hubs for Kafka
• main drawbacks:
⛔ fundamental Kafka (protocol) features missing
⛔ selected quotas & limits ➜ show-stoppers?
⛔ Kafka Streams / ksqlDB clients unsupported

Confluent Cloud on Azure

Confluent Cloud
• most complete and versatile service
• cloud-native with elastic scalability
• ready for hybrid & multi-cloud

Confluent Cloud hosts fully-managed:
• Kafka Connect
• 100+ Connectors
• ksqlDB
• Schema Registry

Tiered Storage
• currently unique to Confluent Cloud
• infinite data growth
• retention time unlimited
• BUT NO Azure Blob Storage yet

provisioning options

Confluent Cloud on Azure
• main benefits:
✅ fully-managed Kafka by its original creators
✅ ready for hybrid- / multi-cloud
✅ widest & smoothest ecosystem integration

Confluent Cloud on Azure
• main drawbacks:
⛔ compare pricing ➜ not cheap
⛔ underlying infra not customizable
⛔ higher degree of vendor dependence

Kafka on Kubernetes

Kubernetes
• open-source container orchestration
• deploying / managing / scaling
• CNCF graduate project

AKS Azure Kubernetes Service

remaining challenges:
• Network
• Storage
• Security

• Operators (cluster / topic / user)
• Kafka Connect + managed Connectors
• replication with MirrorMaker
• HTTP Bridge for Kafka
• Cruise Control cluster balancing

Kafka on AKS with Strimzi
• main benefits:
✅ k8s-native experience with built-in security
✅ tweakable / customizable in various ways
✅ ease of use for "non-ops-savvy folks" ➜ ME

Kafka on AKS with Strimzi
• main drawbacks:
⛔ Kafka is OUR OWN responsibility
⛔ k8s knowledge despite "operator magic"
⛔ no Microsoft support offering

don't just roll the dice...

dig deeper & navigate further!

Thanks! Q & A
http://bit.ly/kafka-ga21
