Apache Kafka, in conjunction with Apache Spark, has become the de facto standard for processing and analyzing data. Both frameworks are open, flexible, and scalable.
Unfortunately, this flexibility and scale make operations a challenge for many teams. Ideally, teams can use serverless SaaS offerings to focus on business logic. However, hybrid and multi-cloud scenarios require a cloud-native platform that provides automated and elastic tooling to reduce the operations burden.
This session explores different architectures for building serverless, multi-cloud Kafka and Spark deployments across regions and continents.
We start from the analytics perspective of a data lake and explore its relation to a fully integrated data streaming layer with Kafka to build a modern Data Lakehouse. Real-world use cases show the joint value and the benefits of the Delta Lake integration.
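As a minimal sketch of what such an integration can look like, the following Spark Structured Streaming job (Scala) reads events from a Kafka topic and continuously appends them to a Delta Lake table. The broker address, topic name, and storage paths are illustrative placeholders, not part of any specific deployment.

    import org.apache.spark.sql.SparkSession

    object KafkaToDelta {
      def main(args: Array[String]): Unit = {
        // Spark session with Delta Lake enabled (requires the delta-spark package on the classpath)
        val spark = SparkSession.builder()
          .appName("kafka-to-delta")
          .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
          .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
          .getOrCreate()

        // Read a stream of events from Kafka; broker and topic are placeholders
        val events = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker-1:9092")
          .option("subscribe", "orders")
          .option("startingOffsets", "latest")
          .load()
          .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value", "timestamp")

        // Append the stream to a Delta table; the checkpoint gives the sink exactly-once semantics
        val query = events.writeStream
          .format("delta")
          .option("checkpointLocation", "/data/checkpoints/orders")
          .outputMode("append")
          .start("/data/delta/orders")

        query.awaitTermination()
      }
    }

The same pattern works in reverse: a Delta table can serve as a streaming source that feeds processed results back into Kafka, which is what makes the streaming layer and the lakehouse complementary rather than competing.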