Streaming ETL on the Shoulders of Giants @ VoxxedDays Ticino 2019

Streaming ETL on the Shoulders of G I A N T S

Hans-Peter Grahsl
• working & living in Graz
• technical trainer at
• independent consultant & engineer
• associate lecturer
• occasional conference speaker
@hpgrahsl | #VDT19 #VoxxedDays Ticino, 05th October 2019, Lugano - Switzerland

Speed & Agility

For businesses to stay relevant they must deliver value at a breakneck pace and be constantly seeking new sources of value ...

Diminishing Value of Data

Historic ETL causes Pain
• batch-driven
• brittle / error prone
• slow & late answers

Antipattern for Speed & Agility

Streaming ETL alleviates Pain
• event-centric
• stream-oriented
• fast & timely answers
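
To make the event-centric idea concrete, here is a minimal sketch of an extract/transform/load chain that handles one event at a time instead of waiting for a batch. Nothing in it comes from the talk: the event fields (`customer`, `amount`) and the function names are hypothetical, and a plain Python list stands in for the event stream.

```python
from typing import Dict, Iterable, Iterator


def extract(events: Iterable[Dict]) -> Iterator[Dict]:
    """Hand over events one by one, as they arrive (hypothetical source)."""
    for event in events:
        yield event


def transform(events: Iterable[Dict]) -> Iterator[Dict]:
    """Clean each event individually; malformed events are skipped."""
    for event in events:
        if "customer" not in event or "amount" not in event:
            continue
        yield {
            "customer": event["customer"].strip().lower(),
            "amount_cents": round(float(event["amount"]) * 100),
        }


def load(events: Iterable[Dict], totals: Dict[str, int]) -> Iterator[Dict[str, int]]:
    """Update running totals and emit a fresh snapshot after every event."""
    for event in events:
        key = event["customer"]
        totals[key] = totals.get(key, 0) + event["amount_cents"]
        yield dict(totals)


if __name__ == "__main__":
    raw = [
        {"customer": " Alice ", "amount": "12.50"},
        {"oops": True},
        {"customer": "alice", "amount": "0.50"},
    ]
    for snapshot in load(transform(extract(raw)), {}):
        print(snapshot)
```

Because every stage is a generator, an updated answer is available right after each event, which is the contrast with a batch job that only answers once the whole run has finished.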

Enabler for Speed & Agility

Modern Data Architecture?

On the Shoulders of G I A N T S

Operational Data Store

MongoDB
• rich document model
• powerful queries & indexing
• ACID transactions
• transparent sharding & replication

Streaming Platform

Apache Kafka
• pub / sub to event streams
• (permanently) store event streams
• event streaming in near real-time
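
The first two bullets (pub / sub plus stored event streams) can be illustrated with a toy append-only log. This is a sketch only, not the Kafka client API: the `EventLog` class, its methods, and the group names are hypothetical, and a real broker adds partitions, replication, and durable storage.

```python
from collections import defaultdict
from typing import Any, Dict, List


class EventLog:
    """Toy in-memory stand-in for a single topic partition (not Kafka itself)."""

    def __init__(self) -> None:
        self._records: List[Any] = []                      # append-only storage
        self._offsets: Dict[str, int] = defaultdict(int)   # next offset per group

    def publish(self, record: Any) -> int:
        """Append a record and return the offset it was stored at."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group: str, max_records: int = 10) -> List[Any]:
        """Return unread records for this group and advance its offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


if __name__ == "__main__":
    log = EventLog()
    log.publish("order-created")
    log.publish("order-paid")
    print(log.poll("billing"))    # billing reads both events
    log.publish("order-shipped")
    print(log.poll("billing"))    # only the new event
    print(log.poll("analytics"))  # a late subscriber still sees all three
```

Records are never removed when they are read, so each consumer group keeps its own position and a subscriber that joins later can still replay the full stream.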

"... data processing that is designed with infinite data sets in mind." - Tyler Akidau

EVENTS EVENTS EVERYWHERE!

Kafka APIs for "everything"
• simple pub / sub scenario → Producer & Consumer API
• streaming data integration → Connect API
• powerful stream processing → KStreams API + KSQL

Kafka Connect

Kafka Connect
• often about data stores

Kafka Connect
• concrete examples

Source Connectors

Sink Connectors

MongoDB Connector
• officially supported by MongoDB
• developed open-source on GitHub
• verified Gold by Confluent
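
As a rough illustration of how such a connector is typically wired up, the sketch below builds the JSON body that would be POSTed to a Kafka Connect worker's `/connectors` REST endpoint to register a MongoDB sink. The connector class and the `connection.uri`, `database`, `collection`, and `topics` properties follow the official connector's documented settings, but should be checked against the connector version in use. The connector name, topic, URI, database, and collection values are hypothetical placeholders and are not taken from the talk.

```python
import json
from typing import Dict


def mongo_sink_request(name: str, topics: str, uri: str,
                       database: str, collection: str) -> Dict:
    """Build the request body for registering a MongoDB sink connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
            "tasks.max": "1",
            "topics": topics,            # Kafka topic(s) to read from
            "connection.uri": uri,       # where MongoDB is reachable
            "database": database,
            "collection": collection,
        },
    }


# All values below are made-up placeholders for illustration.
payload = mongo_sink_request(
    name="mongo-sink-demo",
    topics="customer-events",
    uri="mongodb://localhost:27017",
    database="demo",
    collection="customers",
)

if __name__ == "__main__":
    print(json.dumps(payload, indent=2))
```

The sketch only constructs the payload; sending it to a running Connect worker is left out on purpose so the example stays runnable without any infrastructure.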

Exemplary Use Cases

Single Customer View

Synchronization across Services

Real-Time Recommendations

Demo Scenario
