Slide 1

Slide 1 text

The Changing Face of ETL Event-Driven Architectures for Data Engineers @rmoff Photo by rmoff

Slide 2

Slide 2 text

Photo by Samuel Sianipar on Unsplash

Slide 3

Slide 3 text

Photo by Khai Sze Ong on Unsplash

Slide 4

Slide 4 text

Photo by Rainier Ridao on Unsplash

Slide 5

Slide 5 text

Photo by Rohit Tandon on Unsplash

Slide 6

Slide 6 text

Photo by Theodore Moore on Unsplash

Slide 7

Slide 7 text

Photo by Cristian Grecu on Unsplash

Slide 8

Slide 8 text

It used to be so simple Photo by Patrick Fore on Unsplash

Slide 9

Slide 9 text

More Sources Photo by Eugenio Mazzone on Unsplash

Slide 10

Slide 10 text

More Targets Photo by Tom Barrett on Unsplash

Slide 11

Slide 11 text

More Data Photo by Kirill on Unsplash

Slide 12

Slide 12 text

Batches and Buckets

Slide 13

Slide 13 text

Applications Respond → an order was placed!
Analytics Tell Us What Happened → how many orders were placed
Photo by Deva Darshan from Pexels

Slide 14

Slide 14 text


Slide 15

Slide 15 text

Photo by NASA on Unsplash

Slide 16

Slide 16 text

Events Photo by Mark Kamalov on Unsplash

Slide 17

Slide 17 text

An event is both:
* Notification
* State transfer

Slide 18

Slide 18 text

A Customer Experience

Slide 19

Slide 19 text

A Sensor Reading

Slide 20

Slide 20 text

Events Bread Tinned Spaghetti Basket

Slide 21

Slide 21 text

Events Bread Basket Bread ItemAdd

Slide 22

Slide 22 text

Events Bread Baked Beans Basket Bread Baked Beans ItemAdd ItemAdd

Slide 23

Slide 23 text

Events Bread Basket Bread Baked Beans Baked Beans ItemAdd ItemAdd ItemRemove

Slide 24

Slide 24 text

Events Bread Tinned Spaghetti Basket Bread Tinned Spaghetti Baked Beans Baked Beans ItemAdd ItemAdd ItemRemove ItemAdd

Slide 25

Slide 25 text

Events Bread Tinned Spaghetti Basket Bread Tinned Spaghetti Baked Beans Baked Beans ItemAdd ItemAdd ItemRemove ItemAdd

Slide 26

Slide 26 text

Events Bread Tinned Spaghetti Basket Bread Tinned Spaghetti Baked Beans Baked Beans ItemAdd ItemAdd ItemRemove ItemAdd

Slide 27

Slide 27 text

Events Bread Tinned Spaghetti Basket Bread Tinned Spaghetti Baked Beans Baked Beans ItemAdd ItemAdd ItemRemove ItemAdd
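
The basket slides suggest that the current basket can be rebuilt by replaying the events in order. A minimal sketch of that idea, assuming each event is a plain (event type, item) tuple; the tuple layout and the `replay` helper are illustrative choices, not something the slides specify:

```python
from collections import Counter

# Event log in the order shown on the slides, oldest first.
# Representing each event as an (event_type, item) tuple is an assumption.
EVENTS = [
    ("ItemAdd", "Bread"),
    ("ItemAdd", "Baked Beans"),
    ("ItemRemove", "Baked Beans"),
    ("ItemAdd", "Tinned Spaghetti"),
]


def replay(events):
    """Fold a sequence of basket events into the basket contents."""
    basket = Counter()
    for event_type, item in events:
        if event_type == "ItemAdd":
            basket[item] += 1
        elif event_type == "ItemRemove":
            # Ignore a remove for an item that is not in the basket.
            if basket[item] > 0:
                basket[item] -= 1
        else:
            raise ValueError(f"unknown event type: {event_type}")
    # Return only the items that are still in the basket, sorted by name.
    return sorted(item for item, count in basket.items() if count > 0)


current_basket = replay(EVENTS)      # state after all four events
earlier_basket = replay(EVENTS[:2])  # state after only the first two events
```

Because the events are kept rather than just the final state, the same log answers both "what is in the basket now" and "what was in it earlier", including the fact that Baked Beans were added and then removed.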

Slide 28

Slide 28 text

What is an Event Streaming Platform? The Log Connectors Connectors Producer Consumer Streaming Engine

Slide 29

Slide 29 text

Immutable Event Log Old New Messages are added at the end of the log

Slide 30

Slide 30 text

Consumers have a position all of their own Sally is here Old New Scan

Slide 31

Slide 31 text

Consumers have a position all of their own Sally is here Fred is here Old New Scan Scan

Slide 32

Slide 32 text

Consumers have a position all of their own Sally is here George is here Fred is here Old New Scan Scan Scan
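
The log slides make two points: messages are only ever added at the end, and each consumer keeps its own position. A toy in-memory model of both, purely as a sketch; the `Log` and `Consumer` classes are hypothetical and are not the Kafka client API:

```python
class Log:
    """Append-only list of messages; existing entries are never modified."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Add a message at the end of the log and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages=10):
        """Return up to max_messages starting at the given offset."""
        return self._messages[offset:offset + max_messages]


class Consumer:
    """Reads from a shared log while tracking its own position."""

    def __init__(self, name, log, offset=0):
        self.name = name
        self.log = log
        self.offset = offset

    def poll(self, max_messages=10):
        batch = self.log.read(self.offset, max_messages)
        self.offset += len(batch)  # only this consumer's position moves
        return batch


log = Log()
for n in range(5):
    log.append(f"event-{n}")

# Three consumers scan the same log independently of each other.
sally = Consumer("Sally", log)
fred = Consumer("Fred", log)
george = Consumer("George", log, offset=3)

sally_batch = sally.poll(max_messages=2)
fred_batch = fred.poll()
george_batch = george.poll()
```

Reading does not remove anything from the log, so a slow consumer such as Sally can catch up later without affecting Fred or George.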

Slide 33

Slide 33 text

The Connect API The Log Connectors Connectors Producer Consumer Streaming Engine

Slide 34

Slide 34 text

Streaming Integration with Kafka Connect Kafka Brokers Kafka Connect Tasks Workers Sources syslog flat file CSV JSON MQTT

Slide 35

Slide 35 text

Streaming Integration with Kafka Connect Kafka Brokers Kafka Connect Tasks Workers Sinks Amazon S3 MQTT

Slide 36

Slide 36 text

Streaming Integration with Kafka Connect Kafka Brokers Kafka Connect Tasks Workers Sources Sinks syslog flat file CSV JSON MQTT Amazon S3 MQTT

Slide 37

Slide 37 text

Stream Processing in Kafka The Log Connectors Connectors Producer Consumer Streaming Engine

Slide 38

Slide 38 text

Kafka Streams API
// stringSerde and ordersSerde are assumed to be defined elsewhere
final StreamsBuilder builder = new StreamsBuilder();
builder.stream("orders", Consumed.with(stringSerde, ordersSerde))
       // keep only the orders whose status is COMPLETE
       .filter((key, order) -> order.getStatus().equals("COMPLETE"))
       // write the filtered orders to a new topic
       .to("complete_orders", Produced.with(stringSerde, ordersSerde));

Slide 39

Slide 39 text

Stream Processing with KSQL
CREATE STREAM completedOrders AS
  SELECT * FROM orders
  WHERE status='COMPLETE';

Slide 40

Slide 40 text

This is Something New Photo by Ash from Modern Afflatus on Unsplash

Slide 41

Slide 41 text

Events in Action Review events reviews

Slide 42

Slide 42 text

Events in Action Review events Operational dashboard reviews

Slide 43

Slide 43 text

Events in Action Review events Operational dashboard reviews Data lake

Slide 44

Slide 44 text

Events in Action Review events Filter out bad data Operational dashboard Data lake reviews reviews_clean
CREATE STREAM reviews_clean AS
  SELECT * FROM reviews
  WHERE id IS NOT NULL;

Slide 45

Slide 45 text

Events in Action Existing apps User data RDBMS txn log Kafka Connect Kafka users

Slide 46

Slide 46 text

Events in Action Review events Operational dashboard Data lake User data users reviews reviews_clean Join events to users, and filter

Slide 47

Slide 47 text

Events in Action Review events Operational dashboard Data lake User data enriched_reviews reviews reviews_clean users Join events to users, and filter
CREATE STREAM reviews_clean AS
  SELECT * FROM reviews
  WHERE id IS NOT NULL;
CREATE STREAM enriched_reviews AS
  SELECT * FROM reviews_clean r
  INNER JOIN users u ON r.userid=u.userid;

Slide 48

Slide 48 text

Events in Action Review events Operational dashboard Data lake User data Join events to users, and filter Notification service

Slide 49

Slide 49 text

Events in Action Review events Notification service Operational dashboard Data lake User data unhappy_vips enriched_reviews reviews reviews_clean users Join events to users, and filter
CREATE STREAM unhappy_vips AS
  SELECT * FROM enriched_reviews
  WHERE rating < 3 AND status = 'Platinum';
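
The three KSQL statements in this walkthrough form a chain: filter out bad reviews, join the remainder to users, then keep low ratings from Platinum users. A rough Python model of that chain over in-memory records, as a sketch only; the sample rows are invented for illustration, and the dict-based `users` lookup stands in for the `users` data that arrives through Kafka Connect:

```python
def reviews_clean(reviews):
    """Mirror of: WHERE id IS NOT NULL."""
    return (r for r in reviews if r["id"] is not None)


def enriched_reviews(reviews, users):
    """Mirror of the INNER JOIN on userid: unmatched reviews are dropped."""
    for review in reviews:
        user = users.get(review["userid"])
        if user is not None:
            yield {**user, **review}


def unhappy_vips(enriched):
    """Mirror of: WHERE rating < 3 AND status = 'Platinum'."""
    return (r for r in enriched if r["rating"] < 3 and r["status"] == "Platinum")


# Hypothetical sample data; only the column names come from the slides.
reviews = [
    {"id": 1, "userid": "u1", "rating": 5},
    {"id": None, "userid": "u2", "rating": 1},  # bad record: no id
    {"id": 3, "userid": "u9", "rating": 2},     # no matching user
    {"id": 4, "userid": "u2", "rating": 2},
    {"id": 5, "userid": "u1", "rating": 1},     # unhappy, but not Platinum
]
users = {
    "u1": {"userid": "u1", "status": "Gold"},
    "u2": {"userid": "u2", "status": "Platinum"},
}

alerts = list(unhappy_vips(enriched_reviews(reviews_clean(reviews), users)))
```

Each stage is a generator, so records flow through one at a time rather than in a batch, which loosely matches how each derived stream is populated continuously from the one before it.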

Slide 50

Slide 50 text

The Power of an Event-Driven Architecture Photo by rmoff

Slide 51

Slide 51 text

Not Everything is a Nail Events RDBMS

Slide 52

Slide 52 text

Not Everything is a Nail Events Elasticsearch RDBMS

Slide 53

Slide 53 text

Not Everything is a Nail Events Elasticsearch RDBMS Graph

Slide 54

Slide 54 text

Side-by-Side Tech Evaluation Events HDFS

Slide 55

Slide 55 text

Side-by-Side Tech Evaluation Events BigQuery HDFS

Slide 56

Slide 56 text

Side-by-Side Tech Evaluation Events BigQuery HDFS Snowflake

Slide 57

Slide 57 text

Evolve Data Sources Producer Consuming App A On-premises Consuming App B

Slide 58

Slide 58 text

Evolve Data Sources Producer On-premises Producer Cloud Consuming App A Consuming App B

Slide 59

Slide 59 text

Evolve Data Sources Producer Cloud Consuming App A Consuming App B

Slide 60

Slide 60 text

Tight Coupling != Flexible Orders RDBMS

Slide 61

Slide 61 text

Tight Coupling != Flexible Orders HDFS RDBMS

Slide 62

Slide 62 text

Tight Coupling != Flexible Orders App HDFS RDBMS

Slide 63

Slide 63 text

Loose Coupling == Freedom to Evolve Orders RDBMS

Slide 64

Slide 64 text

Loose Coupling == Freedom to Evolve Orders HDFS RDBMS

Slide 65

Slide 65 text

Loose Coupling == Freedom to Evolve Orders App HDFS RDBMS

Slide 66

Slide 66 text

Transform Once, Use Many: Data Cleansing IoT App RDBMS App temp_raw

Slide 67

Slide 67 text

Transform Once, Use Many: Data Cleansing IoT App RDBMS App
temp_raw
sensor_id  time_epoch  reading
42         1551136074  13.05
42         1551136125  13.11
(null)     1551136125  13.11
42         1551138129  13.04

Slide 68

Slide 68 text

Transform Once, Use Many: Data Cleansing IoT App RDBMS App Cleanse Cleanse Cleanse
temp_raw
sensor_id  time_epoch  reading
42         1551136074  13.05
42         1551136125  13.11
(null)     1551136125  13.11
42         1551138129  13.04

Slide 69

Slide 69 text

Transform Once, Use Many: Data Cleansing IoT App RDBMS App
temp_raw
sensor_id  time_epoch  reading
42         1551136074  13.05
42         1551136125  13.11
(null)     1551136125  13.11
42         1551138129  13.04
SENSOR_ID IS NOT NULL
temp_clean
sensor_id  time_epoch  reading
42         1551136074  13.05
42         1551136125  13.11
42         1551138129  13.04
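
The point of this sequence is that the `SENSOR_ID IS NOT NULL` cleansing runs once, and every downstream consumer reads the cleansed result instead of repeating the filter. A small sketch using the rows from the slide; the two consumer functions are hypothetical stand-ins for the App and RDBMS targets:

```python
# Rows from the temp_raw slide; the record with no sensor_id is the bad one.
TEMP_RAW = [
    {"sensor_id": 42, "time_epoch": 1551136074, "reading": 13.05},
    {"sensor_id": 42, "time_epoch": 1551136125, "reading": 13.11},
    {"sensor_id": None, "time_epoch": 1551136125, "reading": 13.11},
    {"sensor_id": 42, "time_epoch": 1551138129, "reading": 13.04},
]


def cleanse(records):
    """Keep only records that have a sensor_id (SENSOR_ID IS NOT NULL)."""
    return [r for r in records if r["sensor_id"] is not None]


# Cleanse once...
temp_clean = cleanse(TEMP_RAW)


# ...then every consumer reads the same cleansed data.
def average_reading(records):
    """Hypothetical consumer: an app computing the mean reading."""
    return sum(r["reading"] for r in records) / len(records)


def latest_timestamp(records):
    """Hypothetical consumer: a loader tracking the newest record seen."""
    return max(r["time_epoch"] for r in records)


avg = average_reading(temp_clean)
latest = latest_timestamp(temp_clean)
```

If the cleansing rule changes, only `cleanse` needs updating; neither consumer carries its own copy of the filter.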

Slide 70

Slide 70 text

Say NO to brittle pipelines Photo by rmoff

Slide 71

Slide 71 text

Photo by Benjamin Lambert on Unsplash

Slide 72

Slide 72 text

Latency requirements Users of the data Scale Data fidelity Photo by Benjamin Lambert on Unsplash

Slide 73

Slide 73 text

App App App App search Hadoop DWH monitoring security MQ MQ cache cache

Slide 74

Slide 74 text

KAFKA DWH Hadoop App App App App App App App App request-response messaging OR stream processing streaming data pipelines changelogs

Slide 75

Slide 75 text

Events model the real world Photo by rmoff

Slide 76

Slide 76 text

Event streaming platform Flexibility & scalability Data when you need it Data persistence Native stream processing Photo by rmoff

Slide 77

Slide 77 text

http://cnfl.io/book-bundle

Slide 78

Slide 78 text

@rmoff confluent.io/download http://cnfl.io/slack http://cnfl.io/book-bundle Photo by rmoff

Slide 79

Slide 79 text

Resources
• CDC Spreadsheet
• Blog: No More Silos: How to Integrate your Databases with Apache Kafka and CDC
• #partner-engineering on Slack for questions
• BD team (#partners / [email protected]) can help with introductions on a given sales op