Slide 1

Slide 1 text

Timing is Everything: Understanding Event-Time Processing in Flink SQL
Sharon Xie, Head of Product, Founding Engineer @ Decodable

Slide 2

Slide 2 text

Agenda
- Flink's event time processing model
- Use cases in Flink SQL

Slide 3

Slide 3 text

What is Apache Flink
● Highly scalable stream processing engine
● Exactly-once processing semantics
● Layered APIs: SQL (easy to use) ↔ Java/Python DataStream (expressive)
● Supports event-driven applications, streaming ETL pipelines, streaming analytics

Slide 4

Slide 4 text

Context
● Flink streaming mode
● Flink SQL

Slide 5

Slide 5 text

Event
An immutable record containing the details of something that happened at some point in time.

Slide 6

Slide 6 text

Time in Flink
Event Time
● The time at which the event happened
Processing Time
● The time at which the event is observed by Flink

Slide 7

Slide 7 text

When is event time used?
● Decisions or insights based on when the event happens
  ○ Monitoring and alerting
  ○ Time-based compute or analytics

Slide 8

Slide 8 text

Event time vs Processing Time
● Event time < processing time
● The lag between the two is arbitrary
● Events can arrive out of order

Slide 9

Slide 9 text

Challenges
How do you know when all of the events are received for a particular window?

Slide 10

Slide 10 text

Watermark
● Measures the progress of event time
● Tracks the maximum event time seen
● Indicates completeness: a watermark of t asserts that no more events with event time ≤ t are expected

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

Watermark

Slide 13

Slide 13 text

Define Watermark
CREATE TABLE sensors (
  id BIGINT,
  `value` INTEGER,
  _time TIMESTAMP(3),
  WATERMARK FOR _time AS _time - INTERVAL '3' MINUTES
) WITH (
  'scan.watermark.emit.strategy'='on-event',
  ...
);

Slide 14

Slide 14 text

Watermark Generation (on-event)
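As a rough illustration of on-event watermark generation, the sketch below recomputes the watermark after every event as the maximum event time seen so far minus a fixed delay. It is a plain Python simulation, not Flink code: the function name `on_event_watermarks`, the integer timestamps (read them as minutes), and the sample events are all made up for this example. The 3-minute delay mirrors the `interval '3' minutes` in the table definition.

```python
def on_event_watermarks(event_times, delay):
    """Return the watermark emitted after each event (on-event strategy).

    Hypothetical helper: the watermark is max(event time seen) - delay,
    so it never moves backwards even when events arrive out of order.
    """
    max_seen = None
    watermarks = []
    for t in event_times:
        max_seen = t if max_seen is None else max(max_seen, t)
        watermarks.append(max_seen - delay)
    return watermarks


# Out-of-order sample events (made up), delay of 3 time units.
events = [100, 102, 101, 104, 103, 108]
print(on_event_watermarks(events, delay=3))  # [97, 99, 99, 101, 101, 105]
```

Note that the out-of-order events 101 and 103 leave the watermark unchanged: only a new maximum event time advances it.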

Slide 15

Slide 15 text

There is a window that ends at 105. When can the window close? Quiz

Slide 16

Slide 16 text

There is a window that ends at 105. When can the window close? Quiz - Answer

Slide 17

Slide 17 text

Multiple sources/partitions

Slide 18

Slide 18 text

Idle source/partition
● If a partition is idle (no events), the watermark will not advance
● No result will be produced
● Solutions
  ○ Configure source idle timeout
    ■ SET 'table.exec.source.idle-timeout' = '1m';
  ○ Balance the partitions
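The effect of an idle partition can be sketched as follows: the combined watermark is the minimum over all partitions, so one silent partition holds everything back unless it is excluded after an idle timeout. This is a simplified Python model, not Flink's implementation. The function `combined_watermark`, the partition names, and the wall-clock values are assumptions made for illustration.

```python
def combined_watermark(partitions, now, idle_timeout):
    """Combine per-partition watermarks by taking the minimum.

    partitions maps a name to (watermark, wall-clock time of last event).
    Partitions silent for idle_timeout or longer are ignored.
    Returns None when every partition is idle (nothing can advance).
    """
    active = [
        watermark
        for watermark, last_seen in partitions.values()
        if now - last_seen < idle_timeout
    ]
    return min(active) if active else None


# Hypothetical state: p2 has been silent for 100 seconds.
state = {"p0": (105, 1000), "p1": (98, 990), "p2": (90, 900)}
print(combined_watermark(state, now=1000, idle_timeout=60))    # 98
print(combined_watermark(state, now=1000, idle_timeout=1000))  # 90
```

With a 60-second timeout the stalled partition p2 is skipped and the watermark moves up to 98. Without a timeout short enough to exclude it, the watermark stays stuck at 90.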

Slide 19

Slide 19 text

Implications
● Tradeoff between Correctness and Latency
● Latency
  ○ Results of a window are only seen after the window closes
● Correctness
  ○ Late-arriving events are discarded after the window is closed
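Both implications can be seen in a small simulation of a windowed count: a result is emitted only when the watermark passes the window end, and an event that arrives after that point is dropped. The sketch assumes fixed 5-unit windows, a watermark delay of 3, and made-up events; `windowed_counts` is a hypothetical helper, not a Flink API.

```python
def windowed_counts(event_times, size, delay):
    """Count events per fixed window, emitting a window once the
    watermark (max event time - delay) reaches its end.

    Returns (results, dropped): emitted (window_start, count) pairs in
    emit order, and the late events that were discarded.
    """
    open_counts = {}
    results, dropped = [], []
    watermark = float("-inf")
    for t in event_times:
        start = (t // size) * size
        if start + size <= watermark:
            dropped.append(t)  # the window already closed: late event
        else:
            open_counts[start] = open_counts.get(start, 0) + 1
        watermark = max(watermark, t - delay)
        for s in sorted(open_counts):
            if s + size <= watermark:
                results.append((s, open_counts.pop(s)))
    return results, dropped


results, dropped = windowed_counts([100, 101, 106, 103, 108, 102, 111], size=5, delay=3)
print(results)  # [(100, 3)]
print(dropped)  # [102]
```

A larger delay would have kept the window open long enough to count event 102, at the price of emitting the result later.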

Slide 20

Slide 20 text

Correctness vs Latency
In general:
Alerting and monitoring: prioritize latency
Timely analytics: prioritize correctness

Slide 21

Slide 21 text

But…can I have both?
● Yes! Flink can process & emit “updates” (changelog)
● No watermark is needed
● Downstream system must support “updates”
● It's costly - need to store global state
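The update-based alternative can be sketched as a running count that emits a changelog entry for every event instead of waiting for a window to close. The row kinds `+I` (insert), `-U` (retract old value), and `+U` (new value) follow Flink's changelog notation. The function `changelog_counts` and the sample events are assumptions for illustration.

```python
def changelog_counts(event_times, size):
    """Emit a changelog of per-bucket counts, one update per event.

    No watermark is involved, so late events simply produce another
    update. The counts dict is never cleaned up, which is the state
    cost mentioned above.
    """
    counts = {}
    log = []
    for t in event_times:
        start = (t // size) * size
        old = counts.get(start)
        new = (old or 0) + 1
        counts[start] = new
        if old is None:
            log.append(("+I", start, new))
        else:
            log.append(("-U", start, old))
            log.append(("+U", start, new))
    return log


for entry in changelog_counts([100, 106, 102], size=5):
    print(entry)
```

The out-of-order event 102 is not dropped: it retracts the earlier count for bucket 100 and emits the corrected one, which is why the downstream system has to understand updates.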

Slide 22

Slide 22 text

Trade-offs

Slide 23

Slide 23 text

Quick Summary
● Event time is the time when the event happens
● Flink uses watermarks to account for out-of-order events
● The watermark strategy allows a trade-off between accuracy and latency

Slide 24

Slide 24 text

Event-time Usage In Flink SQL
● Windowed Aggregations
● Window join
● Temporal join

Slide 25

Slide 25 text

Windowing
Put unbounded events into finite-sized temporal buckets, over which computation is applied.

Slide 26

Slide 26 text

Window Types
● Tumble / Fixed
● Hop / Sliding
● Cumulative
● Session

Slide 27

Slide 27 text

Window Types - Tumble/Fixed
● Fixed window size
● No overlapping
● Each event belongs to exactly 1 window
Ref: https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/table/sql/queries/window-tvf/

Slide 28

Slide 28 text

Flink SQL Window TVF ● TVF  TableValued Function ● Returns a new relation with all columns of original stream and additional 3 columns: ○ window_start, window_end, window_time

Slide 29

Slide 29 text

Tumble Window TVF

Slide 30

Slide 30 text

1st window_start value
Easy calculation: the nearest multiple of the window size at or before the 1st event time
Formula: floor((1st event time - reference time* - window_offset) / window size) * window size + window_offset
* reference time is 1970-01-01T00:00:00.000
https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/table/sql/queries/window-tvf/#window-offset
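The window start calculation can be sketched in a few lines. The version below measures time as an integer distance from the reference time (so the reference is 0) and assumes the offset is subtracted before flooring and added back afterwards, which keeps the result at or before the event time. `window_start` is a hypothetical helper name.

```python
def window_start(event_time, size, offset=0):
    """Start of the fixed window containing event_time.

    event_time, size and offset share one unit (for example minutes
    since the reference time 1970-01-01T00:00:00.000).
    """
    return (event_time - offset) // size * size + offset


print(window_start(17, 5))            # 15
print(window_start(17, 5, offset=3))  # 13
```

With an offset of 3 the window boundaries shift to 3, 8, 13, 18 and so on, so event time 17 falls into the window starting at 13.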

Slide 31

Slide 31 text

Quiz
What's the fixed window start value for the event time 101 with window size of 5min?

Slide 32

Slide 32 text

Quiz - Answer
What's the fixed window start value for the event time 101 with window size of 5min?
● 100

Slide 33

Slide 33 text

Example - Tumble Window

Slide 34

Slide 34 text

Window Types - Hop/Slide
● Fixed window size
● Overlaps when slide < window size
Ref: https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/table/sql/queries/window-tvf/
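Because hop windows overlap, one event can belong to several of them. The sketch below lists every window that contains a given event time, assuming integer timestamps and windows aligned to multiples of the slide; `hop_windows` is a hypothetical helper, not a Flink function.

```python
def hop_windows(event_time, size, slide):
    """Return all (start, end) hop windows that contain event_time.

    Window starts are multiples of slide; each window covers
    [start, start + size).
    """
    start = (event_time // slide) * slide
    windows = []
    while start > event_time - size:
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


print(hop_windows(101, size=10, slide=5))  # [(95, 105), (100, 110)]
print(hop_windows(101, size=5, slide=5))   # [(100, 105)]
```

When the slide equals the window size, the result collapses to a single window, which is the tumbling case.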

Slide 35

Slide 35 text

Window Types - Cumulative
● Similar to tumble window, but with early firing at the defined interval
● Defined by max window size and window step
Ref: https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/table/sql/queries/window-tvf/
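A cumulative window can be pictured as a series of windows that share one start and grow by one step until they reach the max size. The sketch below lists the windows that include a given event time, assuming integer timestamps and a max size that is a multiple of the step; `cumulate_windows` is a hypothetical helper.

```python
def cumulate_windows(event_time, max_size, step):
    """Return the (start, end) cumulative windows containing event_time.

    All windows start at the enclosing max-size boundary and end at
    start + step, start + 2 * step, ..., start + max_size.
    """
    start = (event_time // max_size) * max_size
    ends = range(start + step, start + max_size + 1, step)
    return [(start, end) for end in ends if end > event_time]


print(cumulate_windows(103, max_size=10, step=2))
# [(100, 104), (100, 106), (100, 108), (100, 110)]
```

The event at 103 misses the first early firing at 102 but contributes to every later one, including the final full-size window.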

Slide 36

Slide 36 text

Window Types - Session
A new window is created when the gap between consecutive event times exceeds the session gap
Ref: https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/table/sql/queries/window-tvf/#session
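The session rule can be sketched by walking through the events in time order and starting a new session whenever the distance to the previous event exceeds the gap. This is a simplified single-key model with integer timestamps. `session_windows` is a hypothetical helper, and reporting the window end as the last event time plus the gap is an assumption of this sketch.

```python
def session_windows(event_times, gap):
    """Group event times into sessions separated by more than gap.

    Returns a list of (start, end, events), where start is the first
    event time and end is the last event time plus the gap.
    """
    sessions = []
    for t in sorted(event_times):
        if sessions and t - sessions[-1][-1] <= gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return [(s[0], s[-1] + gap, s) for s in sessions]


for window in session_windows([100, 101, 103, 110, 111], gap=5):
    print(window)
```

Here the jump from 103 to 110 is larger than the gap of 5, so the events split into two sessions. Unlike the other window types, the window size depends on the data.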

Slide 37

Slide 37 text

Window Join
● A window join adds the dimension of time into the join criteria themselves.
● Use case: compute click-through events
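The idea of joining on the window as well as on a key can be sketched as follows. For brevity the sketch uses fixed (tumbling) windows rather than hop windows, and the impression and click records are invented. `window_join` and `tumble_start` are hypothetical helpers, not Flink functions.

```python
def tumble_start(t, size):
    """Start of the fixed window containing time t."""
    return (t // size) * size


def window_join(left, right, size):
    """Join (event_time, key, value) records that share both the key
    and the window. Returns (window_start, window_end, key, left, right).
    """
    index = {}
    for t, key, value in right:
        index.setdefault((tumble_start(t, size), key), []).append(value)
    joined = []
    for t, key, value in left:
        start = tumble_start(t, size)
        for other in index.get((start, key), []):
            joined.append((start, start + size, key, value, other))
    return joined


impressions = [(101, "ad1", "imp-a"), (107, "ad2", "imp-b")]
clicks = [(103, "ad1", "click-x"), (111, "ad2", "click-y")]
print(window_join(impressions, clicks, size=5))
# [(100, 105, 'ad1', 'imp-a', 'click-x')]
```

The click on ad2 has a matching key but lands in a later window, so it is not counted as a click-through under this definition.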

Slide 38

Slide 38 text

Example - hop window join

Slide 39

Slide 39 text

Example - hop window join

Slide 40

Slide 40 text

Example - hop window join

Slide 41

Slide 41 text

Temporal Join
● Enrich a stream with the value of the joined record as of the event time.
● Example: Continuously computing the price for each order based on the exchange rate in effect when the order was placed
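The lookup behind a temporal join can be sketched as a search in the version history of the joined record: each order picks the latest exchange rate whose validity began at or before the order time. The record layouts, the integer rates, and the function `temporal_join` are assumptions made for this example.

```python
from bisect import bisect_right


def temporal_join(orders, rates):
    """Enrich (order_time, currency, price) orders with the rate that was
    in effect at order_time, taken from (valid_from, currency, rate).

    Returns (order_time, currency, price, rate, converted); rate and
    converted are None when no rate existed yet at order_time.
    """
    history = {}
    for valid_from, currency, rate in sorted(rates):
        times, values = history.setdefault(currency, ([], []))
        times.append(valid_from)
        values.append(rate)
    enriched = []
    for order_time, currency, price in orders:
        times, values = history.get(currency, ([], []))
        pos = bisect_right(times, order_time)
        rate = values[pos - 1] if pos else None
        converted = price * rate if rate is not None else None
        enriched.append((order_time, currency, price, rate, converted))
    return enriched


rates = [(100, "EUR", 2), (105, "EUR", 3)]
orders = [(103, "EUR", 10), (106, "EUR", 10), (99, "EUR", 10)]
for row in temporal_join(orders, rates):
    print(row)
```

The order at 103 is converted with the rate valid from 100 even though a newer rate exists by the time later orders arrive, which is the point of joining as of the event time.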

Slide 42

Slide 42 text

Example - temporal join
Ref: https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/table/sql/queries/joins/#temporal-joins

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

No content

Slide 45

Slide 45 text

Summary
● Event time is essential for timely response and analytics
● Watermark and windowing are the key concepts
● Flink SQL simplifies event time processing

Slide 46

Slide 46 text

Thank you Q&A @sharon_rxie

Slide 47

Slide 47 text

Decodable Talks at Current 2024
Timing is Everything: Understanding Event-Time Processing in Flink SQL 🗣 Sharon Xie 📆 Tuesday 4pm 🗺 Ballroom F
Data Contracts In Practice With Debezium and Apache Flink 🗣 Gunnar Morling 📆 Tuesday 3pm 🗺 Meeting Room 18C
So You Want to Write a User-Defined Function (UDF) for Flink? 🗣 Hans-Peter Grahsl 📆 Wednesday 1:30pm 🗺 Ballroom F
The Joy of JARs (and Other Flink SQL Troubleshooting Tales) 🗣 Robin Moffatt 📆 Wednesday 3pm 🗺 Ballroom F