Timing is Everything
Understanding Event-Time Processing in Flink SQL
Sharon Xie, Head of Product
Founding Engineer @ Decodable
Slide 2
Agenda
- Flink's event time processing model
- Use cases in Flink SQL
Slide 3
What is Apache Flink?
● Highly scalable stream processing engine
● Exactly-once processing semantics
● Layered APIs: SQL (easy to use) ↔ Java/Python DataStream (expressive)
● Supports event-driven applications, streaming ETL pipelines, and streaming analytics
Slide 4
Context
● Flink streaming mode
● Flink SQL
Slide 5
Event
An immutable record containing the details of something that happened at some point in time.
Slide 6
Time in Flink
Event Time
● The time at which the event happened
Processing Time
● The time at which the event is observed by Flink
Slide 7
When is event time used?
● Decisions or insights based on when the event happened
○ Monitoring and alerting
○ Time-based compute or analytics
Slide 8
Event time vs Processing Time
● Event time is always earlier than processing time
● The lag between the two is arbitrary
● Events can arrive out of order
Slide 9
Challenges
How do you know when all of the events have been received for a particular window?
Slide 10
Watermark
● Measures the progress of event time
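● Tracks the maximum event time seen so far, typically minus a configured delay
● Indicates the completeness of the event time: a watermark of t asserts that no more events with event time ≤ t are expected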
Slide 12
Watermark
Slide 13
Define Watermark
CREATE TABLE sensors (
  id BIGINT,
  `value` INTEGER,
  _time TIMESTAMP(3),
  -- the watermark trails the event time by 3 minutes to tolerate out-of-order events
  WATERMARK FOR _time AS _time - INTERVAL '3' MINUTE
) WITH (
  'scan.watermark.emit.strategy' = 'on-event',
  ...
);
Slide 14
Watermark Generation (on-event)
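One way to watch the watermark advance event by event is to select it next to the event time. The query below is a minimal sketch against the sensors table defined on the previous slide. It assumes a Flink version that ships the built-in CURRENT_WATERMARK function; the alias current_wm is illustrative only.

```sql
-- Show each event next to the watermark in effect when it is processed.
-- CURRENT_WATERMARK returns NULL until the first watermark has been emitted.
SELECT
  id,
  _time,
  CURRENT_WATERMARK(_time) AS current_wm
FROM sensors;
```

With the 3-minute delay from the table definition, current_wm should trail the largest _time seen so far by roughly 3 minutes.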
Slide 15
Quiz
There is a window that ends at 105. When can the window close?
Slide 16
Quiz - Answer
There is a window that ends at 105. When can the window close?
● When the watermark reaches the end of the window (105)
Slide 17
Multiple sources/partitions
Slide 18
Idle source/partition
● If a partition is idle (no events), the watermark will not advance
● No results will be produced
● Solutions
○ Configure a source idle timeout
■ SET 'table.exec.source.idle-timeout' = '1m';
○ Balance the partitions
Slide 19
Implications
● Trade-off between correctness and latency
● Latency
○ The results of a window are only visible after the window closes
● Correctness
○ Late-arriving events are discarded once the window has closed
Slide 20
Correctness vs Latency
In general:
Alerting and monitoring: favor latency
Timely analytics: favor correctness
Slide 21
But…can I have both?
● Yes! Flink can process & emit “updates” (changelog)
● No watermark is needed
● Downstream system must support “updates”
● It's costly - need to store global state
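A non-windowed aggregation is the simplest way to see this update-emitting behavior. The query below is a sketch over the sensors table from the earlier slide; the aliases reading_count and max_value are illustrative and not part of the original talk.

```sql
-- Non-windowed aggregation: no window has to close before results appear.
-- Flink emits an updated row for an id each time a new reading
-- for that id arrives (an updating changelog).
SELECT
  id,
  COUNT(*)     AS reading_count,
  MAX(`value`) AS max_value
FROM sensors
GROUP BY id;
```

The sink has to accept updates (for example an upsert-capable connector), and Flink keeps state for every id it has seen, which is where the cost comes from.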
Slide 22
Trade-offs
Slide 23
Quick Summary
● Event time is the time when the event happened
● Flink uses watermarks to account for out-of-order events
● The watermark strategy allows a trade-off between accuracy and latency
Window Types - Tumble/Fixed
Ref: https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/table/sql/queries/window-tvf/
● Fixed window size
● Windows do not overlap
● Each event belongs to exactly 1 window
Slide 28
Flink SQL Window TVF
● TVF: Table-Valued Function
● Returns a new relation with all columns of the original stream plus 3 additional columns:
○ window_start, window_end, window_time
Slide 29
Tumble Window TVF
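A tumble window aggregation might look like the following sketch. It reuses the sensors table from the watermark slide and assumes a 5-minute window; the aliases are illustrative.

```sql
-- Count readings and take the maximum value per sensor
-- in fixed, non-overlapping 5-minute windows.
SELECT
  window_start,
  window_end,
  id,
  COUNT(*)     AS reading_count,
  MAX(`value`) AS max_value
FROM TABLE(
  TUMBLE(TABLE sensors, DESCRIPTOR(_time), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end, id;
```

Each result row is emitted once, after the watermark passes the end of its window.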
Slide 30
1st window_start value
Easy calculation: the nearest multiple of the window size at or before the 1st event time
Formula:
floor((1st event time - reference time* - window offset) / window size)
  * window size + window offset
(the result is measured from the reference time)
* reference time is 1970-01-01T00:00:00.000
https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/table/sql/queries/window-tvf/#window-offset
Slide 31
Quiz
What's the fixed window start value for the event time 101 with a window size of 5 min?
Slide 32
Quiz - Answer
What's the fixed window start value for the event time 101 with a window size of 5 min?
● 100
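The answer follows from the window start formula on the earlier slide. A worked version, assuming the event time 101 is expressed in minutes since the reference time and the window offset is 0:

```latex
\left\lfloor \frac{101}{5} \right\rfloor \times 5 = 20 \times 5 = 100
```

Under those assumptions the event falls into the window [100, 105).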
Window Types - Cumulative
Ref: https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/table/sql/queries/window-tvf/
● Similar to a tumble window, but with early firing at the defined interval
● Defined by max window size and window step
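As a sketch, a cumulative window over the sensors table from the watermark slide could be written as below. The 1-minute step and 5-minute maximum size are assumed values chosen for illustration; the maximum size has to be a multiple of the step.

```sql
-- Cumulative windows share the same start and grow by one step at a time:
-- [00:00, 00:01), [00:00, 00:02), ... up to [00:00, 00:05),
-- then the next cycle begins at 00:05.
SELECT
  window_start,
  window_end,
  id,
  COUNT(*) AS reading_count
FROM TABLE(
  CUMULATE(TABLE sensors, DESCRIPTOR(_time),
           INTERVAL '1' MINUTES,   -- window step
           INTERVAL '5' MINUTES))  -- max window size
GROUP BY window_start, window_end, id;
```

Each step yields a partial result for the current 5-minute cycle, which is the early firing mentioned above.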
Slide 36
Window Types - Session
A new window is created when the gap between consecutive event times exceeds the session gap
Ref: https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/table/sql/queries/window-tvf/#session
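A session window query might look like the sketch below, following the SESSION TVF syntax in the Flink 1.19 documentation referenced above. It reuses the sensors table from the watermark slide; the 5-minute gap is an assumed value.

```sql
-- One session per sensor id: a session ends when no reading for that id
-- arrives within 5 minutes (event time) of the previous one.
SELECT
  window_start,
  window_end,
  id,
  COUNT(*) AS reading_count
FROM TABLE(
  SESSION(TABLE sensors PARTITION BY id, DESCRIPTOR(_time), INTERVAL '5' MINUTES))
GROUP BY id, window_start, window_end;
```

Unlike tumble windows, the window boundaries here depend on the data, so window sizes vary from session to session.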
Slide 37
Window Join
● A window join adds the dimension of time into the join criteria themselves.
● Use case: compute click-through events
Slide 38
Example - hop window join
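The query on the original slide is not recoverable from the text, so the following is only a sketch of what a hop window join for the click-through use case could look like. The tables impressions(ad_id, impression_time) and clicks(ad_id, click_time) are hypothetical, each assumed to declare a watermark on its time column; the 1-minute slide and 5-minute size are also assumed.

```sql
-- Hypothetical tables: impressions(ad_id, impression_time) and
-- clicks(ad_id, click_time), each with a watermark on its time column.
SELECT
  i.ad_id,
  i.window_start,
  i.window_end,
  i.impression_time,
  c.click_time
FROM (
  SELECT * FROM TABLE(
    HOP(TABLE impressions, DESCRIPTOR(impression_time),
        INTERVAL '1' MINUTES, INTERVAL '5' MINUTES))
) AS i
JOIN (
  SELECT * FROM TABLE(
    HOP(TABLE clicks, DESCRIPTOR(click_time),
        INTERVAL '1' MINUTES, INTERVAL '5' MINUTES))
) AS c
-- A window join requires equality on both window_start and window_end.
ON  i.ad_id = c.ad_id
AND i.window_start = c.window_start
AND i.window_end = c.window_end;
```

Because hop windows overlap, the same impression and click pair can appear in more than one window.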
Slide 41
Temporal Join
● Enrich a stream with the value of the joined record as of the event time.
● Example: continuously compute the price of each order using the exchange rate in effect when the order was placed
Slide 42
Example - temporal join
Ref: https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/table/sql/queries/joins/#temporal-joins
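The slide's query did not survive extraction. The sketch below follows the event-time temporal join pattern from the referenced documentation. The tables orders(order_id, price, currency, order_time) and currency_rates(currency, conversion_rate, update_time) are assumptions: currency_rates is taken to be a versioned table with a primary key on currency and a watermark on update_time, and orders is taken to have a watermark on order_time.

```sql
-- For each order, look up the exchange rate that was valid
-- at the order's event time, not the latest rate.
SELECT
  o.order_id,
  o.order_time,
  o.price * r.conversion_rate AS converted_price
FROM orders AS o
LEFT JOIN currency_rates FOR SYSTEM_TIME AS OF o.order_time AS r
ON o.currency = r.currency;
```

Results for an order are emitted once the watermarks of both inputs have passed its order_time, so an idle rates stream can hold back output.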
Slide 45
Summary
● Event time is essential for timely response and analytics
● Watermarks and windowing are the key concepts
● Flink SQL simplifies event time processing
Slide 46
Thank you
Q&A
@sharon_rxie
Slide 47
Decodable Talks at Current 2024
Timing is Everything: Understanding
Event-Time Processing in Flink SQL
🗣 Sharon Xie
📆 Tuesday 4pm
🗺 Ballroom F
Data Contracts In Practice With
Debezium and Apache Flink
🗣 Gunnar Morling
📆 Tuesday 3pm
🗺 Meeting Room 18C
So You Want to Write a User-Defined
Function (UDF) for Flink?
🗣 Hans-Peter Grahsl
📆 Wednesday 1:30pm
🗺 Ballroom F
The Joy of JARs (and Other Flink SQL
Troubleshooting Tales)
🗣 Robin Moffatt
📆 Wednesday 3pm
🗺 Ballroom F