
When Streaming Needs Batch - Flink's Journey Towards a Unified Engine

A streaming application is started once and then continuously ingests endless, fairly steady streams of events. That's as far as the theory goes. Unfortunately, reality is more complicated. Over time your application's ability to process large historical data sets robustly, efficiently and correctly will be critical:
* For exploratory data analysis during development
* For bootstrapping the initial state of an application
* For back-filling following an outage or bugfix
* For keeping up with bursty input streams
These scenarios call for batch processing techniques. Apache Flink is as streaming-first as it gets. Yet over recent releases, the community has invested significant resources into unifying stream and batch processing on all layers of the stack, from the scheduler to the APIs. In this talk, I'll introduce Apache Flink's approach to unified stream and batch processing and discuss - by example - how these scenarios can already be addressed today and what might be possible in the future.

Konstantin Knauf

October 07, 2022


Transcript

  1. When Streaming Needs Batch - Flink's Journey Towards a Unified

    Engine Konstantin Knauf – @snntrable – Current 22
  2. About Me 2 • Co-Founder @ Immerok • Apache Flink

    Committer & Member of the PMC • Formerly Head of Product & Field Engineer @ Ververica Visit us at booth S14.
  4. Agenda Motivation CHECK Setting the Stage Catching Up & Handling

    Bursty Streams Backfilling Bootstrapping Takeaways
  5. Some Terminology: Nature of Data vs. Nature of Processing. Stream

    Processing handles both bounded and unbounded data; Batch Processing handles only bounded data.
  6. A Typical Streaming Job Apache Kafka* in/out * or any

    other replayable queue like Apache Pulsar/AWS Kinesis/…
  7. Scenario There is a large backlog of data and you

    want to catch up to real-time again. This happens, for example, when upstream producers send data in big chunks. Wishlist • process backlog quickly • process backlog robustly with existing resources Scenario 1
  9. Scaling Up under Backpressure Adaptive Scheduler and Reactive Mode (Flink

    1.13) • Job automatically scales up when provided with additional resources • No additional Savepoint needed for rescaling* * Caveat: This might still take quite some time during restore. Flink 1.16 brings some more improvements in that regard.
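Reactive Mode boils down to a single cluster configuration option. A sketch (valid for Application Mode deployments only; the surrounding deployment setup is assumed):

```yaml
# flink-conf.yaml: the job then rescales automatically whenever
# TaskManagers join or leave the cluster.
scheduler-mode: reactive
```

Scaling up then amounts to starting additional TaskManagers, for example by increasing the replica count of a Kubernetes deployment.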
  10. Robustness under Backpressure Watermark Alignment (Flink 1.15) [Figure: two

    sources, each advancing its watermark roughly with processing time]
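Since Flink 1.15, the allowed drift between sources can be capped directly on the WatermarkStrategy. A sketch, assuming a job with a hypothetical Event type and an already-built kafkaSource (not a complete runnable program; requires Flink on the classpath):

```java
WatermarkStrategy<Event> strategy =
    WatermarkStrategy.<Event>forBoundedOutOfOrderness(Duration.ofSeconds(5))
        // All sources in the group "alignment-group" pause reading from
        // splits whose watermark runs more than 20s ahead of the
        // group's minimum watermark.
        .withWatermarkAlignment("alignment-group", Duration.ofSeconds(20));

env.fromSource(kafkaSource, strategy, "events");
```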
  12. Robustness under Backpressure Unaligned Checkpoints (Flink 1.11-1.13) & Buffer Debloating

    (Flink 1.14) Under backpressure & at scale, checkpoint alignment can take hours, leading to checkpoint timeouts and job failures. • Buffer debloating dramatically reduces the amount of in-flight data • Unaligned checkpoints allow barriers to overtake in-flight data [Figure: a checkpoint barrier overtaking records queued in an operator's input buffers]
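Both mechanisms are plain configuration. A sketch of the relevant options (key names as of Flink 1.15; treat the exact keys as something to verify against your Flink version):

```yaml
# flink-conf.yaml
# Let checkpoint barriers overtake buffered in-flight data:
execution.checkpointing.unaligned: true
# Shrink network buffers dynamically so they hold roughly the amount of
# data the job can process within the configured target time:
taskmanager.network.memory.buffer-debloat.enabled: true
```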
  13. Scenario There is a large backlog of data and you

    want to catch up to real-time again. This happens, for example, when upstream producers send data in big chunks. Wishlist • process backlog quickly CHECK (Adaptive Scheduler) • process backlog robustly with existing resources CHECK (Buffer Debloating, Watermark Alignment, Unaligned Checkpoints) Scenario
  14. Scenario You want to reprocess a fixed amount of (recent)

    historical data to correct a bug or outage. Wishlist • Code-reuse for backfilling • Same semantics and complete & correct results • Resource efficient Scenario [1] https://www.youtube.com/watch?v=4qSlsYogALo&t=668s 1
  15. DataStream API with Streaming Execution All the elasticity and robustness

    improvements for processing under backpressure apply here, too.
  16. Scenario We want to reprocess a fixed amount of (recent)

    historical data to correct a bug or outage. Wishlist • Code-reuse for backfilling CHECK • Same semantics and complete & correct results CHECK • Resource efficient Evaluation Stream Execution Mode
  17. Batch Execution Mode Implementation Timeline • Apache Flink 1.12 ◦

    Initial Release ◦ Unified Sink API (beta) • Apache Flink 1.13 ◦ Support for Python DataStream API • Apache Flink 1.14 ◦ Batch Execution Mode for mixed DataStream/Table API programs ◦ Unified Sink API stable ◦ Unified Source API stable • Apache Flink 1.15 ◦ Most Sources/Sinks migrated to unified interfaces ◦ Adaptive Batch Scheduler
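Switching the same DataStream program to batch execution is a one-liner. A sketch (requires a Flink DataStream program around it; not runnable standalone):

```java
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Same program, but executed with blocking shuffles, backtracking fault
// tolerance and key-by-key processing instead of checkpointing:
env.setRuntimeMode(RuntimeExecutionMode.BATCH);
```

Alternatively, the mode can be chosen at submission time, leaving the jar untouched: flink run -Dexecution.runtime-mode=BATCH …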
  18. It all boils down to completeness & latency. Stream Processing:

    data is incomplete, and there are latency SLAs. Batch Processing: data is complete, and there are no latency SLAs.
  19. Batch Execution vs Stream Execution Fault Tolerance Object Store (S3,

    GCS, HDFS, …) Periodic Snapshots Stream Processing Checkpointing
  20. Batch Execution vs Stream Execution Fault Tolerance Object Store (S3,

    GCS, HDFS, …) Periodic Snapshots Stream Processing Checkpointing Batch Processing
  23. Batch Execution vs Stream Execution Fault Tolerance Object Store (S3,

    GCS, HDFS, …) Periodic Snapshots Stream Processing Checkpointing Batch Processing Backtracking Local Disk or External Shuffle Service
  24. Batch Execution vs Stream Execution Processing Order and State Stream

    Processing Keys are processed simultaneously.
  30. Batch Execution vs Stream Execution Processing Order and State

    Stream Processing Keys are processed simultaneously. Batch Processing Keys are processed one after another.
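The contrast can be illustrated outside of Flink: because batch input is bounded, it can be sorted by key (and by timestamp within each key), so all state for one key is finished before the next key starts. A minimal, self-contained sketch (plain Java, not Flink code; all names are made up):

```java
import java.util.*;

public class KeyOrderDemo {
    // Hypothetical event: a key, an event timestamp and a value.
    record Event(String key, long ts, int value) {}

    // Batch-style ordering: sort by key, then by event time. After the
    // sort, each key's events form one contiguous run, so per-key state
    // can be built up and discarded before the next key begins.
    static List<Event> batchOrder(List<Event> input) {
        List<Event> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparing(Event::key).thenComparingLong(Event::ts));
        return sorted;
    }

    public static void main(String[] args) {
        List<Event> arrivalOrder = List.of(
            new Event("b", 2, 5), new Event("a", 1, 2),
            new Event("a", 3, 6), new Event("b", 1, 2));
        // All of key "a" comes before any of key "b":
        System.out.println(batchOrder(arrivalOrder));
    }
}
```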
  33. Batch Execution vs Stream Execution Time • Does Processing Time

    make sense when processing historical data? ◦ Not really. ◦ All processing time timers fire at the end of the input. • Does historical data arrive out-of-order? ◦ No, as it is complete we can sort it by timestamp if needed. • Do watermarks make sense in batch processing? ◦ No, we don’t need them. There is no trade-off between latency and completeness. ◦ The watermark jumps from -∞ to +∞. All event time timers fire at the end of the input.
  34. Batch Execution vs Stream Execution Summary Stream Processing Batch Processing

    Data Exchange Mode: Pipelined (stream) vs. Blocking (batch). Fault Tolerance: Checkpointing (stream) vs. Backtracking (batch). Processing Order: all keys simultaneously (stream) vs. keys one after another (batch). Time: events processed out-of-order, event and processing time, watermarks (stream) vs. events processed in event-time order per key, only event time, no watermarks (batch).
  35. Scenario We want to reprocess a fixed amount of (recent)

    historical data to correct a bug or outage. Wishlist • Code-reuse for backfilling CHECK • Same semantics and complete & correct results CHECK • Resource efficient CHECK (Potential Caveat: Resource Consumption? See Uber Talk.) Evaluation Batch Execution Mode
  36. Scenario We want to process historical data (weeks, months, years)

    to build up the application's state before switching the application to real-time data. Wishlist • Code-reuse for bootstrapping • Different data source for bootstrapping • Resource efficient Scenario [1] https://www.youtube.com/watch?v=BTWntKy_MJs [2] https://www.youtube.com/watch?v=JQyfXEQqKeg [3] https://www.youtube.com/watch?v=JKndMiXphzw 1 2 3
  37. Bootstrapping with the Hybrid Source Hybrid Source automates switching of

    sources from historical data to real-time data within a single streaming Job. S3 Kafka hours retention years retention S3
  38. Bootstrapping with the Hybrid Source Hybrid Source automates switching of

    sources from historical data to real-time data within a single streaming Job. S3 Kafka hours retention years retention Unbounded Source Bounded Source S3
  39. Bootstrapping with the Hybrid Source Hybrid Source automates switching of

    sources from historical data to real-time data within a single streaming Job. S3 Kafka hours retention years retention S3 All the elasticity and robustness improvements for processing under backpressure apply here, too.
  40. Bootstrapping with the Hybrid Source Hybrid Source automates switching of

    sources from historical data to real-time data within a single streaming Job. S3 Kafka hours retention years retention End of Input Reached S3
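In code, the switch is declared once when the source is built. A sketch using the unified source interfaces (Flink 1.15 APIs; bucket, topic, broker names and switchTimestamp are placeholders, and it is not runnable without the Flink file and Kafka connectors):

```java
// Bounded part: years of history archived on S3.
FileSource<String> historical =
    FileSource.forRecordStreamFormat(
            new TextLineInputFormat(), new Path("s3://my-bucket/archive/"))
        .build();

// Unbounded part: the real-time stream, started where the archive ends.
KafkaSource<String> realTime =
    KafkaSource.<String>builder()
        .setBootstrapServers("broker:9092")
        .setTopics("events")
        .setStartingOffsets(OffsetsInitializer.timestamp(switchTimestamp))
        .build();

// HybridSource reads the bounded source to its end, then switches over.
HybridSource<String> source =
    HybridSource.builder(historical).addSource(realTime).build();

env.fromSource(source, WatermarkStrategy.forMonotonousTimestamps(), "events");
```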
  42. Scenario We want to process historical data (weeks, months, years)

    to build up the application's state before switching the application to real-time data. Wishlist • Code-reuse for bootstrapping CHECK • Different data source for bootstrapping CHECK • Resource efficient Evaluation Bootstrapping with Hybrid Source
  43. Bootstrapping with Batch Execution Savepoint /dev/null Bootstrapping Job With Batch

    Execution Separate Data Source Discarding Sink produces a final Savepoint S3
  44. Bootstrapping with Batch Execution Savepoint /dev/null Bootstrapping Job With Batch

    Execution Separate Data Source Discarding Sink Real-Time Job With Stream Execution produces a final Savepoint takes a final Savepoint as initial state S3 Pre-Release!
  45. Final Savepoints for Batch Jobs Next Steps 1. Still some

    limitations & open questions to address in the prototype 2. Publish FLIP & discuss with the Community 3. We are optimistic about Flink 1.17.
  46. Scenario We want to process historical data (weeks, months, years)

    to build up the application's state before switching the application to real-time data. Wishlist • Code-reuse for bootstrapping CHECK • Different data source for bootstrapping CHECK • Resource efficient CHECK Evaluation Bootstrapping with Batch Execution
  50. Takeaways • Just because you are streaming, doesn’t mean you

    can always avoid processing lots of data at once • Batch processing techniques are usually more resource efficient for this. • Apache Flink has done a lot recently to make sure those two processing modes work well together in real-world applications. • Final Savepoints for Batch Jobs is the last mile for Batch Execution in DataStream API.