Slide 1

Slide 1 text

Flink Forward 2023 ©
Apache Flink 2.0 Preview
Jark Wu & Xintong Song
Apache Flink PMC Member, Flink 2.0 Release Manager

Slide 2

Slide 2 text

Timeline
- 2014.04 – Flink entered the Apache Incubator
- 2014.12 – Flink graduated and became an Apache TLP
- 2016.03 – Flink 1.0 release: formally guarantees API backward compatibility
- 2023.10 – Flink 1.18 release: 19 versions in 7.5 years, ~5 months / version
- 2024 – Flink 2.0 release: first major version bump in 8.5 years
Why Flink 2.0? Why does it take so long?

Slide 3

Slide 3 text

Terminology
X . Y . Z release
- X – Major version number
- Y – Minor version number
- Z – Patch version number
Examples
- 1.18.0 – the 18th minor release in major version 1
- 1.17.1 – the first patch release for minor version 1.17
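The X.Y.Z scheme can be made concrete with a small parser. This is a minimal sketch, not part of Flink: the class `ReleaseVersion` and its `kind()` rule (Z > 0 means patch release, otherwise Y > 0 means minor release, otherwise major release) are assumptions made for illustration.

```java
/** Hypothetical helper (not a Flink class): an X.Y.Z version split into its parts. */
final class ReleaseVersion implements Comparable<ReleaseVersion> {
    enum Kind { MAJOR, MINOR, PATCH }

    final int major; // X
    final int minor; // Y
    final int patch; // Z

    ReleaseVersion(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    /** Parses "X.Y.Z"; rejects anything that does not have exactly three numeric parts. */
    static ReleaseVersion parse(String text) {
        String[] parts = text.trim().split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected X.Y.Z but got: " + text);
        }
        return new ReleaseVersion(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]));
    }

    /** Assumed rule: Z > 0 is a patch release, else Y > 0 is a minor release, else major. */
    Kind kind() {
        if (patch > 0) {
            return Kind.PATCH;
        }
        return minor > 0 ? Kind.MINOR : Kind.MAJOR;
    }

    /** Orders versions by major, then minor, then patch number. */
    @Override
    public int compareTo(ReleaseVersion other) {
        if (major != other.major) {
            return Integer.compare(major, other.major);
        }
        if (minor != other.minor) {
            return Integer.compare(minor, other.minor);
        }
        return Integer.compare(patch, other.patch);
    }

    @Override
    public String toString() {
        return major + "." + minor + "." + patch;
    }
}
```

With this sketch, `ReleaseVersion.parse("1.18.0").kind()` yields `MINOR` and `ReleaseVersion.parse("1.17.1").kind()` yields `PATCH`, matching the two examples on the slide.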

Slide 4

Slide 4 text

Minor Releases Planning
- Time-based releases – features not ready by the freeze date will be postponed
- Short release cycles – 4~5 months
Predictability | Quick Delivery & Feedback Cycle | Quality
Minor releases of the same major version are expected to be backward compatible.

Slide 5

Slide 5 text

API Compatibility Guarantees

Annotation         Major release       Minor release       Patch release
                   (Source / Binary)   (Source / Binary)   (Source / Binary)
@Public            x / x               ✓ / x               ✓ / ✓
@PublicEvolving    x / x               x / x               ✓ / ✓
@Experimental      x / x               x / x               x / x
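The guarantees in this table can be read as a two-key lookup. The sketch below encodes them in plain Java; the class and enum names are hypothetical stand-ins chosen for illustration, not Flink types. The real annotations live in `org.apache.flink.annotation`.

```java
/** Hypothetical encoding of the compatibility table above; not a Flink API. */
final class CompatibilityGuarantees {
    /** Stand-ins for @Public, @PublicEvolving and @Experimental. */
    enum Stability { PUBLIC, PUBLIC_EVOLVING, EXPERIMENTAL }

    /** The kind of upgrade: across a major, minor or patch release. */
    enum ReleaseKind { MAJOR, MINOR, PATCH }

    /** True when user code is expected to recompile unchanged across the upgrade. */
    static boolean sourceCompatible(Stability stability, ReleaseKind kind) {
        switch (stability) {
            case PUBLIC:
                return kind != ReleaseKind.MAJOR; // guaranteed across minor and patch releases
            case PUBLIC_EVOLVING:
                return kind == ReleaseKind.PATCH; // guaranteed across patch releases only
            default:
                return false; // @Experimental carries no guarantee
        }
    }

    /** True when already compiled user code is expected to keep working across the upgrade. */
    static boolean binaryCompatible(Stability stability, ReleaseKind kind) {
        // Per the table, binary compatibility is only promised across patch releases,
        // and never for @Experimental APIs.
        return stability != Stability.EXPERIMENTAL && kind == ReleaseKind.PATCH;
    }
}
```

For example, `sourceCompatible(Stability.PUBLIC, ReleaseKind.MINOR)` is true while `binaryCompatible(Stability.PUBLIC, ReleaseKind.MINOR)` is false, which corresponds to the "✓ / x" cell.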

Slide 6

Slide 6 text

Why Flink 2.0?
After a long time without a new major release, we are observing more and more issues.
- New features and improvements that involve breaking changes are blocked
- Developers hesitate to mark new APIs as stable (@Public)
Backward compatibility is deviating from its original intention.
Top priority of Flink 2.0 – Completing long-blocked features / improvements / technical debt payoffs / bug fixes that require breaking changes.

Slide 7

Slide 7 text

API Evolving Process
Scope – Programming API, Configuration, REST API, Metrics
@Experimental → @PublicEvolving → @Public → @Deprecated → Removal
- @Experimental → @PublicEvolving: 2 minor releases
- @PublicEvolving → @Public: 2 minor releases
- @Deprecated → Removal:
  - @Public – 2 minor releases + major version bump
  - @PublicEvolving – 1 minor release + minor version bump

Slide 8

Slide 8 text

Time Plan
- 2023.10 – Flink 1.18: Not enough time to carefully plan the API changes.
- 2024.2 – Flink 1.19: All @Public APIs that are planned to be broken in 2.0 need to be deprecated.
- 2024.6 – Flink 1.20: All @PublicEvolving APIs that are planned to be broken in 2.0 need to be deprecated.
- 2024.10 – Flink 2.0: Remove deprecated APIs. Replacements should be ready by then.
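The deprecation deadlines in this plan follow from the migration periods on the API Evolving Process slide. The sketch below checks that arithmetic. It assumes that the release in which an API is first marked @Deprecated counts toward its migration period, which is a reading of the slides rather than a quoted rule. `DeprecationPlan` and its methods are hypothetical names, not Flink APIs.

```java
/** Hypothetical check of the deprecation deadlines for the 1.x line; not a Flink API. */
final class DeprecationPlan {
    enum Stability { PUBLIC, PUBLIC_EVOLVING }

    /** Minor releases that must ship the API as deprecated before removal (from the slides). */
    static int requiredMinorReleases(Stability stability) {
        return stability == Stability.PUBLIC ? 2 : 1;
    }

    /**
     * Whether an API deprecated in 1.(deprecatedInMinor) may be removed in the next major
     * release, given that 1.(lastMinor) is the final minor release of the 1.x line.
     * Assumption: the release that introduces the deprecation counts toward the period.
     */
    static boolean removableInNextMajor(Stability stability, int deprecatedInMinor, int lastMinor) {
        int releasesShippedDeprecated = lastMinor - deprecatedInMinor + 1;
        return releasesShippedDeprecated >= requiredMinorReleases(stability);
    }
}
```

With 1.20 as the last 1.x release, a @Public API deprecated in 1.19 is removable in 2.0, one deprecated only in 1.20 is not, and a @PublicEvolving API deprecated in 1.20 is. This reproduces the 1.19 and 1.20 deadlines in the plan.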

Slide 9

Slide 9 text

On-going Discussions
- LTS Version
- Time-based Major Releases

Slide 10

Slide 10 text

Flink 2.0 Breaking Changes
APIs that will be removed
- All Scala APIs
- DataSet API
- SinkV1
- Legacy TableSource / TableSink
- Legacy SQL function stack
- Deprecated methods / fields / classes in DataStream / Table / REST API
- Deprecated config options and metrics
APIs that might be removed
- SourceFunction
- Queryable State
DataStream API will not be removed
We are working on a DataStream API V2 with better abstractions and no exposed internals, but it will not replace the current DataStream API (V1) any time soon.

Slide 11

Slide 11 text

Flink 2.0 Breaking Changes
Configuration
- Unified key-value form
- Comply with standard YAML format
- Clear effective scope
- Revisit all option types and default values
Various Improvements in Metrics & REST API
Java Support
- Support Java 17 and make it the default
- Drop support for Java 8 (and Java 11)
Serialization
- Pluggable custom serializers
- Make Kryo opt-in

Slide 12

Slide 12 text

Disaggregated State Management
- Require less local disk space
- Faster checkpointing and recovery
- Faster rescaling
[Diagram, before: Task, Local Disk, Remote Storage; labels: Read/Write, Backup]
[Diagram, after: Task, Local Disk (Cache), Remote Storage; labels: Read/Write, Get/Put]
Learn More: “Disaggregated State Management in Apache Flink 2.0”, Yuan Mei, 15:15 at Ecosystem Track
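The idea of treating the local disk as a cache in front of remote storage can be illustrated with a toy key-value store. This is only a sketch under stated assumptions: it uses in-memory maps as stand-ins for both tiers, a write-through policy, and LRU eviction. None of these choices are taken from the talk, and `CachedRemoteState` is a hypothetical class, not Flink's state backend.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model (not Flink code): remote storage is primary, the local tier is a bounded cache. */
final class CachedRemoteState {
    // Stand-in for remote storage; assumed to hold the full state.
    private final Map<String, String> remote = new HashMap<String, String>();
    // Stand-in for the local disk cache, evicting the least recently used entry.
    private final LinkedHashMap<String, String> localCache;
    // Counts cache misses that had to go to the remote tier.
    int remoteReads = 0;

    CachedRemoteState(final int cacheCapacity) {
        // accessOrder = true makes iteration order follow recency of access (LRU).
        this.localCache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > cacheCapacity;
            }
        };
    }

    /** Write-through (an assumption): update remote storage and the local cache together. */
    void put(String key, String value) {
        remote.put(key, value);
        localCache.put(key, value);
    }

    /** Read-through: serve from the cache when possible, otherwise fetch from remote. */
    String get(String key) {
        String cached = localCache.get(key);
        if (cached != null) {
            return cached;
        }
        remoteReads++;
        String value = remote.get(key);
        if (value != null) {
            localCache.put(key, value);
        }
        return value;
    }
}
```

Because the local tier only holds a bounded working set, its size no longer has to grow with total state size, which is the intuition behind the "less local disk space" bullet.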

Slide 13

Slide 13 text

Stream-Batch Unification
Unified API & Engine
- Overall windowing on keyed & non-keyed bounded streams
Batch Execution Improvements
- Adaptive query execution that leverages runtime statistics for dynamically deciding execution plan
- JM failover without restarting all tasks
- Integration of Hybrid Shuffle mode and Apache Celeborn

Slide 14

Slide 14 text

Stream-Batch Unification
Stream-Batch Mixed Execution
- Automatically choose execution mode according to data freshness
- Dynamically switching as needed
- Hide the execution details from users
[Diagram: Unified Application; Auto-Switch between Streaming (Low Latency) and Batch (High Throughput)]

Slide 15

Slide 15 text

Streaming Lakehouse
[Diagram: sources (DBMS, Logs); Flink Streaming & Batch stages; Paimon ODS, Paimon DWD, Paimon DWS; Data Serving Systems (ADS); Flink OLAP Queries]

Slide 16

Slide 16 text

Flink as a Unified SQL Platform
- Unified SQL: Stream-batch unified SQL semantics
- Window: Full window functionality based on Table-Valued Function (TVF)
- Join: Performance improvements for large-scale join
- Upgrade Experience: State compatibility of upgrading Flink version for SQL jobs

Slide 17

Slide 17 text

Thank you
Jark Wu
@jarkwu