Apache Flink 2.0 Preview

Jark Wu
November 14, 2023

This talk was given at Flink Forward Seattle 2023.

The Apache Flink community has been preparing for Flink 2.0 over the past couple of months. It will be the first major version bump since the release of Flink 1.0 in 2016. In this talk, I'll introduce the important changes planned for the 2.0 release and share the progress on preparing for it. The purpose of this talk is to help users and developers better understand what to expect from the upcoming major release, and to seek input and suggestions from the community.

Transcript

  1. Flink Forward 2023 © Apache Flink 2.0 Preview

    Jark Wu & Xintong Song
    Apache Flink PMC Member, Flink 2.0 Release Manager
  2. Timeline

    - 2014.04 – Flink entered Apache Incubator
    - 2014.12 – Flink graduated and became an Apache TLP
    - 2016.03 – Flink 1.0 release: formally guarantee API backwards compatibility
    - 2023.10 – Flink 1.18 release: 19 versions in 7.5 years, ~5 months / version
    - 2024 – Flink 2.0 release: first major version bump in 8.5 years

    Why Flink 2.0? Why does it take so long?
  3. Terminology

    X.Y.Z release
    - X – Major version number
    - Y – Minor version number
    - Z – Patch version number

    - 1.18.0 – the 18th minor release in major version 1
    - 1.17.1 – the first patch release for minor version 1.17
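
The X.Y.Z scheme on this slide can be illustrated with a small parser. This is only a sketch: `FlinkVersion`, `parse_version`, and `release_kind` are hypothetical helper names, not part of Flink, and the rule for classifying a release (patch if Z > 0, otherwise minor if Y > 0, otherwise major) is an assumption read off the two examples on the slide.

```python
from typing import NamedTuple


class FlinkVersion(NamedTuple):
    """Hypothetical container for an X.Y.Z version string."""
    major: int  # X
    minor: int  # Y
    patch: int  # Z


def parse_version(text: str) -> FlinkVersion:
    """Split 'X.Y.Z' into its three integer components."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected X.Y.Z, got {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return FlinkVersion(major, minor, patch)


def release_kind(version: FlinkVersion) -> str:
    """Classify a release (assumed rule, see lead-in)."""
    if version.patch > 0:
        return "patch"
    if version.minor > 0:
        return "minor"
    return "major"
```

Under these assumptions, `release_kind(parse_version("1.18.0"))` gives `"minor"` and `release_kind(parse_version("1.17.1"))` gives `"patch"`, matching the slide's two examples.
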
  4. Minor Releases Planning

    - Time-based releases – features not ready by the freeze date will be postponed
    - Short release cycles – 4~5 months

    Predictability, Quick Delivery & Feedback Cycle, Quality

    Minor releases of the same major version are expected to be backward compatible.
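  5. API Compatibility Guarantees

    | Annotation      | Major release (Source / Binary) | Minor release (Source / Binary) | Patch release (Source / Binary) |
    |-----------------|---------------------------------|---------------------------------|---------------------------------|
    | @Public         | x / x                           | ✓ / x                           | ✓ / ✓                           |
    | @PublicEvolving | x / x                           | x / x                           | ✓ / ✓                           |
    | @Experimental   | x / x                           | x / x                           | x / x                           |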
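
The guarantees on this slide can be encoded as a lookup table, which makes it easy to ask whether an upgrade is covered. This is a minimal sketch: the `GUARANTEES` dictionary and the `is_guaranteed` function are hypothetical names introduced for illustration, and the values are copied from the slide, with ✓ as `True` and x as `False`.

```python
# (source_compatible, binary_compatible) per annotation and release kind,
# transcribed from the slide's table.
GUARANTEES = {
    "@Public": {
        "major": (False, False),
        "minor": (True, False),
        "patch": (True, True),
    },
    "@PublicEvolving": {
        "major": (False, False),
        "minor": (False, False),
        "patch": (True, True),
    },
    "@Experimental": {
        "major": (False, False),
        "minor": (False, False),
        "patch": (False, False),
    },
}


def is_guaranteed(annotation: str, release_kind: str, kind: str = "source") -> bool:
    """Return whether `kind` ('source' or 'binary') compatibility is
    guaranteed for an API with `annotation` across a `release_kind` upgrade."""
    source_ok, binary_ok = GUARANTEES[annotation][release_kind]
    if kind == "source":
        return source_ok
    if kind == "binary":
        return binary_ok
    raise ValueError(f"unknown compatibility kind: {kind!r}")
```

For example, `is_guaranteed("@Public", "minor", "source")` is `True` while `is_guaranteed("@Public", "minor", "binary")` is `False`: code written against a `@Public` API should still compile after a minor upgrade, but may need recompiling.
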
  6. Why Flink 2.0?

    After a long time without a new major release, we are observing more and more issues:
    - New features and improvements that involve breaking changes are blocked
    - Developers hesitate to mark new APIs as stable (@Public)

    Backwards compatibility is deviating from its original intention.

    Top priority of Flink 2.0 – Completing long-blocked features / improvements / technical debt payoffs / bug-fixes that require breaking changes.
  7. API Evolving Process

    Scope – Programming API, Configuration, REST API, Metrics

    @Experimental → @PublicEvolving → @Public → @Deprecated → Removal
    (diagram labels: 2 minor releases, 2 minor releases)

    - @Public – 2 minor releases + major version bump
    - @PublicEvolving – 1 minor release + minor version bump
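
One way to read the two rules on this slide is as a check on when a deprecated API may be removed. The sketch below rests on assumptions that the slide does not spell out: that "N minor releases" means the API must ship as deprecated in at least N releases before the one that removes it, and that a `@Public` API can only be removed in a release with a higher major version. The names `MIGRATION_PERIOD`, `NEEDS_MAJOR_BUMP`, and `can_remove` are hypothetical, and the release sequence in the usage note is taken from the talk's time plan.

```python
from typing import Sequence

# Assumed reading of the slide: releases that must carry the deprecation.
MIGRATION_PERIOD = {"@Public": 2, "@PublicEvolving": 1}
# Assumed reading of the slide: whether removal must wait for a major bump.
NEEDS_MAJOR_BUMP = {"@Public": True, "@PublicEvolving": False}


def major(version: str) -> int:
    """Major component of an 'X.Y' release name."""
    return int(version.split(".")[0])


def can_remove(annotation: str, deprecated_in: str, removed_in: str,
               releases: Sequence[str]) -> bool:
    """Check whether removing an API in `removed_in` respects the rules,
    given the ordered list of `releases` and where it was deprecated."""
    start = releases.index(deprecated_in)
    end = releases.index(removed_in)
    carried = end - start  # releases that shipped the API as deprecated
    if carried < MIGRATION_PERIOD[annotation]:
        return False
    if NEEDS_MAJOR_BUMP[annotation]:
        return major(removed_in) > major(deprecated_in)
    return True
```

With `releases = ["1.18", "1.19", "1.20", "2.0"]`, a `@Public` API deprecated in 1.19 may be removed in 2.0, but one deprecated only in 1.20 may not; a `@PublicEvolving` API deprecated in 1.20 may be removed in 2.0.
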
  8. Time Plan

    - 2023.10, Flink 1.18 – Not enough time for carefully planning the API changes.
    - 2024.2, Flink 1.19 – All @Public APIs that are planned to be broken in 2.0 need to be deprecated.
    - 2024.6, Flink 1.20 – All @PublicEvolving APIs that are planned to be broken in 2.0 need to be deprecated.
    - 2024.10, Flink 2.0 – Remove deprecated APIs. Replacements should have been ready.
  9. Flink 2.0 Breaking Changes

    APIs will be removed
    - All Scala APIs
    - DataSet API
    - SinkV1
    - Legacy TableSource / TableSink
    - Legacy SQL function stack
    - Deprecated methods / fields / classes in DataStream / Table / REST API
    - Deprecated config options and metrics

    APIs might be removed
    - SourceFunction
    - Queryable State

    DataStream API will not be removed
    We are working on a DataStream API V2 with better abstraction and no exposure of internals, but it will not replace the current DataStream API (V1) any time soon.
  10. Flink 2.0 Breaking Changes

    Configuration
    - Unified key-value form
    - Comply with standard YAML format
    - Clear effective scope
    - Revisit all option types and default values

    Various Improvements in Metrics & REST API

    Java Support
    - Support and default Java 17
    - Drop support for Java 8 (and Java 11)

    Serialization
    - Pluggable custom serializers
    - Make Kryo opt-in
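
To make the relationship between a standard (nested) YAML document and flat dotted option keys concrete, here is a small sketch. It is not Flink's configuration parser and does not claim to show how Flink 2.0 loads its config; `flatten` is a hypothetical helper, and the nested input stands in for what a YAML library would produce after parsing. The option names in the usage note are existing Flink options, used only as sample keys.

```python
from typing import Any, Dict


def flatten(nested: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn a nested mapping into flat 'a.b.c' keys (illustrative only)."""
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections, extending the dotted prefix.
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat
```

For instance, `flatten({"taskmanager": {"numberOfTaskSlots": 2}, "parallelism": {"default": 4}})` yields the flat keys `taskmanager.numberOfTaskSlots` and `parallelism.default`.
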
  11. Disaggregated State Management

    - Require less local disk space
    - Faster checkpointing and recovery
    - Faster rescaling

    [Diagram labels: Task, Local Disk, Remote Storage (Backup, Read/Write); Task, Local Disk (Cache), Remote Storage (Read/Write, Get/Put)]

    Learn More: “Disaggregated State Management in Apache Flink 2.0”, Yuan Mei, 15:15 at Ecosystem Track
  12. Stream-Batch Unification

    - Overall windowing on keyed & non-keyed bounded streams
    - Adaptive query execution that leverages runtime statistics for dynamically deciding execution plan
    - JM failover without restarting all tasks
    - Integration of Hybrid Shuffle mode and Apache Celeborn

    Unified API & Engine / Batch Execution Improvements
  13. Stream-Batch Unification

    Stream-Batch Mixed Execution
    - Automatically choose execution mode according to data freshness
    - Dynamically switching as needed
    - Hide the execution details from users

    [Diagram labels: Unified Application; Streaming (Low Latency); Batch (High Throughput); Auto-Switch]
  14. Streaming Lakehouse

    [Diagram labels: DBMS and Logs; Paimon ODS, Paimon DWD, Paimon DWS; Flink Streaming & Batch (three times); Data Serving Systems (ADS); Flink OLAP Queries]
  15. Flink as a Unified SQL Platform

    - Unified SQL – Stream-batch unified SQL semantics
    - Window – Full window functionality based on Table-Valued Function (TVF)
    - Join – Performance improvements for large-scale join
    - Upgrade Experience – State compatibility of upgrading Flink version for SQL jobs