Apache Flink 2.0 Preview

Jark Wu
November 14, 2023

This talk was given at Flink Forward Seattle 2023.

The Apache Flink community has been preparing for Flink 2.0 over the past couple of months. This will be the first major version bump since the release of Flink 1.0 in 2016. In this talk, I’ll introduce the important changes planned for the 2.0 release and share the progress on preparing for it. The purpose of this talk is to help users and developers better understand what to expect from the upcoming major release, and to seek input and suggestions from the community.

Transcript

  1. Flink Forward 2023 ©
    Apache Flink 2.0 Preview
    Jark Wu & Xintong Song
    Apache Flink PMC Member, Flink 2.0 Release Manager

  2. Timeline
    2014 – Flink entered the Apache Incubator (2014.04), then graduated and became an Apache TLP (2014.12)
    2016.03 – Flink 1.0 release: formal guarantee of API backwards compatibility
    2023.10 – Flink 1.18 release: 19 versions in 7.5 years, ~5 months / version
    2024 – Flink 2.0 release: first major version bump in 8.5 years
    Why Flink 2.0?
    Why has it taken so long?

  3. Terminology
    X.Y.Z release
    X – Major version number
    Y – Minor version number
    Z – Patch version number
    1.18.0 – the 18th minor release in major version 1
    1.17.1 – the first patch release for minor version 1.17
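The X.Y.Z scheme above can be illustrated with a small parser. This is only a sketch: the class `ReleaseVersion` and its methods are hypothetical names made up for this example and are not part of Flink.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Flink class): splits "X.Y.Z" into its three parts.
final class ReleaseVersion {
    private static final Pattern XYZ = Pattern.compile("(\\d+)\\.(\\d+)\\.(\\d+)");

    final int major; // X
    final int minor; // Y
    final int patch; // Z

    ReleaseVersion(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    static ReleaseVersion parse(String text) {
        Matcher m = XYZ.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Expected X.Y.Z but got: " + text);
        }
        return new ReleaseVersion(
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)));
    }

    // Z == 0 is the initial release of a minor version; Z > 0 is a patch release.
    String describe() {
        String minorVersion = major + "." + minor;
        return patch == 0
                ? "first release of minor version " + minorVersion
                : "patch release " + patch + " for minor version " + minorVersion;
    }

    public static void main(String[] args) {
        System.out.println(parse("1.18.0").describe());
        System.out.println(parse("1.17.1").describe());
    }
}
```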

  4. Minor Releases Planning
    Time-based releases – features not ready by the freeze date will be postponed
    Short release cycles – 4~5 months
    Predictability
    Quick Delivery & Feedback Cycle
    Quality
    Minor releases of the same major version are expected to be backward compatible.

  5. API Compatibility Guarantees
    Annotation        Major release      Minor release      Patch release
                      (Source / Binary)  (Source / Binary)  (Source / Binary)
    @Public           x / x              ✓ / x              ✓ / ✓
    @PublicEvolving   x / x              x / x              ✓ / ✓
    @Experimental     x / x              x / x              x / x
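The guarantee table can also be read as a lookup from (annotation, release type) to a pair of source and binary flags. The sketch below encodes exactly the cells shown on the slide; `CompatibilityMatrix` and its nested types are hypothetical names for illustration, not Flink classes.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical encoding of the slide's table (not a Flink class).
final class CompatibilityMatrix {
    enum Stability { PUBLIC, PUBLIC_EVOLVING, EXPERIMENTAL }
    enum ReleaseType { MAJOR, MINOR, PATCH }

    // One table cell: is source / binary compatibility guaranteed?
    static final class Guarantee {
        final boolean source;
        final boolean binary;

        Guarantee(boolean source, boolean binary) {
            this.source = source;
            this.binary = binary;
        }

        @Override
        public String toString() {
            return "source=" + source + ", binary=" + binary;
        }
    }

    private static final Guarantee NONE = new Guarantee(false, false);
    private static final Guarantee SOURCE_ONLY = new Guarantee(true, false);
    private static final Guarantee BOTH = new Guarantee(true, true);

    private static final Map<Stability, Map<ReleaseType, Guarantee>> TABLE =
            new EnumMap<>(Stability.class);

    static {
        // Columns: major release, minor release, patch release.
        row(Stability.PUBLIC, NONE, SOURCE_ONLY, BOTH);
        row(Stability.PUBLIC_EVOLVING, NONE, NONE, BOTH);
        row(Stability.EXPERIMENTAL, NONE, NONE, NONE);
    }

    private static void row(Stability s, Guarantee major, Guarantee minor, Guarantee patch) {
        Map<ReleaseType, Guarantee> cells = new EnumMap<>(ReleaseType.class);
        cells.put(ReleaseType.MAJOR, major);
        cells.put(ReleaseType.MINOR, minor);
        cells.put(ReleaseType.PATCH, patch);
        TABLE.put(s, cells);
    }

    static Guarantee lookup(Stability s, ReleaseType r) {
        return TABLE.get(s).get(r);
    }

    public static void main(String[] args) {
        System.out.println(lookup(Stability.PUBLIC, ReleaseType.MINOR));
    }
}
```

Reading the matrix this way makes the motivation for a major release visible: no annotation promises anything across a major version bump, so that is the only point where @Public APIs can change.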

  6. Why Flink 2.0?
    After a long time without a new major release, we are observing more and more issues.
    - New features and improvements that involve breaking changes are blocked
    - Developers hesitate to mark new APIs as stable (@Public)
    Backwards compatibility is deviating from its original intention.
    Top priority of Flink 2.0 – Completing long-blocked features / improvements / technical debt payoffs / bug-fixes that require breaking changes.

  7. API Evolving Process
    Scope – Programming API, Configuration, REST API, Metrics
    Graduation: @Experimental → @PublicEvolving → @Public, 2 minor releases per step
    Retirement: @Deprecated → Removal
    - @Public – 2 minor releases + major version bump
    - @PublicEvolving – 1 minor release + minor version bump
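The removal rules can be sketched as a small check. This is one possible reading of the slide, not the community's authoritative definition: it assumes "N minor releases" means the API must ship as deprecated in at least N releases before it is removed, and that @Public APIs may only disappear at a major version bump. `RemovalRule`, `mayRemove`, and the release list are hypothetical names used only for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the deprecation-to-removal rule (not a Flink class).
final class RemovalRule {
    enum Stability { PUBLIC, PUBLIC_EVOLVING }

    // "1.19" -> 1
    static int major(String version) {
        return Integer.parseInt(version.substring(0, version.indexOf('.')));
    }

    /**
     * @param releases ordered release line, e.g. 1.18, 1.19, 1.20, 2.0
     * @return true if an API deprecated in deprecatedIn may be removed in removeIn
     */
    static boolean mayRemove(
            Stability stability, String deprecatedIn, String removeIn, List<String> releases) {
        int dep = releases.indexOf(deprecatedIn);
        int rem = releases.indexOf(removeIn);
        if (dep < 0 || rem < 0 || rem <= dep) {
            return false; // unknown version, or removal not after deprecation
        }
        // Assumption: number of releases that shipped the API as deprecated.
        int shippedDeprecated = rem - dep;
        boolean majorBump = major(removeIn) > major(deprecatedIn);
        if (stability == Stability.PUBLIC) {
            return shippedDeprecated >= 2 && majorBump;
        }
        return shippedDeprecated >= 1; // @PublicEvolving: any later minor or major release
    }

    public static void main(String[] args) {
        List<String> line = Arrays.asList("1.18", "1.19", "1.20", "2.0");
        System.out.println(mayRemove(Stability.PUBLIC, "1.19", "2.0", line));
        System.out.println(mayRemove(Stability.PUBLIC, "1.20", "2.0", line));
        System.out.println(mayRemove(Stability.PUBLIC_EVOLVING, "1.20", "2.0", line));
    }
}
```

Under this reading, a @Public API has to be deprecated no later than 1.19 to be removable in 2.0, while a @PublicEvolving API can still be deprecated in 1.20.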

  8. Time Plan
    2023.10 – Flink 1.18: Not enough time to carefully plan the API changes.
    2024.2 – Flink 1.19: All @Public APIs that are planned to be broken in 2.0 need to be deprecated.
    2024.6 – Flink 1.20: All @PublicEvolving APIs that are planned to be broken in 2.0 need to be deprecated.
    2024.10 – Flink 2.0: Remove deprecated APIs. Replacements should have been ready.

  9. On-going Discussions
    - LTS Version
    - Time-based Major Releases

  10. Flink 2.0 Breaking Changes
    APIs that will be removed
    - All Scala APIs
    - DataSet API
    - SinkV1
    - Legacy TableSource / TableSink
    - Legacy SQL function stack
    - Deprecated methods / fields / classes in DataStream / Table / REST API
    - Deprecated config options and metrics
    APIs that might be removed
    - SourceFunction
    - Queryable State
    The DataStream API will not be removed
    We are working on a DataStream API V2 with better abstractions and no exposed internals, but it will not replace the current DataStream API (V1) any time soon

  11. Flink 2.0 Breaking Changes
    Configuration
    - Unified key-value form
    - Comply with standard YAML format
    - Clear effective scope
    - Revisit all option types and default values
    Various Improvements in Metrics & REST API
    Java Support
    - Support Java 17 and make it the default
    - Drop support for Java 8 (and Java 11)
    Serialization
    - Pluggable custom serializers
    - Make Kryo opt-in

  12. Disaggregated State Management
    - Require less local disk space
    - Faster checkpointing and recovery
    - Faster rescaling
    [Diagram: before, Task → Local Disk (Read/Write) → Remote Storage (Backup); after, Task → Local Disk (Cache) and Remote Storage (Read/Write, Get/Put)]
    Learn more: “Disaggregated State Management in Apache Flink 2.0”, Yuan Mei, 15:15 at Ecosystem Track

  13. Stream-Batch Unification
    - Overall windowing on keyed & non-keyed bounded streams
    - Adaptive query execution that leverages runtime statistics for dynamically deciding
    execution plan
    - JM failover without restarting all tasks
    - Integration of Hybrid Shuffle mode and Apache Celeborn
    Slide section headings: Unified API & Engine / Batch Execution Improvements

  14. Stream-Batch Unification
    - Automatically choose execution mode
    according to data freshness
    - Dynamically switching as needed
    - Hide the execution details from users
    Slide section headings: Unified Application / Stream-Batch Mixed Execution
    [Diagram: Streaming (Low Latency) ⇄ Batch (High Throughput), Auto-Switch]

  15. Streaming Lakehouse
    [Diagram: DBMS and Logs feed Paimon ODS → Paimon DWD → Paimon DWS → ADS (Data Serving Systems); Flink Streaming & Batch jobs connect the layers, alongside Flink OLAP Queries]

  16. Flink as a Unified SQL Platform
    - Unified SQL: Stream-batch unified SQL semantics
    - Window: Full window functionality based on Table-Valued Function (TVF)
    - Join: Performance improvements for large-scale join
    - Upgrade Experience: State compatibility when upgrading the Flink version for SQL jobs

  17. Thank you
    Jark Wu
    @jarkwu
