
A Year in Flink: The most important changes of the last versions

Another year has passed and the Flink community was very busy, creating three new major releases containing more than 7000 commits and 3000 resolved issues. The included changes range from batch execution and the reactive mode to support for stateful Python UDFs. As a community member, it can be an increasingly difficult task to stay on top of all these developments and to understand what benefits the individual features bring. In this talk, I want to walk you through Flink’s most important new features which have been completed in the past year. I will explain the benefits and limitations of the new additions and tell you how they fit into the bigger picture. Moreover, I want to give you an outlook on what’s coming with the next Flink releases.

Till Rohrmann

October 27, 2021

  1. © 2021 Ververica A Year in Flink: The Most Important Changes of The Last Versions
     Till Rohrmann, Senior Staff Engineer, stsffap
  2. © 2021 Ververica A Year in Flink in Numbers: what happened in the past year?
     Releases: Flink 1.12 (December 2020), Flink 1.13 (May 2021), Flink 1.14 (September 2021)
     • Commits: ~6300
     • LOC: +2416733/-1935420
     • Contributors: ~500
     • Resolved issues: >3400
     • Finished FLIPs: >40
     • One of the most active Apache projects
       ─ Most active mailing lists among big data projects
       ─ Top 2 most active source visits
       ─ Top 2 most commits
  3. © 2021 Ververica Shipped Features: far too many to list them all
     Categories: batch-streaming unification, SQL, connectors, operations & observability
     Hybrid sources, new source/sink interfaces, pipelined region scheduling, buffer debloating, reactive mode, fine-grained resource management, unaligned checkpoints, unified savepoint format, Scala-free runtime, standardized connector metrics, Pulsar connector, JDBC exactly-once sink, checkpointing of bounded streams, batch execution mode in DataStream, Hive query syntax compatibility, Table-DataStream interoperability, windows via table-valued functions, temporal table joins, metadata handling in SQL connectors, upsert Kafka connector, flamegraphs
  4. © 2021 Ververica Batch & Streaming Unification: one system to rule them all!
     Motivation:
     • Ease of use by having a single API for both workloads
     • Consistent semantics for batch and streaming jobs
     • Real-world streaming applications often require batch processing
       ─ Reprocessing, reinstatement, bootstrapping
     • Batch execution can be done more efficiently
     • Lower operational costs because of a single system
  5. © 2021 Ververica Soft Deprecation of the DataSet API
     Batch execution mode in the DataStream API:
     • The DataStream API can execute bounded jobs in batch execution mode (FLIP-134)
       ─ More efficient execution
     • execution.runtime-mode:
       ─ STREAMING (default)
       ─ BATCH
       ─ AUTOMATIC (based on the boundedness of the sources)
     Goal:
     • SQL/Table API will become the main API for analytical tasks
     • The DataStream API offers lower-level primitives
     • Seamless switching between SQL and the DataStream API (FLIP-136)
     • The DataSet API will soon be deprecated (FLIP-131)
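The execution mode can be set in the cluster configuration, per job submission, or in code; a minimal sketch:

```yaml
# flink-conf.yaml: run bounded DataStream jobs with the batch runtime
execution.runtime-mode: BATCH
```

The same switch is available per job on the command line (`bin/flink run -Dexecution.runtime-mode=BATCH ...`) or programmatically via `StreamExecutionEnvironment#setRuntimeMode(RuntimeExecutionMode.BATCH)`; with `AUTOMATIC`, Flink picks the mode from the boundedness of the job's sources.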
  6. © 2021 Ververica New Sources and Sinks: a unified interface that supports streaming and batch execution
     • Building a common source/sink stack that is used by all APIs and execution modes
     New source interface (FLIP-27):
     • The new source splits its work into split discovery and reading
       ─ Boundedness is a property of the discovery
     • More shared building blocks for common functionality (event-time alignment, per-partition watermarks, etc.)
     • Faster checkpoints
     New sink interface (FLIP-143):
     • The new sink splits its work into writing and committing
     • Reusable building blocks
     [Diagram: the old source and sink are black boxes; from Flink >= 1.12 the new source consists of an enumerator plus readers, and the new sink of writers, committers, and a global committer]
  7. © 2021 Ververica HybridSource: Bridging the Gap Between Now and the Past
     Processing historical and live data in one job:
     • HybridSource (FLIP-150) allows reading sequentially from heterogeneous sources
     • Bootstrapping an online machine learning model with historic data and then switching to live data
     • Consuming a CDC stream consisting of a snapshot stored on S3 and the online changelog stored in Kafka
     • The switch time can be static or derived dynamically
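The sequential-read pattern can be sketched with the HybridSource builder in Java. This is a hedged sketch, not a complete program: `snapshotSource` and `liveSource` are assumed to be already-constructed FLIP-27 sources (e.g. a bounded FileSource over the S3 snapshot and an unbounded KafkaSource for the changelog), and the snippet needs the Flink connector dependencies to compile:

    // Sketch (FLIP-150): consume the bounded historic source first,
    // then switch over to the unbounded live source.
    HybridSource<String> source =
        HybridSource.builder(snapshotSource)   // 1. bounded, e.g. files on S3
                    .addSource(liveSource)     // 2. unbounded, e.g. Kafka
                    .build();

    DataStream<String> stream =
        env.fromSource(source, WatermarkStrategy.noWatermarks(), "cdc-stream");

By default the switch happens when the bounded source reports that it is finished; a dynamic switch time can be supplied via a source factory when adding the follow-up source.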
  8. © 2021 Ververica Support for Checkpoints of Bounded Streams
     Checkpointing partially finished topologies:
     • Bounded sources terminate → the topology can be partially finished
     • Checkpointing is required to recover from faults and to commit results exactly-once
     • Checkpointing needs to be possible with finished operators ⇒ FLIP-147
     • Allows mixing bounded and unbounded streams :-)
     • Terminates the job with a final checkpoint → makes sure that all data is committed
  9. © 2021 Ververica SQL Improvements: how to make analytics even better
     • SQL and the Table API will become the standard API for analytics with Flink
     • The community works on completing existing capabilities and adding new functionality
     • Strong focus on usability
     • New features are added on a monthly basis
     • The legacy Flink SQL engine has been dropped with Flink 1.14 (FLINK-14437)
  10. © 2021 Ververica Temporal Table Joins: how to easily handle time
      • Allows enriching data with changing metadata
      • Joins a table against a versioned table with respect to time
      • FLIP-132

      SELECT order_id, price * conversion_rate, order_time
      FROM orders
      LEFT JOIN currency_rates FOR SYSTEM_TIME AS OF orders.order_time
      ON orders.currency = currency_rates.currency

      currency_rates:                        orders:              result:
      Timestamp  Currency  Conversion Rate   1, 11:00, 100, ¥     1, 11:00, 13, €
      11:00      ¥         0.13              2, 12:00, 200, ¥     2, 12:00, 22, €
      12:00      ¥         0.11              3, 13:00,  50, ¥     3, 13:00,  7.5, €
      13:00      ¥         0.15
  11. © 2021 Ververica Table-Valued Functions For The Win
      Expressing windows intuitively: time windows can now easily be expressed using table-valued functions (TVFs) (FLIP-145)

      SELECT * FROM TABLE(
        TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES));

      SELECT * FROM TABLE(
        HOP(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '5' MINUTES, INTERVAL '10' MINUTES));

      SELECT * FROM TABLE(
        CUMULATE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '2' MINUTES, INTERVAL '10' MINUTES));
  12. © 2021 Ververica Bringing Table API/SQL and DataStream Together
      Improved interoperability between Table API/SQL and DataStream:
      • Automatic type conversion
      • Consistent event-time configuration and watermarks
      • Enhanced Row class for easier access
      • FLIP-136
      • More details: Flink's Table & DataStream API: A Perfect Symbiosis (October 27, 1:30 PM PST)

      Table table = tableEnv.fromDataStream(
          dataStream,
          Schema.newBuilder()
              .columnByMetadata("rowtime", "TIMESTAMP(3)")
              .watermark("rowtime", "SOURCE_WATERMARK()")
              .build());

      DataStream<Row> dataStream = tableEnv.toDataStream(table)
          .keyBy(r -> r.getField("user"))
          .window(...);
  13. © 2021 Ververica Polygamy in the Flink World: Flink loves them all
      • Flink is storage-system agnostic and works with many different systems
      • Flink as a stream processing engine needs data to do its job ⇒ connectors are of paramount importance
  14. © 2021 Ververica Strengthening the Connector Ecosystem
      Flink is only as strong as its connector ecosystem:
      • The community is strengthening the connector ecosystem
        ─ New source/sink interfaces to build upon
        ─ Unified interfaces for streaming and batch execution
        ─ Reusable building blocks
      • Existing connectors are being ported to the new interfaces
      • Building a vibrant connector ecosystem with frequent releases and easy contributions
      • More information: Integrating Flink into your ecosystem - How to build a Flink connector from scratch (October 26, 1:30 PM CEST)
  15. © 2021 Ververica The Latest Additions to the Connector Ecosystem
      Who are the new kids on the block?
      • Exactly-once JDBC sink (FLINK-15578)
        ─ Exactly-once processing guarantees for XA-compliant databases
      • Pulsar connector (FLINK-20726)
        ─ Supports streaming and batch execution mode
        ─ Exactly-once processing guarantees using Pulsar’s transactions (>= 2.8.0)
      • CDC connectors
        ─ Use Debezium to capture changes from various databases
        ─ Exactly-once processing guarantees
        ─ No need to deploy Debezium
      • Iceberg
        ─ Use Flink to process data stored in Iceberg
      • Standardized connector metrics (FLIP-33)
  16. © 2021 Ververica Making it Easy to Run a Complex System
      It is important to understand what’s going on:
      • Operating Flink can be a daunting task because there are many knobs to tune → improve the observability of Flink and its jobs
      • Fast and predictable checkpoints are required for low-latency results → improve the checkpointing algorithm
      • Long-running jobs face changing workloads → make it easy to rescale Flink
  17. © 2021 Ververica Where am I Backpressured?
      Understanding what your Flink job is doing:
      • Bottleneck detection (FLINK-14712)
        ─ Uses mailbox metrics to reliably assess whether a task is busy, idling, or back-pressured
  18. © 2021 Ververica Flamegraphs: it’s getting hot in here
      • Understanding where your job spends the most time (FLINK-13550)
        ─ Flink can generate flamegraphs for analysis
      • rest.flamegraph.enabled: true
  19. © 2021 Ververica More Observability Features
      • Access latency metrics for state (FLINK-21736)
        ─ Helps uncover potential misconfigurations
        ─ state.backend.rocksdb.latency-track-enabled: true
      • Exception history in the web UI (FLINK-6042)
        ─ Shows all exceptions that occurred
        ─ Groups concurrent exceptions together
  20. © 2021 Ververica Canonical Savepoint Format: making it easier to run Flink jobs
      • Savepoints are written in a canonical format (FLIP-41)
      • Users can change the state backend when resuming from a savepoint
        ─ E.g. start with the HashMapStateBackend and then switch to RocksDB as the state size grows
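The backend switch itself needs no conversion step: configure the new backend and resume from the savepoint. A minimal sketch (the savepoint path and job jar are placeholders):

```yaml
# flink-conf.yaml of the restarted job: pick the new state backend
state.backend: rocksdb
```

```shell
# resume the job from the canonical-format savepoint
bin/flink run -s s3://bucket/savepoints/savepoint-1234 myJob.jar
```

Because the savepoint is backend-independent, the same command works in the other direction as well, e.g. moving from RocksDB back to the HashMapStateBackend.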
  21. © 2021 Ververica Unaligned Checkpoints: reliable checkpoints under backpressure
      Aligned checkpoints:
      • Checkpoint barriers flow with the stream and cannot overtake records
      • Backpressure strongly affects the checkpoint duration
      Unaligned checkpoints (FLIP-76):
      • Checkpoint barriers can overtake records (which must then be included in the checkpoint)
      • Decouples checkpointing from the end-to-end latency of the job → more reliable checkpoint times
      • Higher I/O load because of additional writes
  22. © 2021 Ververica Buffer Debloating
      Minimizing in-flight data while keeping the network connections saturated:
      • Network memory buffers records to keep network connections fully utilized
      • All buffered records need to be processed before a checkpoint can complete
      • The more buffered records, the longer the checkpoint takes
      • Ideally, Flink adjusts memory to minimize in-flight records while keeping connections fully utilized
      Buffer debloating (FLIP-183):
      • Dynamically adjusts memory with respect to the consumer’s throughput
      • Keeps as many bytes buffered as can be processed in X ms
      • Stable and predictable checkpoint times under backpressure
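The sizing rule ("keep as many bytes as can be processed in X ms") can be illustrated with a small sketch. This is not Flink's implementation; the function name, default target, and bounds are illustrative assumptions (in Flink 1.14 the real feature is toggled with `taskmanager.network.memory.buffer-debloat.enabled`):

```python
# Illustrative sketch of the buffer-debloating idea (FLIP-183):
# size buffers by the consumer's measured throughput, so that only
# ~target_ms worth of data is in flight, instead of a fixed memory amount.

def debloated_buffer_size(throughput_bytes_per_sec: float,
                          target_ms: float = 1000.0,
                          min_bytes: int = 256,
                          max_bytes: int = 32 * 1024) -> int:
    """Bytes to buffer so roughly target_ms worth of data is in flight."""
    target = int(throughput_bytes_per_sec * target_ms / 1000.0)
    # Clamp so slow or bursty consumers still make progress and fast
    # consumers do not grow buffers unboundedly.
    return max(min_bytes, min(max_bytes, target))

# A fast consumer (1 MB/s) is capped at the 32 KiB maximum, while a slow
# consumer (1 KB/s) shrinks its buffers to ~1000 bytes, so checkpoint
# barriers no longer queue behind megabytes of buffered records.
print(debloated_buffer_size(1_000_000))  # 32768
print(debloated_buffer_size(1_000))      # 1000
```

The key property is the last one: under backpressure the throughput drops, the buffers shrink with it, and the time a barrier spends queued behind in-flight data stays roughly constant.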
  23. © 2021 Ververica Outlook: Changelog State Backend (FLIP-158)
      Generalized incremental checkpoints:
      • State changes are written to a changelog and to tables
      • The changelog is continuously persisted → a checkpoint can be acknowledged as soon as the changelog is persisted
      • Tables represent a read-optimized view of the state changes
      • The changelog is truncated whenever tables are uploaded (asynchronously) → faster and more reliable checkpoints
      • More information about Flink’s checkpointing: Rundown of Flink's Checkpoints (October 27, 12:50 PST)
  24. © 2021 Ververica Elastic Jobs: how to react to changing workloads?
      • Long-running streaming applications will eventually face changing workloads
      • Risk of over- or under-provisioning
      • Ideally, Flink would adjust resources based on the workload
  25. © 2021 Ververica The Reactive Mode: making it easy to scale Flink jobs up/down
      • Flink reacts to the available set of TaskManagers and makes use of all slots (FLIP-159)
      • If the number of TaskManagers increases, the job scales up
      • If the number of TaskManagers decreases, the job scales down
      • An external tool that monitors resource usage and controls the number of TaskManagers can implement auto-scaling
      • More details: Demystifying Deployments: Applications or Clusters, Active and Reactive Scaling - What is it about? (October 27, 11:20 PST)
      [Diagram: a JobManager first running on a single TaskManager, then scaling out to two TaskManagers]
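Reactive mode is enabled purely through configuration; the scaling decision itself is made by whatever controls the deployment (e.g. a Kubernetes horizontal scaler), outside of Flink. A minimal sketch:

```yaml
# flink-conf.yaml: adapt parallelism to however many TaskManagers show up
scheduler-mode: reactive
```

Note that in Flink 1.13/1.14 reactive mode only supports standalone application-mode deployments (one job per cluster); adding or removing TaskManager processes is then all it takes to rescale the job.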
  26. © 2021 Ververica Wrap Up: what to take home
      • Flink is a truly unified stream and batch processor
      • Flink’s SQL has significantly matured and is even more expressive than before
      • Flink’s diverse connector ecosystem is steadily growing
      • Improved operations and observability make it a lot easier to run Flink
      • Flink creates checkpoints faster and more reliably for fast exactly-once jobs
      • Flink can easily be rescaled to adapt to changing workloads
      • The Flink community continues to innovate. Join us on the journey!
  27. © 2021 Ververica We are hiring!
      ❏ Technical Lead Distributed Data Management Systems
      ❏ Junior Engineer Data Intensive Systems
      ❏ Senior Engineer Data Intensive Systems
      ❏ Solutions Architect
      ❏ Senior Technical Product Marketing Manager
      ❏ Technical Writer
      ...and more. Apply today at: ververica.com/careers
      Work in one of the most active open source communities, build a technology used by some of the biggest companies in the world, and work in one of the most cutting-edge, innovative technology spaces.