Slide 1

Slide 1 text

© 2020 Ververica
News from Flink’s engine room: “Full steam ahead”
Till Rohrmann @stsffap

Slide 2

Slide 2 text

Scheduling and Failover

Slide 3

Slide 3 text

Recap: Batch & Streaming Unification
One engine to rule them all
• Batch is just a bounded stream!

Slide 4

Slide 4 text

Unbounded Stream Processing
Processing Data as It Arrives
[Figure labels: Sources, Watermarks, timeline from older to more recent]

Slide 5

Slide 5 text

Bounded Stream Processing
Having All Data Available
[Figure labels: Sources, Watermarks, timeline from older to more recent]

Slide 6

Slide 6 text

How to Process Bounded Streams Fast?
Boundedness Allows Different Execution Strategies
• All data is available at start time
  ─ Massively parallel out-of-order ingestion
  ─ Latency not very important → efficient batching of records
  ─ Optimized operators
  ─ Results are ready at the end → no watermarks, no incremental results
  ─ Job can be executed in stages

Slide 7

Slide 7 text

Recap: Faster Failover for Bounded Streams
FLIP-1: Fine Grained Recovery
• Avoid redundant work due to failovers
• Separate the topology into pipelined regions
• Store the results produced by each pipelined region
• Resume computation from the latest available result
[Figure labels: Src, Map, Sink; legend: Operator, Result]
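The recovery idea above can be sketched in a few lines of Python. This is a simplified illustration, not Flink's implementation: the function name `regions_to_restart`, the `inputs` map, and the choice to model Src, Map, and Sink as three separate regions are all assumptions made for the example. The sketch only walks upstream (a failed region is re-run, plus any producer whose stored result is no longer readable) and ignores downstream consumers.

```python
from typing import Dict, List, Set


def regions_to_restart(failed: str,
                       inputs: Dict[str, List[str]],
                       result_available: Set[str]) -> Set[str]:
    """Return the regions that must be re-run after `failed` crashes.

    inputs[r] lists the upstream regions whose stored results r reads.
    result_available holds the regions whose stored results are still readable.
    (Hypothetical helper for illustration only.)
    """
    restart: Set[str] = set()
    stack = [failed]
    while stack:
        region = stack.pop()
        if region in restart:
            continue
        restart.add(region)
        for producer in inputs.get(region, []):
            # A lost intermediate result forces its producer to run again.
            if producer not in result_available:
                stack.append(producer)
    return restart


# Assumed topology: three regions connected by stored results, src -> map -> sink.
inputs = {"src": [], "map": ["src"], "sink": ["map"]}

# Sink fails, all upstream results are still stored: only sink is re-run.
print(sorted(regions_to_restart("sink", inputs, {"src", "map"})))
# Sink fails and the map result was lost: map has to be re-run as well.
print(sorted(regions_to_restart("sink", inputs, {"src"})))
```

The point of the sketch is that the amount of repeated work depends on which stored results survive, rather than on the size of the whole job.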

Slide 8

Slide 8 text

What’s The Point?
Benefits of FLIP-1
• TPC-H Query 3
• Exponential failure rate

SELECT l_orderkey,
       SUM(l_extendedprice*(1-l_discount)) AS revenue,
       o_orderdate,
       o_shippriority
FROM customer, orders, lineitem
WHERE c_mktsegment = '[SEGMENT]'
  AND c_custkey = o_custkey
  AND l_orderkey = o_orderkey
  AND o_orderdate < date '[DATE]'
  AND l_shipdate > date '[DATE]'
GROUP BY l_orderkey, o_orderdate, o_shippriority;

Slide 9

Slide 9 text

How to Benefit From FLIP-1?
It is not always on!
• FLIP-1 was introduced with Flink 1.9
• FLIP-1 is active when using the Blink Table Planner
• DataSet jobs use pipelined mode by default → FLIP-1 has no effect until the ExecutionMode is changed
  ─ ExecutionConfig.setExecutionMode(ExecutionMode.BATCH)
  ─ ExecutionConfig.setExecutionMode(ExecutionMode.BATCH_FORCED)

Slide 10

Slide 10 text

Problems When Scheduling Bounded Streams
Flink’s Old Scheduler
• Lazy-from-sources scheduling strategy
  ─ Task-centric view
  ─ Schedule tasks as soon as inputs are ready

SELECT customerId, name
FROM customers, orders
WHERE customerId = orderCustomerId

[Figure: tasks Csts, Ords, and Join; edge types: Blocking, Pipelined]
Tasks to schedule: Csts, Ords, Join
#Available slots: 1
Scheduling order 1: 1. Ords 2. ?
Scheduling order 2: 1. Csts 2. Ords 3. Join

Slide 11

Slide 11 text

Pipelined Regions Scheduler
Pipelined Region Centric View
• The scheduling unit is the pipelined region (all tasks which need to run at the same time)
• Schedule a pipelined region as soon as all its inputs are ready
[Figure: Csts forms one pipelined region, Ords and Join form another; edge types: Blocking, Pipelined]
Scheduling order: 1. Csts 2. Ords + Join
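A rough Python sketch of this region-centric view follows. It is an illustration under stated assumptions, not Flink's scheduler: the helper names `pipelined_regions` and `schedule_order` are made up, and the edge types (Csts → Join blocking, Ords → Join pipelined) are inferred from the scheduling order shown on the slide. Tasks joined by pipelined edges are merged with a union-find structure, and a region becomes schedulable once every region it reads blocking results from has finished.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Edge = Tuple[str, str, str]  # (producer, consumer, "pipelined" or "blocking")


def pipelined_regions(tasks: List[str], edges: List[Edge]) -> List[FrozenSet[str]]:
    """Group tasks that are connected through pipelined edges."""
    parent = {t: t for t in tasks}

    def find(t: str) -> str:
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for producer, consumer, kind in edges:
        if kind == "pipelined":
            parent[find(producer)] = find(consumer)

    groups: Dict[str, Set[str]] = {}
    for t in tasks:
        groups.setdefault(find(t), set()).add(t)
    return [frozenset(g) for g in groups.values()]


def schedule_order(tasks: List[str], edges: List[Edge]) -> List[List[str]]:
    """Return regions in an order where blocking inputs are finished first."""
    regions = pipelined_regions(tasks, edges)
    region_of = {t: r for r in regions for t in r}
    deps: Dict[FrozenSet[str], Set[FrozenSet[str]]] = {r: set() for r in regions}
    for producer, consumer, kind in edges:
        if kind == "blocking" and region_of[producer] != region_of[consumer]:
            deps[region_of[consumer]].add(region_of[producer])

    order: List[List[str]] = []
    done: Set[FrozenSet[str]] = set()
    while len(done) < len(regions):
        ready = [r for r in regions if r not in done and deps[r] <= done]
        if not ready:
            raise ValueError("cyclic dependency between regions")
        for r in ready:
            order.append(sorted(r))
            done.add(r)
    return order


tasks = ["Csts", "Ords", "Join"]
edges = [("Csts", "Join", "blocking"), ("Ords", "Join", "pipelined")]
print(schedule_order(tasks, edges))
```

With these assumed edges the sketch schedules Csts first and then Ords together with Join, matching the order on the slide; the scheduler never starts Ords on its own, which is what leads to the stuck "Scheduling order 1" in the task-centric view.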

Slide 12

Slide 12 text

Benefits of Pipelined Region Scheduler
• Reliable scheduling of bounded jobs under constrained resources
  ─ Guarantees to make progress as long as the largest pipelined region can be run
  ─ No more deadlocks due to bad scheduling decisions
• Better resource utilization
  ─ Only schedule tasks which can actually make progress

Slide 13

Slide 13 text

Unified Batch & Streaming Scheduling & Failover
Putting the Pieces Together
• Pipelined regions are the units of scheduling & failover
• Generalizes well to streaming/unbounded workloads → just a single large pipelined region which produces infinite results
  ─ With a single pipelined region: pipelined region scheduling == “All at once” scheduling strategy
[Figure: a single pipelined region]

Slide 14

Slide 14 text

Elastic Streaming Pipelines

Slide 15

Slide 15 text

Changing Workloads
Change is The Only Constant

Slide 16

Slide 16 text

Elastic Streaming Pipelines
Adjust to The Actual Workload

Slide 17

Slide 17 text

Deployment Modes
Flink is Not Always in Charge
Active deployments
• Yarn, Mesos, Kubernetes
• Flink can ask for more resources
Oblivious deployments
• Standalone, Containerized
• Resources are assigned by a third party

Slide 18

Slide 18 text

Reactive Execution Mode
Reacting to Available Resources
[Figure: Job Master, Resource Manager, and two TaskExecutors; messages: “Need ∞ resources”, Register( ), Assign( )]

Slide 19

Slide 19 text

How Can Flink Declare ∞ Resources?
A New Slot Allocation Protocol
Old slot allocation protocol
• Every task asks for its slot individually
• Fails if we cannot obtain all slots
⇒ Won’t work if we want to react to available resources
Declarative slot allocation protocol
• Declare the amount of required resources
• ResourceManager tries to fulfill the declared resources as well as possible
• Reactive mode declares ∞ resource requirements → all slots go to the JobMaster as soon as they arrive
• FLIP-138: Declarative Resource Management
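The difference between the two protocols can be illustrated with a small Python sketch. Both functions are hypothetical stand-ins, not Flink APIs: `allocate_all_or_nothing` mimics the old behaviour of failing when not every slot can be obtained, while `fulfill_declared` hands over as many slots as are currently free, which also works for an infinite declaration as used by reactive mode.

```python
import math
from typing import List


def allocate_all_or_nothing(required: int, free_slots: List[str]) -> List[str]:
    """Old-style allocation (simplified): fail unless every slot is available."""
    if len(free_slots) < required:
        raise RuntimeError(
            f"needed {required} slots but only {len(free_slots)} are free")
    return free_slots[:required]


def fulfill_declared(declared: float, free_slots: List[str]) -> List[str]:
    """Declarative allocation (simplified): fulfill as well as possible.

    `declared` may be math.inf, in which case every free slot is handed over.
    """
    if declared == math.inf:
        return list(free_slots)
    return free_slots[:min(int(declared), len(free_slots))]


free = ["slot-1", "slot-2"]

# Declaring 4 slots with 2 free still yields the 2 that exist.
print(fulfill_declared(4, free))
# Reactive mode: declare infinity, receive whatever is there.
print(fulfill_declared(math.inf, free))

try:
    allocate_all_or_nothing(4, free)
except RuntimeError as err:
    print("old protocol:", err)
```

The declarative variant never fails on a shortfall; deciding what to do with a partial allocation is left to the scheduler.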

Slide 20

Slide 20 text

How to Make Use of Changing Resources?
The Declarative Scheduler
Old scheduler
1. Pre-determine the parallelism
2. Ask for slots
3. Execute the job
Declarative scheduler
1. Declare required resources
2. Wait for resources to arrive
3. Decide on the parallelism based on available resources
4. Adjust parallelism if more resources arrive
⇒ Inverts the order of declaring resources and deciding on the parallelism

Slide 21

Slide 21 text

The Declarative Scheduler
A Small Example
[Figure: JobGraph; ExecutionGraph: ∅]
Resources: Required = 4, Available = 0, Used = 0
Parallelism: 0

Slide 22

Slide 22 text

The Declarative Scheduler
A Small Example
[Figure: JobGraph; ExecutionGraph]
Resources: Required = 4, Available = 2, Used = 2
Parallelism: 1

Slide 23

Slide 23 text

The Declarative Scheduler
A Small Example
[Figure: JobGraph; ExecutionGraph]
Resources: Required = 4, Available = 4, Used = 2
Parallelism: 1
How to make use of the new resources?
⇒ Take checkpoint and trigger job restart

Slide 24

Slide 24 text

The Declarative Scheduler
A Small Example
[Figure: JobGraph; ExecutionGraph]
Resources: Required = 4, Available = 4, Used = 4
Parallelism: 2
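The example can be replayed with a short Python sketch. It is a toy model, not the actual scheduler: the class `DeclarativeSchedulerSketch` and its fields are invented for illustration, and the setup of two job vertices without slot sharing and a target parallelism of 2 is an assumption inferred from the slide numbers (4 required slots, 2 used slots at parallelism 1). Scaling down is not modelled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DeclarativeSchedulerSketch:
    num_vertices: int          # assumed: one slot per vertex and parallel instance
    target_parallelism: int    # parallelism the job would ideally run with
    parallelism: int = 0
    events: List[str] = field(default_factory=list)

    @property
    def required_slots(self) -> int:
        return self.num_vertices * self.target_parallelism

    @property
    def used_slots(self) -> int:
        return self.num_vertices * self.parallelism

    def on_available_slots(self, available: int) -> None:
        """Decide on the parallelism only after seeing what is available."""
        feasible = min(available // self.num_vertices, self.target_parallelism)
        if feasible <= self.parallelism:
            return  # nothing to gain (scale-down is out of scope here)
        if self.parallelism > 0:
            # A running job is rescaled via checkpoint and restart.
            self.events.append("checkpoint + restart")
        self.parallelism = feasible
        self.events.append(f"run with parallelism {feasible}")


scheduler = DeclarativeSchedulerSketch(num_vertices=2, target_parallelism=2)
for available in (0, 2, 4):
    scheduler.on_available_slots(available)
    print(f"required={scheduler.required_slots} available={available} "
          f"used={scheduler.used_slots} parallelism={scheduler.parallelism}")
print(scheduler.events)
```

Under these assumptions the printed rows match the Required/Available/Used values of the first, second, and last slide of the example; the intermediate state (4 available, 2 used) corresponds to the moment just before the restart.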

Slide 25

Slide 25 text

Outlook Autoscaling
Enabling Flink to Scale an Application
• User-defined RescalingPolicies set a target value
  ─ target: Ideal parallelism to run the job with
• Periodically query the RescalingPolicies for target values
• Declare target resource requirements
• Rely on the declarative scheduler to rescale the job when new resources arrive
[Figure: targets t = 1 and t = 1; ResourceManager]

Slide 26

Slide 26 text

Outlook Autoscaling
Enabling Flink to Scale an Application
[Figure: targets t = 2 and t = 1; ResourceManager; #Target Slots: 3]

Slide 27

Slide 27 text

Outlook Autoscaling
Enabling Flink to Scale an Application
[Figure: targets t = 2 and t = 1; ResourceManager; Allocate 3rd slot]
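One way to picture the autoscaling loop is the following Python sketch. Everything in it is hypothetical: `RescalingPolicy` is modelled as a plain callable, `autoscale_step` is an invented helper, and reading the figure's `t` values as per-operator targets whose sum gives the number of target slots is an assumption that happens to match the numbers on the slides (t = 2 and t = 1 giving 3 target slots).

```python
from typing import Callable, Dict, List

# Hypothetical shape of a policy: returns the ideal parallelism right now.
RescalingPolicy = Callable[[], int]


def autoscale_step(policies: Dict[str, RescalingPolicy],
                   declare: Callable[[int], None]) -> Dict[str, int]:
    """One periodic round: query every policy, then declare the total."""
    targets = {operator: policy() for operator, policy in policies.items()}
    declare(sum(targets.values()))  # assumed: one slot per parallel instance
    return targets


# Stand-in for load measurements that the policies would look at.
load = {"op-a": 1, "op-b": 1}
policies = {
    "op-a": lambda: load["op-a"],
    "op-b": lambda: load["op-b"],
}

declared: List[int] = []           # records what was declared each round

autoscale_step(policies, declared.append)   # targets 1 and 1
load["op-a"] = 2                            # workload on op-a grows
autoscale_step(policies, declared.append)   # targets 2 and 1

print(declared)
print("additional slots to allocate:", declared[-1] - declared[-2])
```

The loop itself never rescales anything; it only updates the declared requirement and leaves the rescaling to the declarative scheduler once the extra slot arrives.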

Slide 28

Slide 28 text

User Benefits
• Better resource utilization under changing workloads (no more under/over-provisioning)
• Easier operations
  ─ Resources can be added on the fly
  ─ Flink can better tolerate resource loss
• Easier deployments
  ─ Application style deployments w/o running a cluster

Slide 29

Slide 29 text

Conclusion
What to take home?
• Unified scheduling and failover for batch & streaming
• Flink now schedules and fails over batch jobs more efficiently
• Flink will soon support fully elastic streaming pipelines
  ─ Being able to better handle changing workloads
• Reactive mode will ease operations and deployment significantly

Slide 30

Slide 30 text

THANK YOU!

Slide 31

Slide 31 text

Ververica is hiring!
Write to me (till@ververica.com) or visit https://www.ververica.com/careers

Slide 32

Slide 32 text

QUESTIONS?