Slide 1

Slide 1 text

Deployment Workshop
Deploying Dask Distributed
Jacob Tomlinson

Slide 2

Slide 2 text

Dask Distributed
A centrally managed, distributed, dynamic task scheduler

Slide 3

Slide 3 text

Dask Overview

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

Protocols: TCP, UCX, WebSocket
Dask components can communicate via a variety of different protocols.
[Diagram: Client, Scheduler and three Workers]
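In practice the protocol shows up as the scheme of the scheduler address that workers and clients connect to. The sketch below is only an illustration: `scheduler_address` and `PROTOCOL_SCHEMES` are hypothetical names, not part of Dask, and the host is a placeholder. It assumes the usual `tcp://`, `ucx://` and `ws://` schemes and the default scheduler port 8786; UCX additionally needs the optional UCX bindings installed.

```python
# Default port the Dask scheduler listens on.
DEFAULT_SCHEDULER_PORT = 8786

# URI schemes for the three protocols named on the slide.
PROTOCOL_SCHEMES = {"tcp": "tcp", "ucx": "ucx", "websocket": "ws"}


def scheduler_address(protocol: str, host: str, port: int = DEFAULT_SCHEDULER_PORT) -> str:
    """Hypothetical helper: build an address such as 'tcp://10.0.0.5:8786'."""
    try:
        scheme = PROTOCOL_SCHEMES[protocol.lower()]
    except KeyError:
        raise ValueError(f"unsupported protocol: {protocol!r}") from None
    return f"{scheme}://{host}:{port}"
```

The resulting string is what you would hand to a worker or a `Client` so that every component uses the same protocol.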

Slide 6

Slide 6 text

Starting a scheduler
[Diagram: Scheduler]

Slide 7

Slide 7 text

Connecting a worker
[Diagram: Worker, Scheduler]

Slide 8

Slide 8 text

Connecting a client
[Diagram: Client, Scheduler, Worker]

Slide 9

Slide 9 text

Submitting work
[Diagram: Client, Scheduler, Worker]
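The four steps so far (start a scheduler, connect a worker, connect a client, submit work) can be sketched in a single process using the `Scheduler`, `Worker` and `Client` classes from `dask.distributed`. This is a minimal sketch rather than the workshop's own demo: `inc` and `run_demo` are illustrative names, and `port=0` is an assumption made so the example does not collide with a scheduler already running on the default port.

```python
def inc(x: int) -> int:
    """Toy task to run on the worker."""
    return x + 1


async def run_demo() -> int:
    # Imported here so the sketch can be loaded on a machine without Dask.
    from dask.distributed import Client, Scheduler, Worker

    # 1. Start a scheduler (port=0 asks the OS for any free port).
    async with Scheduler(port=0) as scheduler:
        # 2. Connect a worker to the scheduler's address.
        async with Worker(scheduler.address):
            # 3. Connect a client to the same address.
            async with Client(scheduler.address, asynchronous=True) as client:
                # 4. Submit work and wait for the result.
                future = client.submit(inc, 10)
                return await future
```

To try it, run `asyncio.run(run_demo())` from a script after `import asyncio`. In a real deployment the three components would normally live in separate processes or on separate machines.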

Slide 10

Slide 10 text

Dask Dashboard

Slide 11

Slide 11 text

JupyterLab Extension

Slide 12

Slide 12 text

Cluster Managers
Utility classes to simplify cluster creation

Slide 13

Slide 13 text

Local Cluster
LocalCluster creates everything for you. On a machine with many CPU cores it splits the cores across multiple workers, each with multiple threads, as this can be more performant.
[Diagram: Scheduler and four Workers]
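The worker and thread layout can also be set explicitly through the `n_workers` and `threads_per_worker` arguments of `LocalCluster`. The sketch below assumes you want to choose the split yourself; `split_cores` and `start_local_cluster` are hypothetical helpers, and the arithmetic is only an illustration, not the heuristic Dask applies by default.

```python
def split_cores(total_cores: int, threads_per_worker: int) -> tuple[int, int]:
    """Hypothetical helper: divide cores into (n_workers, threads_per_worker)."""
    if total_cores < 1 or threads_per_worker < 1:
        raise ValueError("both arguments must be positive")
    # Never ask for more threads per worker than there are cores.
    threads = min(threads_per_worker, total_cores)
    return total_cores // threads, threads


def start_local_cluster(total_cores: int, threads_per_worker: int = 2):
    # Imported here so the sketch can be loaded on a machine without Dask.
    from dask.distributed import Client, LocalCluster

    n_workers, threads = split_cores(total_cores, threads_per_worker)
    # LocalCluster starts the scheduler and all workers on this machine.
    cluster = LocalCluster(n_workers=n_workers, threads_per_worker=threads)
    return cluster, Client(cluster)
```

For example, `start_local_cluster(16, threads_per_worker=4)` would request four workers with four threads each. When workers run as separate processes, call it from under an `if __name__ == "__main__":` guard in scripts.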

Slide 14

Slide 14 text

[Diagram: Client, Local Cluster (Scheduler and four Workers)]

Slide 15

Slide 15 text

Get logs
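Cluster manager objects such as `LocalCluster` provide a `get_logs()` method that returns a mapping from component name to that component's log. A minimal sketch of collecting those logs as plain strings follows; `collect_logs` is a hypothetical helper name and takes an already created cluster object.

```python
def collect_logs(cluster) -> dict:
    """Hypothetical helper: return {component name: log text} for a cluster manager."""
    # get_logs() maps each component (scheduler, workers, ...) to its log.
    return {name: str(log) for name, log in cluster.get_logs().items()}
```

Printing each entry of the returned dictionary gives a quick overview of what the scheduler and workers have reported.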

Slide 16

Slide 16 text

Scaling
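Cluster managers expose two ways to change the number of workers: `scale(n)` for a fixed size and `adapt(minimum=..., maximum=...)` for scaling driven by load. The wrappers below are a sketch with hypothetical names (`resize`, `autoscale`); they operate on an existing cluster manager object and add nothing beyond naming the two modes.

```python
def resize(cluster, n_workers: int) -> None:
    """Ask the cluster manager for a fixed number of workers."""
    cluster.scale(n_workers)


def autoscale(cluster, minimum: int, maximum: int) -> None:
    """Let the cluster manager add and remove workers between two bounds."""
    cluster.adapt(minimum=minimum, maximum=maximum)
```

With a `LocalCluster` named `cluster`, `resize(cluster, 8)` requests eight workers, while `autoscale(cluster, 1, 8)` lets the worker count follow the workload.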

Slide 17

Slide 17 text

How do I get more resources?
Moving beyond a single machine

Slide 18

Slide 18 text

SSH ...
You could SSH to a bunch of machines and start the Dask components manually.

Slide 19

Slide 19 text

SSHCluster
Or you could use SSHCluster, which will bootstrap a cluster for you on a list of machines. All you need is passwordless SSH configured for each machine.
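A minimal sketch of what this might look like, assuming the `asyncssh` package is installed and that Dask is available on every machine. `SSHCluster` takes a list of hosts, runs the scheduler on the first one and workers on the rest. `start_ssh_cluster` is a hypothetical helper, the host names in the usage note are placeholders, and the check for at least two hosts is this sketch's own choice.

```python
def start_ssh_cluster(hosts: list[str], threads_per_worker: int = 2):
    """Hypothetical helper: bootstrap a Dask cluster over SSH on `hosts`."""
    if len(hosts) < 2:
        raise ValueError("need a scheduler host and at least one worker host")

    # Imported here so the sketch can be loaded on a machine without Dask.
    from dask.distributed import Client, SSHCluster

    cluster = SSHCluster(
        hosts,  # first host runs the scheduler, the remaining hosts run workers
        worker_options={"nthreads": threads_per_worker},
    )
    return cluster, Client(cluster)
```

For example, `start_ssh_cluster(["host1", "host2", "host3"])` would place the scheduler on `host1` and one worker on each of `host2` and `host3`.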

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

Deployment Workshop
Thank you!
@_jacobtomlinson