Slide 1

Slide 1 text

Deployment Workshop
Dask on Kubernetes
Jacob Tomlinson

Slide 2

Slide 2 text

Types of Dask cluster
Fixed, Ephemeral

Slide 3

Slide 3 text

Fixed clusters
More traditional cluster deployments where you set things up and leave them running indefinitely. They idle while not in use but are always ready to go when you need them.

Slide 4

Slide 4 text

Ephemeral clusters
Dynamic clusters which are created in the moment of need and destroyed again when you’re done. These rely on some underlying scheduling system which can quickly provision resources.

Slide 5

Slide 5 text

Dask Helm Chart (Fixed)
A chart which launches a fixed-size Dask cluster alongside a Jupyter notebook.
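As a rough illustration of what deploying the chart involves, the sketch below builds the Helm commands from Python. The helper name `helm_install_commands`, the release name `my-dask`, and the `worker.replicas` override are assumptions made for this example; check the chart's own values file before relying on them.

```python
import shlex

# Public Helm repository for the Dask charts.
DASK_HELM_REPO = "https://helm.dask.org/"


def helm_install_commands(release_name="my-dask", worker_replicas=3):
    """Return the helm invocations (as argv lists) for a fixed-size cluster.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    return [
        ["helm", "repo", "add", "dask", DASK_HELM_REPO],
        ["helm", "repo", "update"],
        # Assumption: the chart exposes the worker count as `worker.replicas`.
        ["helm", "install", release_name, "dask/dask",
         "--set", f"worker.replicas={worker_replicas}"],
    ]


# Print the commands so they can be reviewed or pasted into a shell.
for argv in helm_install_commands():
    print(shlex.join(argv))
```

To actually run them, each list could be passed to `subprocess.run(argv, check=True)` on a machine with `helm` installed and a Kubernetes context configured.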

Slide 7

Slide 7 text

[Diagram: Dask Helm Chart, showing Jupyter, Scheduler, and Worker Deployments with Services and Ingresses, and the user's laptop (💻)]
The user creates the cluster once. Then they can connect to it multiple times in the future.

Slide 8

Slide 8 text

[Diagram: Dask Helm Chart, showing Jupyter, Scheduler, and Worker Deployments with Services and Ingresses, without the user's laptop]
If the user disconnects, the cluster still exists.

Slide 9

Slide 9 text

dask-kubernetes (Ephemeral)
A collection of cluster managers and utilities for Kubernetes.

Slide 10

Slide 10 text

KubeCluster()
Spawns ephemeral clusters by requesting Pods directly via the Kubernetes API.
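A minimal sketch of spawning an ephemeral cluster from a Python session follows. It assumes the classic `dask_kubernetes.KubeCluster` API with `make_pod_spec`; newer dask-kubernetes releases provide an operator-based `KubeCluster` with different arguments. The function name `launch_ephemeral_cluster` and the image tag are placeholders for illustration.

```python
def launch_ephemeral_cluster(n_workers=3):
    """Create a Dask cluster on Kubernetes at runtime and connect to it.

    Imports are deferred so this sketch can be loaded without
    dask-kubernetes installed. Calling it needs Kubernetes credentials.
    """
    from dask.distributed import Client
    from dask_kubernetes import KubeCluster, make_pod_spec  # classic API (assumption)

    # Describe the worker Pods; the image tag here is only a placeholder.
    pod_spec = make_pod_spec(image="daskdev/dask:latest")

    # KubeCluster asks the Kubernetes API for the Pods directly.
    cluster = KubeCluster(pod_spec)
    cluster.scale(n_workers)

    client = Client(cluster)
    return cluster, client
```

Instead of a fixed `cluster.scale(n)`, `cluster.adapt(minimum=1, maximum=10)` lets Dask scale the worker count with load. Calling `cluster.close()` releases the Pods when the work is done.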

Slide 12

Slide 12 text

[Diagram: KubeCluster, showing a Scheduler and Workers with a Service and Ingress, and the user's laptop (💻)]
The user dynamically creates the cluster resources at runtime.

Slide 13

Slide 13 text

[Diagram: KubeCluster, showing a Scheduler and Workers with a Service and Ingress, without the user's laptop]
If the user disconnects...

Slide 14

Slide 14 text

[Diagram: KubeCluster]
The cluster is garbage collected.

Slide 15

Slide 15 text

HelmCluster()
Connects to an existing Helm Chart deployment and provides the cluster manager interface, including log retrieval and manual scaling.
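The sketch below shows roughly how this looks in practice. It assumes the `dask_kubernetes.HelmCluster` class from the classic dask-kubernetes API and a Helm release that already exists. The function name `connect_to_helm_release` and its parameters are hypothetical.

```python
def connect_to_helm_release(release_name, n_workers=None):
    """Attach to an existing Dask Helm Chart release.

    Imports are deferred so this sketch can be loaded without
    dask-kubernetes installed. Calling it needs cluster access.
    """
    from dask_kubernetes import HelmCluster  # classic API (assumption)

    # Connects to the existing deployment; it does not create one.
    cluster = HelmCluster(release_name=release_name)

    # Manual scaling of the worker deployment.
    if n_workers is not None:
        cluster.scale(n_workers)

    # Log retrieval for the scheduler and workers.
    logs = cluster.get_logs()
    return cluster, logs
```

For example, `connect_to_helm_release("my-dask", n_workers=5)` would attach to a release named `my-dask` (a placeholder) and scale it to five workers.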

Slide 17

Slide 17 text

dask-gateway (Ephemeral/Fixed)
A central hub which launches Dask clusters on behalf of users. Can launch onto Kubernetes (and more).

Slide 18

Slide 18 text

[Diagram: Dask Gateway, showing an Ingress, the Dask Gateway and Dask Proxy Services, and Schedulers and Workers spread across several Nodes]
The user connects to the gateway and requests cluster resources. Dask Gateway launches the cluster on their behalf and proxies traffic through.
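From the user's side, the flow described above might look like the following sketch using the `dask_gateway` client library. The function name `request_gateway_cluster` is hypothetical, and the gateway address is whatever your deployment exposes.

```python
def request_gateway_cluster(address, n_workers=2):
    """Ask a Dask Gateway server for a cluster and connect to it.

    Imports are deferred so this sketch can be loaded without
    dask-gateway installed. Calling it needs a reachable gateway.
    """
    from dask_gateway import Gateway

    # Connect to the central gateway rather than to Kubernetes directly.
    gateway = Gateway(address)

    # The gateway launches the scheduler and workers on the user's behalf.
    cluster = gateway.new_cluster()
    cluster.scale(n_workers)

    # Client traffic is proxied through the gateway.
    client = cluster.get_client()
    return cluster, client
```

Because the user only talks to the gateway, they do not need Kubernetes credentials of their own. Authentication and resource limits can be handled centrally.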

Slide 20

Slide 20 text

DaskHub Helm Chart (Ephemeral/Fixed)
JupyterHub and Dask Gateway packaged as a single Helm chart. Provides a central portal for launching Jupyter and Dask together.

Slide 21

Slide 21 text

[Diagram: DaskHub Helm Chart, showing the JupyterHub components (Hub, Proxy, Auth, Spawner, Database), Jupyter servers, Dask Gateway, Schedulers, Workers, Services, an Ingress, and the user's laptop (💻)]
Lines omitted for clarity. Things got a bit crazy. Just imagine lines from basically everything to everything.

Slide 22

Slide 22 text

Deployment Workshop
Stay tuned for "Deploy JupyterHub with Dask Gateway on Kubernetes" in 15 minutes
Amit Kumar, Adam Lewis
17:20 UTC

Slide 23

Slide 23 text

Dask Deployments on Kubernetes

Dask Helm Chart (Fixed)
Deploys a Jupyter server and Dask cluster via Kubernetes deployments. Can be manually scaled via kubectl/helm/dask_kubernetes.HelmCluster().
https://github.com/dask/helm-chart

dask_kubernetes.KubeCluster() (Ephemeral)
Dynamically launches Dask clusters onto Kubernetes and scales adaptively. Gets garbage collected when idle.
https://github.com/dask/dask-kubernetes

Dask Gateway (Ephemeral/Fixed/Centralized)
Central hub for spawning Dask clusters. Great for teams and organizations who want to run many clusters for many users.
https://gateway.dask.org/

Slide 24

Slide 24 text

Deployment Workshop
Thank you!
@_jacobtomlinson