Deploying Dask Distributed

Jacob Tomlinson

May 19, 2021

Transcript

  1. Deployment Workshop Deploying Dask Distributed Jacob Tomlinson

  2. Dask Distributed A centrally managed, distributed, dynamic task scheduler

  3. Dask Overview

  4. None
  5. Protocols: TCP, UCX, WebSocket (diagram: Client, Scheduler, three Workers). Dask components can communicate via a variety of different protocols.
  6. Starting a scheduler (diagram: Scheduler)

  7. Connecting a worker (diagram: Worker, Scheduler)

  8. Connecting a client (diagram: Client, Scheduler, Worker)

  9. Submitting work (diagram: Client, Scheduler, Worker)

  10. Dask Dashboard

  11. JupyterLab Extension

  12. Cluster Managers Utility classes to simplify cluster creation

  13. Local Cluster (diagram: Scheduler, four Workers). LocalCluster creates everything for you. It divides the cores of a large CPU among multiple workers, each with multiple threads, as this can be more performant.
  14. Diagram: Client, Local Cluster (Scheduler, four Workers)

  15. Get logs

  16. Scaling

  17. How do I get more resources? Moving beyond a single machine
  18. SSH ... You could SSH to a bunch of machines and start the Dask components manually.
  19. SSHCluster: Or you could use SSHCluster, which will bootstrap a cluster for you on a list of machines. All you need is passwordless SSH configured for each machine.
  20. None
  21. Deployment Workshop Thank you! @_jacobtomlinson
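
Slide 13 notes that LocalCluster divides a large CPU into several multi-threaded workers. The sketch below shows one way such a split could be computed. The function name `split_cpus` and the balancing rule are assumptions made for illustration; they are not taken from the slides, and the rule LocalCluster actually applies may differ.

```python
import os


def split_cpus(n_cpus):
    """Return (n_workers, threads_per_worker) for a machine with n_cpus cores.

    Illustrative heuristic (an assumption, not quoted from the slides):
    machines with up to four cores get one single-threaded worker per core;
    larger machines use the smallest divisor of n_cpus that is at least
    sqrt(n_cpus) as the worker count, which keeps the number of workers and
    the threads per worker roughly balanced.
    """
    if n_cpus < 1:
        raise ValueError("n_cpus must be a positive integer")
    if n_cpus <= 4:
        return n_cpus, 1
    # Smallest divisor d of n_cpus with d * d >= n_cpus.
    n_workers = min(
        d for d in range(1, n_cpus + 1)
        if n_cpus % d == 0 and d * d >= n_cpus
    )
    return n_workers, n_cpus // n_workers


if __name__ == "__main__":
    # Show the split for a few core counts, including this machine's.
    for cores in (4, 8, 16, os.cpu_count() or 1):
        workers, threads = split_cpus(cores)
        print(f"{cores} cores -> {workers} workers x {threads} threads")
```

If the defaults do not suit a workload, a pair like this could be passed explicitly as `LocalCluster(n_workers=..., threads_per_worker=...)`.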