
Dask Summit: Native Cloud Deployment with Dask-Cloudprovider

Jacob Tomlinson

February 27, 2020

Transcript

  1. Overview of a cluster manager

    [Diagram: high-level Dask collections (Array, Dataframe, Bag, ML, Xarray, ...) run on the Client, which connects to the Cluster's Scheduler; the Scheduler dispatches work to the Workers, and the Scheduler and each Worker run on their own cloud resource]
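    As a concrete illustration of the pattern in the diagram, here is a minimal sketch using LocalCluster as a stand-in for the cloud-backed cluster managers covered in this deck (the cluster manager starts the scheduler and workers; the client connects to the scheduler and submits work from high-level collections):

    from dask.distributed import Client, LocalCluster
    import dask.array as da

    cluster = LocalCluster(n_workers=2)  # the cluster manager starts a scheduler and workers
    cluster.scale(4)                     # ask the manager for more workers
    client = Client(cluster)            # the client connects to the scheduler

    # high-level collection work is broken into tasks and dispatched to the workers
    x = da.random.random((8_000, 8_000), chunks=(1_000, 1_000))
    print(x.mean().compute())

    client.close()
    cluster.close()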
  2. AWS

    from dask_cloudprovider import FargateCluster
    cluster = FargateCluster()
    cluster.scale(10)

    from dask_cloudprovider import ECSCluster
    cluster = ECSCluster(
        cluster_arn="arn"
    )
    cluster.scale(10)

    AWS Fargate
    • Managed container platform
    • Scale by CPU and memory
    • Billing per CPU/memory second
    • Low account limits (~50 workers)

    AWS Elastic Container Service
    • Unmanaged container platform
    • Full control over VM type (GPU, ARM)
    • Scale by VMs
    • Billing per VM second
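    The snippets above only start and scale the cluster; a hedged usage sketch, assuming AWS credentials are already configured in the environment, looks like this:

    from dask.distributed import Client
    from dask_cloudprovider import FargateCluster

    cluster = FargateCluster()  # provisions a scheduler and workers on Fargate
    cluster.scale(10)
    client = Client(cluster)

    # ... run Dask computations against the cluster here ...

    client.close()
    cluster.close()  # tear the Fargate tasks down so billing stops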
  3. AzureML

    - Targets data scientists from all backgrounds in enterprise settings
    - Easy-to-use interfaces for interacting with cloud resources (GUI, Python SDK, R SDK, ML CLI)
    - Powerful hundred-node clusters of Azure CPU or GPU VMs for various workloads
  4. AzureML

    [Same bullets; the diagram adds the labels "Data science, ML" and "Software development"]
  5. AzureML

    [Same bullets; the diagram adds the label "Distributed systems and HPC"]
  6. AzureML | dask-cloudprovider

    # import from the Azure ML Python SDK and Dask
    from azureml.core import Workspace
    from dask.distributed import Client
    from dask_cloudprovider import AzureMLCluster

    # specify the Workspace - authenticate interactively or otherwise
    ws = Workspace.from_config()  # see https://aka.ms/azureml/workspace

    # get (or create) the desired Compute Target and Environment (base image + conda/pip installs)
    ct = ws.compute_targets['cpu-cluster']  # see https://aka.ms/azureml/computetarget
    env = ws.environments['AzureML-Dask-CPU']  # see https://aka.ms/azureml/environments

    # start the cluster, print the widget and links
    cluster = AzureMLCluster(ws, ct, env, initial_node_count=100, jupyter=True)

    # optionally, connect a client directly
    c = Client(cluster)
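    A hedged follow-up to the slide's code, reusing its cluster and c objects, to run a computation on the AzureML workers and then shut the runs down:

    import dask.array as da

    x = da.ones((1_000, 1_000), chunks=(250, 250))
    print(x.sum().compute())  # executed on the AzureML workers

    c.close()
    cluster.close()  # stops the experiment runs backing the cluster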
  7. AzureML | Architecture

    • Derives from the distributed.deploy.cluster.Cluster class (a skeleton of this pattern follows after this slide)
    • Starts the scheduler via an experiment run
    • The headnode also runs a worker (to maximize resource utilization)
    • Submits an experiment run for each worker
    • Port forwarding:
      • Port mapping via socat if on the same VNET
      • An SSH-tunnel port forward otherwise (needs SSH credentials)

    https://github.com/dask/dask-cloudprovider/pull/67
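    The slide describes a subclassing pattern rather than a public recipe; the skeleton below is a hypothetical sketch of that pattern (the class name and the comments describing run submission are assumptions, not the actual AzureMLCluster implementation):

    from distributed.deploy.cluster import Cluster

    class ExperimentRunCluster(Cluster):
        # Hypothetical skeleton of the pattern AzureMLCluster follows.

        def __init__(self, workspace, compute_target, environment):
            # 1. submit an experiment run that starts the scheduler
            #    (with a co-located worker on the headnode)
            # 2. discover the scheduler address and set up port forwarding
            #    (socat on the same VNET, an SSH tunnel otherwise)
            # 3. record the address so Client(cluster) can connect
            super().__init__(asynchronous=False)

        def scale(self, n):
            # submit (or cancel) one experiment run per requested worker
            raise NotImplementedError("sketch only")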
  8. AzureML | Links

    - https://github.com/dask/dask-cloudprovider/pull/67
    - [email protected] - Cody - PM @ Azure ML
    - [email protected] - Tom - Senior Data Scientist @ Azure ML
    - https://github.com/lostmygithubaccount/dasky - CPU demos
    - https://github.com/drabastomek/GTC - GPU demos
    - @tomekdrabas @codydkdc - Twitter
    - NVIDIA's GTC in San Jose and Microsoft's //build in Seattle