
[Webinar] An introduction to Ray for scaling machine learning (ML) workloads

Anyscale
August 18, 2021


Modern machine learning (ML) workloads, such as deep learning and large-scale model training, are compute-intensive and require distributed execution. Ray was created in the UC Berkeley RISELab to make it easy for every engineer to scale their applications, without requiring any distributed systems expertise.

Join Robert Nishihara, co-creator of Ray, and Bill Chambers, product lead for Ray, for an introduction to Ray for scaling your ML workloads. Learn how Ray libraries (e.g., Ray Tune, Ray Serve) help you easily scale every step of your ML pipeline, from model training and hyperparameter search to production serving.

Highlights include:
* Ray overview & core concepts
* Library ecosystem and use cases
* Demo: Ray for scaling ML workflows
* Getting started resources
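
To make the "Ray overview & core concepts" highlight concrete, here is a minimal sketch of Ray's core task API, assuming Ray is installed (pip install ray); the toy function square and the local ray.init() call are illustrative choices, not code from the webinar.

```python
# A minimal sketch of Ray's core task API (assumes `pip install ray`).
# The workload below is a toy stand-in, not code from the webinar.
import ray

ray.init()  # start Ray on this machine; on a cluster, ray.init(address="auto")

@ray.remote
def square(x):
    # An ordinary Python function; @ray.remote lets it run as a distributed task.
    return x * x

# Each .remote() call returns immediately with an object ref (a future).
refs = [square.remote(i) for i in range(10)]

# ray.get blocks until the results are ready and fetches them.
print(ray.get(refs))  # [0, 1, 4, 9, ..., 81]
```

The same decorate-and-call pattern extends to stateful workers: applying @ray.remote to a class turns it into an actor.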


Transcript

  1. Introduction to Ray for scaling machine learning. Robert Nishihara, co-founder of Anyscale and co-creator of Ray; Bill Chambers, product lead, Anyscale.
  2. Why Ray? - Machine learning is pervasive in every domain. - Distributed machine learning is becoming a necessity. - Distributed computing is notoriously hard.
  5. Compute demand is growing faster than supply: ML compute demand grows roughly 35x every 18 months (with GPT-3 as the 2020 data point), versus Moore's Law at 2x every 18 months for CPUs. Source: https://openai.com/blog/ai-and-compute/
  6. Specialized hardware is also not enough: even GPUs and TPUs cannot keep pace with compute demand growing 35x every 18 months. Source: https://openai.com/blog/ai-and-compute/
  7. Specialized hardware is also not enough. No way out but to distribute!
  9. Why Ray? - Machine learning is pervasive in every domain. - Distributed machine learning is becoming a necessity. - Distributed computing is notoriously hard. Ray's vision: make distributed computing accessible to every developer.
  10. Rich ecosystem for scaling ML workloads. Native libraries: easily scale common bottlenecks in ML workflows (e.g., Ray Tune for hyperparameter optimization, RLlib for reinforcement learning, Ray Serve for model serving). Integrations: scale popular frameworks with Ray with minimal changes (e.g., XGBoost, TensorFlow, JAX, PyTorch).
  11. Rich ecosystem for scaling ML workloads: data processing, training, hyperparameter tuning, reinforcement learning, and model serving libraries, all layered on Ray Core + Datasets (a small subset of the Ray ecosystem in ML).
  12. Rich ecosystem for scaling ML workloads: integrate Ray only based on your needs!
  13. Challenges in scaling hyperparameter tuning?
  14. Integrate Ray Tune! No need to adopt the entire Ray framework. (See the Ray Tune sketch after the transcript.)
  15. A unified, distributed toolkit to go end-to-end, from data processing and training to tuning and serving. (See the Ray Serve sketch after the transcript.)
  16. Companies scaling ML with Ray.
  17. Start scaling your ML workloads. Getting started: Documentation (docs.ray.io) for quick-start examples and reference guides; Forums (discuss.ray.io) to learn and share with the broader Ray community, including the core team; Ray Slack to connect with the Ray team and community.
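
Related to slides 13 and 14, here is a minimal Ray Tune sketch, assuming the Ray 1.x API that was current at the time of this webinar (later releases moved to ray.tune.Tuner); the toy objective train_model and the learning-rate grid are illustrative, not taken from the demo.

```python
# Minimal Ray Tune sketch (Ray 1.x-era API; the objective is a toy stand-in).
from ray import tune

def train_model(config):
    # Pretend the loss depends only on the learning rate; a real trainable
    # would build and train a model here.
    loss = (config["lr"] - 0.01) ** 2
    tune.report(loss=loss)  # report metrics back to Tune

# One trial per grid point; Tune schedules trials in parallel on Ray.
analysis = tune.run(
    train_model,
    config={"lr": tune.grid_search([0.001, 0.01, 0.1])},
)

print(analysis.get_best_config(metric="loss", mode="min"))
```

Because the trainable is an ordinary Python function, the same training code can be tuned locally or across a cluster without modification.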
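
For the model-serving piece mentioned in slides 11 and 15, a rough Ray Serve sketch, again assuming the deployment API from the Ray 1.x releases of that period (newer versions use serve.run); the deployment name SentimentModel and the hard-coded logic are placeholders, not from the webinar.

```python
# Rough Ray Serve sketch (Ray 1.x-era deployment API; the model is a placeholder).
import ray
from ray import serve

ray.init()
serve.start()  # start the Serve HTTP proxy and controller

@serve.deployment
class SentimentModel:
    def __init__(self):
        # A real deployment would load model weights here.
        self.positive_words = {"good", "great", "love"}

    async def __call__(self, request):
        text = (await request.body()).decode().lower()
        score = sum(word in text for word in self.positive_words)
        return {"positive": score > 0}

SentimentModel.deploy()  # exposes the deployment over HTTP at /SentimentModel
# e.g. requests.post("http://127.0.0.1:8000/SentimentModel", data="I love Ray")
```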