State of Ray (Robert Nishihara, Anyscale)

Robert Nishihara, co-founder and CEO, Anyscale, kicks off Ray Summit 2021 with a look back at the past year, celebrating the growth of the Ray project, ecosystem, and community.

Anyscale

July 14, 2021

Transcript

  1. State of Ray Robert Nishihara CEO and Co-Founder, Anyscale

  2. An Amazing Year for Ray: Vibrant Community, Richer Ecosystem, Inspiring User Stories

  3. Community: 480+ contributors (487). Chart axes: 2017 to 2021; 200, 400, 600

  4. An Amazing Year for Ray: Vibrant Community, Richer Ecosystem, Inspiring User Stories

  5. Ecosystem - the big picture! Native Libraries, 3rd Party Libraries, Your app here! Universal framework for distributed computing. Run anywhere. Library + app ecosystem
  6. New and improved integrations Ecosystem and more ...

  7. 5X speed up with Ray XGBoost and more ...

  8. Horovod: “Ray will play an increasingly important role in bringing much needed common infrastructure and standardization to the production machine learning ecosystem, both within Uber and the industry at large.” and more ...
  9. 13X speed up with Ray Dask and more ...

  10. Ecosystem: Ray is the tool of choice for scaling libraries and more ...

  11. An Amazing Year for Ray: Vibrant Community, Richer Ecosystem, Inspiring User Stories
  12. Diverse users … with diverse use cases

  13. Making Boats Fly with AI: McKinsey | QuantumBlack Australia

  14. Scaling Ecosystem Restoration Dendra Systems

  15. Large Scale ML Platforms Uber, Shopify, Robinhood, and more

  16. An Amazing Year for Ray: Vibrant Community, Richer Ecosystem, Inspiring User Stories
  17. What’s new? A lookback at the last six months

  18. Robustness & Scale (NEW): Object spilling, Improved memory management, Scalability testing
  19. Streamlined Workflows (NEW): ray.client(). Diagram: Driver, Driver, Ray cluster. The slide shows this code twice:

        import tensorflow
        import pandas

        @ray.remote
        def f():
            pass  # app logic

        if __name__ == '__main__':
            main()

  20. Streamlined Workflows (NEW): ray.client(). Diagram: Driver, Ray cluster. The slide shows the same code once:

        import tensorflow
        import pandas

        @ray.remote
        def f():
            pass  # app logic

        if __name__ == '__main__':
            main()
  21. Streamlined Workflows: Environments. ray.client() and ray.client().env( )

  22. Streamlined Workflows: Environments. ray.client().env( ) and ray.client().env({'pip': ['tensorflow', 'pandas']})

  23. Streamlined Workflows: Environments. ray.client().env({'pip': ['tensorflow', 'pandas']}) and ray.client().env({'pip': ['tensorflow', 'pandas'], 'working_dir': '/home/my-project'})
  24. Production Serving (NEW): Revamped API, Refactored architecture

  25. What’s next?

  26. Welcome to Ray Summit! Expand your knowledge: 50+ breakout sessions. Go deeper: Tutorials (day 3). Connect: Slack and Gather.town

  27. Gather, Slack: Connect with other attendees; Office hours with Ray developers

  28. Thank you!

  29. Thank you! Get Involved: Meet Ray users and developers (today after the break). Keynote: Ion Stoica. Find the talks online: youtube.com/anyscale
  30. Backup

  31. Ray Summit 2021

  32. Ray Summit 2021
    Ray Core
    • Ray internals: object management (today)
    • Deep dive into Ray’s scheduling (tomorrow)
    Ecosystem
    • Patterns of ML models in production (today)
    • Distributed XGBoost on Ray (today)
    • A bridge for preprocessing and training (today)
    • Data processing on Ray (today)
    • The ML ecosystem (tomorrow)
  33. An incredible year Community Ecosystem User Stories

  34. Ecosystem • Airflow

  35. Ecosystem • Airflow • XGBoost

  36. Ecosystem • Airflow • XGBoost • PyTorch

  37. Ecosystem • Airflow • XGBoost • PyTorch • Horovod. We believe that Ray will continue to play an increasingly important role in bringing much needed common infrastructure and standardization to the production machine learning ecosystem, both within Uber and the industry at large.

  38. Ecosystem • Airflow • XGBoost • PyTorch • Horovod • Hugging Face

  39. Ecosystem • Airflow • XGBoost • PyTorch • Horovod • Hugging Face • MLflow

  40. Ecosystem • Airflow • XGBoost • PyTorch • Horovod • Hugging Face • MLflow • Scikit-learn

  41. Ecosystem • Airflow • XGBoost • PyTorch • Horovod • Hugging Face • MLflow • Scikit-learn • Dask

  42. Ecosystem • Airflow • XGBoost • PyTorch • Horovod • Hugging Face • MLflow • Scikit-learn • Dask • And more… Ray is the tool of choice for scaling libraries

  43. Ecosystem: Native Libraries, 3rd Party Libraries, Your app here! Universal framework for distributed computing. Run anywhere. Library + app ecosystem
  44. What’s New - Robustness & Scale Last Ray Summit

  45. What’s New - Robustness & Scale • 13x throughput (1.4 PB/hour) • 90% cost reduction

  46. What’s New - Robustness & Scale • Deploying Ray on 200K cores

  47. What’s New - Dev & Prod: Improved workflows for development & production • Ray client • Environments
  48. What’s New - Dev & Prod: Ray Client. Diagram: Driver, Driver, Ray cluster. The slide shows this code twice:

        import tensorflow
        import pandas

        @ray.remote
        def f():
            pass  # app logic

        if __name__ == '__main__':
            main()

  49. What’s New - Dev & Prod: Ray Client. Diagram: Driver, Ray cluster. The slide shows the same code once:

        import tensorflow
        import pandas

        @ray.remote
        def f():
            pass  # app logic

        if __name__ == '__main__':
            main()

  50. What’s New - Dev & Prod: Environments. ray.client() and ray.client().env( )

  51. What’s New - Dev & Prod: Environments. ray.client().env( ) and ray.client().env({'pip': ['tensorflow', 'pandas']})

  52. What’s New - Dev & Prod: Environments. ray.client().env({'pip': ['tensorflow', 'pandas']}) and ray.client().env({'pip': ['tensorflow', 'pandas'], 'working_dir': '/home/my-project'})
  53. What’s New - Serving

  54. User Stories

  55. User Stories - Uber

  56. User Stories - Uber: Ray will play an increasingly important role in bringing much needed common infrastructure and standardization to the production machine learning ecosystem, both within Uber and the industry at large.
  57. User Stories - Uber

  58. User Stories - Ant Group

  59. User Stories - McKinsey

  60. Conclusion: Talk about calls to action. Also, call out exciting upcoming talks
  61. Asawari’s Version

  62. State of Ray Robert Nishihara CEO and Co-Founder, Anyscale

  63. Welcome to Ray Summit 2021

  64. An Amazing Year! Vibrant Community Richer Ecosystem Inspiring User Stories

  65. COMMUNITY: 560+ contributors from XXX companies (560). Chart axes: 2017 to 2021; 200, 400, 600
  66. New and improved integrations ECOSYSTEM and more ...

  67. 5X speed up with Ray XGBoost and more ...

  68. Horovod: We believe that Ray will continue to play an increasingly important role in bringing much needed common infrastructure and standardization to the production machine learning ecosystem, both within Uber and the industry at large. and more ...

  69. ECOSYSTEM: Ray is the tool of choice for scaling libraries and more ...

  70. Ecosystem - the big picture! Native Libraries, 3rd Party Libraries, Your app here! Universal framework for distributed computing. Run anywhere. Library + app ecosystem
  71. Diverse users …

  72. … with diverse use cases

  73. Making Boats Fly with AI: McKinsey | QuantumBlack Australia

  74. Scaling Ecosystem Restoration Dendra Systems

  75. Large Scale ML Platforms Uber, Spotify, Robinhood, and more

  76. An Amazing Year! Vibrant Community Richer Ecosystem Inspiring User Stories

  77. Thank You We wouldn’t be here without YOU.

  78. What’s new? A lookback at last year

  79. Robustness & Scale (NEW): Object spilling, Improved memory management, Scalability envelope, Proof point (Optional)
  80. ray.connect() One line benefit statement Streamlined Workflows NEW

  81. Environments One line benefit statement Streamlined Workflows NEW

  82. Feature / Improvement Feature / improvement Serving Improvement NEW

  83. What’s next? A peek into the future

  84. Feature / Improvement Feature / improvement bucket ROADMAP

  85. Thank You

  86. Backup

  87. None
  88. 480+ contributors from XXX companies

  89. What’s New - Robustness & Scale
    • Data Processing on Ray (SangBin Cho)
    • Ray Internals: Object Management with the Ownership Model (Stephanie Wang, Yi Cheng)
    • A Deep Dive into Ray’s Scheduling Policy (Alex Wu)
    • MLDataset: A Ray Bridge for Data Preprocessing and Distributed Training (Clark Zinzow)
    • Building High Availability and Scalability Online Computing Applications on Ray (Tengwei Cai)
    • Improving Ray for Large-scale Applications (Hao Chen)
  90. What’s New - Robustness & Scale Enables 100TB shuffle
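
The Environments slides (22, 23, 51 and 52) pass a plain dictionary with 'pip' and 'working_dir' keys to ray.client().env(...). A minimal sketch of how such a dictionary might be assembled is below. The helper names make_env and connect_with_env are hypothetical and not from the talk, the address argument is a placeholder you would supply yourself, and the ray.client(address).env(env).connect() chain is an assumption about the builder-style client API that should be checked against the installed Ray version.

```python
from typing import Any, Dict, List, Optional


def make_env(pip: List[str], working_dir: Optional[str] = None) -> Dict[str, Any]:
    """Build an environment dict shaped like the one on the slides.

    Hypothetical helper: only the 'pip' and 'working_dir' keys shown
    in the deck are handled here.
    """
    env: Dict[str, Any] = {"pip": list(pip)}
    if working_dir is not None:
        env["working_dir"] = working_dir
    return env


def connect_with_env(address: str, env: Dict[str, Any]):
    """Connect to a remote cluster with the given environment.

    Assumption: ray.client(address) returns a builder exposing
    .env(...) and .connect(). Ray is imported lazily so the rest of
    this sketch runs without Ray installed.
    """
    import ray

    return ray.client(address).env(env).connect()


# Same values as slides 23 and 52.
env = make_env(["tensorflow", "pandas"], working_dir="/home/my-project")
print(env)
```

Keeping the environment as ordinary data means it can be inspected or unit-tested before any cluster connection is attempted; connect_with_env is deliberately not called here because it needs a running cluster.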