Powering Open Data Hub with Ray (Erik Erlandson, Red Hat, AI Center of Excellence)

Ray is quickly gaining momentum as a distributed computing platform that combines a powerful parallel compute model with a cloud-native, serverless-style scaling model. Open Data Hub (ODH) is a flexible, customizable federation of open source data science tools that is well placed to take advantage of Ray compute clusters.

In this talk, Erik will explain how to integrate Ray with Open Data Hub by configuring ODH profiles that deploy on-demand Ray clusters for Jupyter notebooks. He’ll demonstrate Ray in action as a compute resource for ODH and explore the potential use cases opened up by self-service notebooks backed by Ray. Along the way, he’ll also discuss the logistics of adapting Ray to OpenShift’s security features.

Attendees will learn how Ray integrates with Open Data Hub’s architecture, and how they can power ODH with Ray to solve distributed computing problems in the popular Jupyter environment.

Anyscale

July 21, 2021

Transcript

  1. Powering ODH With Ray Erik Erlandson, Red Hat, Inc. eje@redhat.com @ManyAngled
  2. Or... Erik Erlandson, Red Hat, Inc. eje@redhat.com @ManyAngled

  3. Jupyter & Ray In The Cloud Erik Erlandson, Red Hat, Inc. eje@redhat.com @ManyAngled
  4. Landscape • Motivations • Open Data Hub and Jupyter in Context • Ray on ODH • Demo • Community Collaborations
  5. Native Ray Libraries • Tune: Scalable Hyperparameter Tuning • RLlib: Scalable Reinforcement Learning • RaySGD: Distributed Training Wrappers • Ray Serve: Scalable and Programmable Serving
  6. Ray Community Integrations • XGBoost • Dask • Horovod • sklearn • Spacy • huggingface https://docs.ray.io/en/master/ray-libraries.html
  8. Literate And Interactive Ray... https://docs.ray.io/en/master/ray-libraries.html

  9. Hosted In The Cloud https://docs.ray.io/en/master/ray-libraries.html

  10. Jupyter + Ray 1.X • Jupyter + Ray Head Pod • Ray Worker Pods
  12. Jupyter + Ray 2.0 • Ray Worker Pods • Ray Head Pod • Jupyter Pod
  13. Jupyter ...

  14. Jupyter via Open Data Hub

  15. Open Data Hub Is ... Open Source Downstream Reference Platform Federated Meta Operator
  20. Data Science with ODH • Set goals • Gather and prepare data • Develop ML model • Deploy ML models in app dev process • Implement Apps & Inference • ML models Monitoring & Management
  22. Data Science with ODH • Set goals • Gather and prepare data • Develop ML model • Deploy ML models in app dev process • Implement Apps & Inference • ML models Monitoring & Management • App developer • IT operations • Data engineer • Business leadership • Data scientists • ML Engineer
  23. Data Science with ODH • Set goals • Gather and prepare data • Develop ML model • Deploy ML models in app dev process • Implement Apps & Inference • ML models Monitoring & Management • App developer • IT operations • Data engineer • Business leadership • Data scientists • ML Engineer • Seldon Jupyter Ceph Spark TensorFlow Kafka SuperSet Argo/Airflow/Tekton Hue Prometheus/Grafana Argo/Airflow/Tekton Ceph Kafka Seldon Middleware Model to Microservice
  24. Dog-Fooding ODH at Red Hat • Application Logs: Applications in the product release pipeline store their runtime logs in our system. These groups are also engaged for anomaly detection. • Cluster Metrics: Operational metrics from OpenShift clusters. AIOps is engaged here. • Customer Support Data: Storage of customer data like SOSReports, customer feedback, etc.
  25. Analogy: Spark on ODH ODH JupyterHub Launcher

  26. Analogy: Spark on ODH ODH JupyterHub Launcher Jupyter Environment

  27. Analogy: Spark on ODH • ODH JupyterHub Launcher • Spark SingleUser Profile • Jupyter Environment
  28. Analogy: Spark on ODH • ODH JupyterHub Launcher • Spark SingleUser Profile • Spark Cluster Service Template • Jupyter Environment
  29. Analogy: Spark on ODH • Spark cluster • ODH JupyterHub Launcher • Spark SingleUser Profile • Spark Cluster Service Template • Jupyter Environment
  31. Analogy: Spark on ODH • Spark cluster • Spark SingleUser Profile • Spark Cluster Service Template • ConfigMap • ConfigMap
  32. Ray on ODH? • Ray cluster • ODH JupyterHub Launcher • Ray SingleUser Profile • Ray Cluster Service Template • Jupyter Environment
  33. Ray Single User Profile

  34. Ray Cluster Service Template

  35. Demo: Ray on ODH! • Ray cluster • ODH JupyterHub Launcher • Ray SingleUser Profile • Ray Cluster Service Template • Jupyter Environment
  36. Ray on ODH at the Mass Open Cloud (MOC): Led by Boston University, the MOC is a collaborative effort among BU, Harvard, UMass Amherst, MIT, and Northeastern University, as well as the Massachusetts Green High-Performance Computing Center (MGHPCC) and Oak Ridge National Laboratory (ORNL). It is supported by a broad alliance of industry partners, including Red Hat.
  37. Ray on MOC • Maximum 5 workers + 1 head • 1 CPU, 1 GB memory • Pre-installed:
  38. Operate First https://www.operate-first.cloud/ • Developing Software In The Open • Operating Software and Services In the Open
  39. Operate First PRs for Ray https://github.com/operate-first/support/issues/102

  40. Collaboration: IBM • Ray with Code Engine • Ray on IBM OpenShift Clusters • Scikit-Learn pipelines on Ray • Ray Use Cases ◦ Machine Learning Model Explorations ◦ Earth Science
  41. IBM Research at Ray Summit • Raghu Ganti: Scaling and Unifying SciKit Learn and Spark Pipelines using Ray • Linsong Chu: Serverless Earth Science Data Labeling using Unsupervised Deep Learning with Ray
  42. Roadmap • Community Ray Operator in Catalog • Maintain Ray Images via Project Thoth • Community Use Cases With Jupyter • Formal Integration With KF and ODH • KF Pipeline Nodes Backed by Ray
  43. Call To Action • Play with Ray on Jupyter up on MOC • File issues and PRs with op-1st • Report Back! eje@redhat.com https://www.operate-first.cloud/users/moc-ray-demo/README.md https://odh.operate-first.cloud/
  44. 1 1 π ≈ 4 Σ Σ( + )
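
The formula on the final slide is only partly legible in this transcript: it approximates π as 4 times a double sum, but the summand did not survive. As a rough illustration of the kind of work that could be split across a small cluster, the sketch below assumes a midpoint-grid area estimate of the unit quarter circle; that reading of the slide is an assumption, not the speaker's stated method. The names `count_rows` and `estimate_pi` are hypothetical, and a standard-library thread pool stands in for Ray so the example runs without a cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def count_rows(rows, n):
    """Count midpoint grid cells in the given rows that fall inside the unit quarter circle."""
    inside = 0
    for i in rows:
        x = (i + 0.5) / n
        for j in range(n):
            y = (j + 0.5) / n
            if x * x + y * y <= 1.0:
                inside += 1
    return inside


def estimate_pi(n=400, chunks=6):
    """Estimate pi as 4 * (cells inside the quarter circle) / n**2 on an n x n grid.

    The rows are dealt out in `chunks` strided slices, one per task.
    chunks=6 is an arbitrary choice for this sketch, not a value from the talk's demo.
    """
    row_chunks = [range(k, n, chunks) for k in range(chunks)]
    with ThreadPoolExecutor(max_workers=chunks) as pool:
        # Each chunk is an independent task; the partial counts are summed at the end.
        total = sum(pool.map(count_rows, row_chunks, [n] * chunks))
    return 4.0 * total / (n * n)


if __name__ == "__main__":
    print(estimate_pi())
```

Threads do not speed up CPU-bound Python code; the pool here only shows the decomposition into independent tasks. On a Ray cluster, each chunk could instead be submitted as a function decorated with `@ray.remote` and the partial counts collected with `ray.get`.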