Introduction_to_Apache_Submarine

Introduction to Apache Submarine Cloud-Native Machine Learning Platform

About me Wen-Chih (Ryan) Lo Apache Submarine committer Master’s student
at NCKU Contact me lowc1012@gmail.com 2

Agenda ◦ What is MLOps? ◦ Apache Submarine Intro ◦
Community & Collaboration ◦ Release Plan ◦ Q & A 3

What is MLOps ? 1 4

“ It’s similar to DevOps ◦ Robust automation to shorten
the development lifecycle (build, test, release...) ◦ Dependable releases ◦ Monitoring ◦ etc... but there is a critical difference ” data science” 5

Adapt DevOps to Machine Learning (MLOps = ML + Dev
+ Ops) Dev ML Ops 6 Experiment Data acquisition Business Understanding Initial Modeling Development Modeling + Testing Continuous Integration Continuous Deployment Operations Continuous Delivery Data Feedback Loop System + Model Monitoring

Challenges of using Machine Learning 7 Configuration Data Verification Data
Collection Feature Extraction Data Analysis Process Management Tools ML Code Serving Infrastructure Monitoring Machine Resource Management Hidden Technical Debt in Machine Learning Systems, Google (NIPS’ 15)

8 Google researchers analyze 3,000 production ML pipelines, comprising over
450,000 models trained. In consistent with wide understating regarding ML is mostly about the training. Production Machine Learning Pipelines: Empirical Analysis and Optimization Opportunities (SIGMOD’ 21)

Challenges of using Machine Learning (cont.) ◦ Training-Serving Skew Ø
The difference between performance during training and performance during serving ◦ Model Drift Ø The performance of a model deployed to production deteriorates on new, unseen data or the underlying assumptions about the data change 9

Roles and Responsibilities Data Engineer ◦ Build, maintain and optimize
data pipelines ◦ Big data infra, SQL/NoSQL, ETL tools, Message queue (e.g. Kafka), etc. ◦ Support data scientists with data related requirements Data Scientist ◦ Data and algorithms exploration ◦ Build ML models that address the business question ◦ Python, R, ML / DL, CV, NLP, etc. ◦ Not familiar with platform and infrastructure stuffs ML infra engineer ◦ Build and scale machine learning infrastructure ◦ Work with ML pipelines and products ◦ Monitor model performance in production ◦ Distributed System, DevOps, System design, etc. 10 Deploy & monitor Training Build models Data preparation Data extraction

Netflix Tech Blog https://netflixtechblog.com/open-sourcing-metaflow-a-human-centric-framework-for-data-science-fa72e04a5d9 11

“ Data scientists want to focus on model and quickly
test out how they work in the production. 12

What does Data Scientist expect? ◦ Data / Model exploration
and analysis Ø Experiment using sampled dataset with notebooks or full dataset to get best results. ◦ Reproducible experiment Ø Record parameters, code and metrics of experiment Ø Dependency management, coding once, run everywhere ◦ Model management Ø Automated model packaging and delivery for easy deployment to production Ø Visibility into the performance of models 13

Apache Submarine Intro 2 14

Apache Submarine (Top-Level project) The following are the goals :
◦ Data scientists can create end-to-end ML workflows that cover each stage in ML model lifecycle. ◦ Easy for data scientists to manage versions of experiment and dependencies of environment. ◦ Support popular machine learning frameworks including PyTorch and TensorFlow. ◦ Run and track distributed training experiments via ease-to-use UI. ◦ Support Kubernetes/Hadoop YARN and many compute resources (e.g. CPU and GPU) 15

Features of Apache Submarine Experiment Submarine comes with Python SDK
which enables users to run TensorFlow/PyTorch experiments from notebook. Environment Users can save different environment profile that define a set of libraries and a Docker image to run an experiment or a notebook instance. Notebook Enable users launch their own Jupyter Notebook instances in the same cluster. Easy Install Just one simple command to install Submarine on an existing K8s cluster with Helm. Model management Integrating with MLflow for tracking experiments and model registry. (coming soon) Web UI Submarine workbench provide a Data-Scientist-friendly UI to make data scientists have a good user experience. 16

Submarine Workbench (Experiment) 17

Submarine Environment ◦ Manage Docker/Conda environments, and Git integration for
experiments/notebooks 22

Submarine Notebook Support ◦ Enable users to launch Jupyter notebook
instances in the same cluster. ◦ Choose different Environment profiles (Docker/conda kernel) when you launch notebook instances. 23

Submarine Python SDK ◦ The Submarine Python SDK includes experiment
package that makes user easy to run distributed or non-distributed TensorFlow / PyTorch experiments. ◦ Release 0.6.0 will provide model package to support model management initially. 24

Community & collaboration 3 25

Submarine community ◦ Diverse community Contributors are from Cloudera, DiDi,
NetEase, JD.com, NTU, NCKU, NTCU, etc. ◦ Growing project There are 49 contributors including 9 PMCs and 28 committers. ◦ Beginner-friendly You can raise an issue on GitHub or WeChat (communicating in Chinese) even if it is a small and minor problem. 26 Special thanks to all the people who contribute to the project! GitHub: https://github.com/apache/submarine Zoom: https://cloudera.zoom.us/j/880548968 (Every week 12:40 PM GMT+8 on Tuesdays) Jira: https://issues.apache.org/jira/projects/SUBMARIN E/issues/

Release Plan 4 27

Roadmap 28 1 3 5 6 4 2 Initial release
0.1.0 Support TensorFlow distributed training on YARN (Jan, 2019) Release 0.3.0 Provide mini-submarine and basic K8s support (Feb, 2020) Release 0.5.0 Support notebook service, environment profile and UI (Dec, 2020) Release 0.2.0 Support PyTorch and integrated with TonY (Jul, 2019) Release 0.4.0 Fully support training TensorFlow and PyTorch on K8s (Jul, 2020) Release 0.6.0 Support model management initially. Refactor Submarine Operator and CI/CD. (Coming soon)

Thank you! Q & A You can find me at
lowc1012@gmail.com 29

Introduction_to_Apache_Submarine

Introduction_to_Apache_Submarine

Hazel Shen

More Decks by Hazel Shen

Other Decks in Technology

Featured

Transcript