Kubeflow: portable and scalable machine learning using Jupyterhub and Kubernetes [PyData Delhi 2018]

ML solutions in production start from data ingestion and extend up to the
actual deployment step. We want this workflow to be scalable, portable
and simple. Containers and Kubernetes are great at the first two but not
the third if you aren't a DevOps practitioner. We'll explore how you can
leverage the Kubeflow project to deploy best-of-breed open-source systems
for ML to diverse infrastructures.

Akash Tandon

August 11, 2018

Transcript

  1. Agenda

    - Need of DevOps for ML and Data Science (DataOps)
    - Containers and Kubernetes for ML
    - Opportunities and challenges
    - Kubeflow: composable, portable and scalable ML
    - Components
    - Low bar, high ceiling
    - Issues and roadmap
    - Summary and demo
  2. DataOps - DevOps in Data Science and ML

    DataOps is an automated, process-oriented methodology, used by analytic and data teams to improve the quality and reduce the cycle time of data analytics.
    DataOps manifesto: http://dataopsmanifesto.org
  3. Containers

    • Containers allow you to easily package an application's code, configurations, and dependencies into easy-to-use building blocks.
    • These building blocks deliver environmental consistency, operational efficiency, developer productivity, and version control.
    • To put it simply, your code runs in any environment!
  4. Kubernetes

    • Kubernetes is an orchestration manager for containers.
    • It orchestrates computing, network and storage.
    • Simply put, it makes your life easier when working with containers.
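
To make "orchestration" a little more concrete, here is a minimal sketch (not from the talk) of the kind of declarative object Kubernetes works with: an `apps/v1` Deployment that asks for a fixed number of replicas of one container image. The helper name `make_deployment`, the app name and the image URL are hypothetical placeholders.

```python
import json


def make_deployment(name: str, image: str, replicas: int = 2) -> dict:
    """Build a minimal apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            # Kubernetes keeps this many copies of the pod running.
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical image name, used only for illustration.
    manifest = make_deployment("model-server", "example.com/model-server:0.1", replicas=3)
    print(json.dumps(manifest, indent=2))
```

Since `kubectl apply -f` accepts JSON as well as YAML, the printed manifest could be saved to a file and applied to a cluster as-is.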
  5. Steep DevOps learning curve

    • Containers
    • Kubernetes primitives
    • Persistent storage
    • APIs
    • Cloud platforms
    • and it goes on...
  6. Kubeflow

    • ML toolkit for Kubernetes
    • Open-source and community-driven
    • Support for multiple ML frameworks
    • End-to-end workflows which can be shared, scaled and deployed
    Source: https://github.com/kubeflow/kubeflow/issues/187
  7. Low bar, high ceiling

    • Low bar: allow data science practitioners to get up and running on a Kubernetes cluster even without DevOps know-how.
    • High ceiling: allow sysadmins and DevOps practitioners to modify defaults and extend the framework as needed.
  8. Components

    • JupyterHub (collaboration and interactivity)
    • K8s-native TensorFlow controller (model building)
    • K8s-native TensorFlow Serving deployment (model deployment)
    • Ambassador (reverse proxy)
    • Current and upcoming components for model tuning, model building and much more...
    • Out-of-the-box setup for putting all of this together!
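
As an illustration of what the K8s-native TensorFlow controller consumes, the sketch below builds a `TFJob` custom resource as a plain dict. It is not from the talk. It assumes the `kubeflow.org/v1` schema (`tfReplicaSpecs` keyed by replica type, with the training container named `tensorflow`); the API version and field layout may differ in the Kubeflow release you run, so check the version installed on your cluster. The helper name, job name, image and command are hypothetical.

```python
import json


def make_tfjob(name: str, image: str, workers: int = 2, command=None) -> dict:
    """Build a TFJob manifest (assumed kubeflow.org/v1 schema) as a dict."""
    container = {
        # Assumption: the operator expects the training container
        # to be named "tensorflow".
        "name": "tensorflow",
        "image": image,
    }
    if command:
        container["command"] = list(command)
    return {
        "apiVersion": "kubeflow.org/v1",  # assumption: varies by release
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "tfReplicaSpecs": {
                "Worker": {
                    "replicas": workers,
                    "restartPolicy": "OnFailure",
                    "template": {"spec": {"containers": [container]}},
                }
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical image and entry point, for illustration only.
    job = make_tfjob(
        "demo-training",
        "example.com/trainer:0.1",
        workers=2,
        command=["python", "train.py"],
    )
    print(json.dumps(job, indent=2))
```

The point of the design is that a distributed training run is described the same way as any other Kubernetes object, so it can be versioned, shared and applied with the usual tooling.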
  9. TensorFlow

    - Open-source numerical computing and ML
    - Developed by Google, open-sourced in 2015
    - Huge community and ecosystem
    - Support for multiple ML models
    - TF Serving (model deployment), TensorBoard (training visualization), etc.
    - Supports distributed training and deployment of models
  10. Why Kubeflow?

    Based on current functionality you should consider using Kubeflow if:
    • You want to train/serve TensorFlow models in different environments (e.g. local, on prem, and cloud)
    • You want to use Jupyter notebooks to manage TensorFlow training jobs
    • You want to launch training jobs that use resources (such as additional CPUs or GPUs) that aren't available on your personal computer
    • You want to combine TensorFlow with other processes
      ◦ For example, you may want to use tensorflow/agents to run simulations to generate data for training reinforcement learning models.
    Refer to https://www.kubeflow.org/docs/started/getting-started/ for more info.
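
The point about resources that aren't available on a personal computer comes down to declaring what a training container needs and letting the cluster find a node that has it. A minimal sketch, not from the talk: the helper `with_resources` is hypothetical, and `nvidia.com/gpu` is the resource name exposed when the NVIDIA device plugin is installed on the cluster (an assumption about your setup).

```python
import copy


def with_resources(container: dict, cpu=None, memory=None, gpus: int = 0) -> dict:
    """Return a copy of a container spec with resource requests/limits added."""
    updated = copy.deepcopy(container)
    requests = {}
    if cpu is not None:
        requests["cpu"] = cpu
    if memory is not None:
        requests["memory"] = memory
    resources = {}
    if requests:
        resources["requests"] = requests
    if gpus > 0:
        # GPUs are requested as whole units under "limits".
        resources["limits"] = {"nvidia.com/gpu": gpus}
    if resources:
        updated["resources"] = resources
    return updated


if __name__ == "__main__":
    # Hypothetical container spec, for illustration only.
    base = {"name": "tensorflow", "image": "example.com/trainer:0.1"}
    print(with_resources(base, cpu="4", memory="16Gi", gpus=1))
```

The resulting container spec can be placed in the pod template of a training job; the scheduler then only places the pod on a node that can satisfy the request.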
  11. Demo

    - Kubeflow tutorial using a sequence-to-sequence model
    - Based on Hamel Husain's wonderful post: "How to create data products that are magical using sequence-to-sequence models"
    - GitHub repo: https://github.com/kubeflow/examples/tree/master/github_issue_summarization
    - Let's get started!
  12. Road ahead

    - Get the entry (bar)rier lower
    - Multi-tenancy on Kubernetes
    - Support for different ML libraries/packages
      - PyTorch
      - Caffe2
      - MXNet
    - v1.0 to be launched by December 2018