
Kubeflow Kale: from Jupyter notebooks to Complex pipelines

`Kubeflow Kale` lets you deploy Jupyter notebooks that run on your laptop to Kubeflow Pipelines, without requiring any of the Kubeflow SDK boilerplate.

You define a pipeline by annotating the notebook's code cells and clicking a deployment button in the Jupyter UI.

Kale converts the notebook into a valid Kubeflow Pipelines deployment, resolving data dependencies and managing the pipeline's lifecycle.
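
For illustration, a sketch of what those cell annotations can look like, written as one script with the cell boundaries and tags shown as comments. The tag syntax (`block:`/`prev:`) is indicative only and varies across Kale versions; it is not a definitive reference.

```python
# Sketch of a Kale-annotated notebook; tag names are illustrative.
import numpy as np

# --- cell tagged `block:load_data` ---
data = np.random.rand(100, 10)

# --- cell tagged `block:preprocess` and `prev:load_data` ---
data = (data - data.mean(axis=0)) / data.std(axis=0)

# --- cell tagged `block:train` and `prev:preprocess` ---
# Each tagged cell becomes one pipeline step; `prev:` tags define the DAG.
weights = np.linalg.lstsq(data[:, :-1], data[:, -1], rcond=None)[0]
```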

Valerio Maggio

September 05, 2019

Transcript

1. A “toy” ML pipeline: Data Loading → Preprocessing → Model Learning → API Interface (Model Serving). Adapted from “What about tests in Machine Learning projects?”, Sarah Diot-Girard, EuroSciPy 2019.
2. A real-world ML pipeline is much broader: Data Analysis, Data Transformation, Data Validation, Data Splitting, Model Building, Model Validation, Training at scale, Roll-out, Serving, Monitoring, and Logging surround the core Data Loading, Preprocessing, Model Learning, and Model Serving steps. Adapted from “Managing Machine Learning in Production with Kubeflow and MLOps”, David Aronchick, KubeCon EU 2019.
3. Cowboys and Rangers can be friends. Adapted from “Managing Machine Learning in Production with Kubeflow and MLOps”, David Aronchick, KubeCon EU 2019.
4. State-of-the-art technologies support Machine Learning workloads in the Cloud, in a cloud-agnostic way: Kubernetes, Kubeflow, Polyaxon.
5. Kubernetes: container orchestration engine, developed internally at Google and open sourced in 2014. The Kubernetes Master hosts the API Server, Scheduler, Controllers, and Cluster State; workloads run on the nodes (Node1, Node2, Node3). Cluster settings are applied via remote configuration against a declarative API.
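
With a declarative API you submit the desired state and the control plane converges the cluster to it. A minimal sketch with the official Kubernetes Python client; the Deployment name and image are placeholders, and a reachable cluster configured in ~/.kube/config is assumed.

```python
# Declare desired state through the API server with the official
# Python client (pip install kubernetes).
from kubernetes import client, config

config.load_kube_config()  # read cluster credentials from local kubeconfig

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="nginx-demo"),
    spec=client.V1DeploymentSpec(
        replicas=3,  # desired state: controllers converge the cluster to it
        selector=client.V1LabelSelector(match_labels={"app": "nginx-demo"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "nginx-demo"}),
            spec=client.V1PodSpec(
                containers=[client.V1Container(name="nginx", image="nginx:1.17")]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```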
6. Extending core components: automate application-specific services and tasks with custom components. A custom component (CustomComp) talks to the Kubernetes Master (API Server, Scheduler, Controllers, Cluster State) via remote configuration. Supported client languages: GoLang, Python, JavaScript, Java, Rust, …
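
A hedged sketch of the custom-component pattern in Python: a tiny controller that watches the API server and reacts to cluster events. Real controllers/operators are usually written in Go, and the reconcile logic here is reduced to a print.

```python
# Minimal event watcher built on the official Kubernetes Python client.
from kubernetes import client, config, watch

config.load_kube_config()
v1 = client.CoreV1Api()

# Stream pod events and apply application-specific logic to each one.
for event in watch.Watch().stream(v1.list_pod_for_all_namespaces, timeout_seconds=60):
    pod = event["object"]
    print(f'{event["type"]}: {pod.metadata.namespace}/{pod.metadata.name}')
```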
7. Polyaxon: a platform for managing the whole lifecycle of large-scale deep learning and machine learning applications. https://github.com/polyaxon/polyaxon
8. Kubeflow: Kubernetes components for Machine Learning, developed internally at Google and open sourced in 2016. It layers Distributed Training, Hyperparameter Tuning, Model Serving, and Jupyter Notebooks on top of the Kubernetes infrastructure, with infrastructure management via the declarative API. https://github.com/kubeflow/kubeflow
9. Distributed DL training: an example of distributed deep learning training on Kubeflow. The user provides the model (model.py); the DL Training Controller deploys it through the Kubernetes Master API Server, and TF distributed training runs across Worker1, Worker2, and a Parameter Server. A similar process applies to other DL frameworks: PyTorch, MXNet, Caffe, …
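
As a sketch of how such a job is submitted, here is a TFJob custom resource created through the generic Kubernetes API. The kubeflow.org group/version and the manifest layout differ across Kubeflow releases, and the container image is a placeholder; treat this as illustrative, not exact.

```python
# Submit a TFJob custom resource for distributed TF training.
from kubernetes import client, config

config.load_kube_config()

tfjob = {
    "apiVersion": "kubeflow.org/v1",
    "kind": "TFJob",
    "metadata": {"name": "mnist-dist"},
    "spec": {
        "tfReplicaSpecs": {
            # One parameter server and two workers, as in the slide.
            "PS": {"replicas": 1, "template": {"spec": {"containers": [
                {"name": "tensorflow", "image": "my-registry/model:latest"}]}}},
            "Worker": {"replicas": 2, "template": {"spec": {"containers": [
                {"name": "tensorflow", "image": "my-registry/model:latest"}]}}},
        }
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="kubeflow.org", version="v1", namespace="kubeflow",
    plural="tfjobs", body=tfjob)
```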
10. Introducing Kubeflow Pipelines, announced by Google Cloud: a platform for building and deploying portable and scalable end-to-end ML workflows, based on containers. The Kubeflow Pipelines platform consists of:
• A user interface for managing and tracking experiments, jobs, and runs
• An engine for scheduling multi-step ML workflows
• An SDK for defining and manipulating pipelines and components
• Jupyter Notebooks for interacting with the system using the SDK
• Integration with the other tools in the Kubeflow toolkit (e.g. tf-operator for distributed training)
https://github.com/kubeflow/pipelines/
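
A minimal sketch of those SDK pieces in the 2019-era KFP API, wiring two container steps into a pipeline; the images and commands are placeholders.

```python
# Two-step pipeline: the data dependency between the ops defines the DAG.
import kfp
from kfp import dsl

@dsl.pipeline(name="hello-pipeline", description="Two-step demo pipeline")
def hello_pipeline():
    step1 = dsl.ContainerOp(
        name="produce",
        image="alpine:3.10",
        command=["sh", "-c", "echo hello > /tmp/out.txt"],
        file_outputs={"msg": "/tmp/out.txt"},  # expose the file as an output
    )
    step2 = dsl.ContainerOp(
        name="consume",
        image="alpine:3.10",
        command=["echo", step1.outputs["msg"]],  # consumes step1's output
    )

# Compile to an Argo workflow archive that the Pipelines UI can run.
kfp.compiler.Compiler().compile(hello_pipeline, "hello_pipeline.tar.gz")
```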
11. A pipeline defined with the KFP SDK:

    @dsl.pipeline(
        name='ML Workflow',
    )
    def xgb_train_pipeline(
        output,
        project,
        region='us-central1',
        train_data='gs://ml-pipeline-playground/sfpd/train.csv',
        …,
    ):
        with dsl.ExitHandler(exit_op=delete_cluster_op):
            create_cluster_op = CreateClusterOp('create-cluster', project, region, output)
            analyze_op = AnalyzeOp('analyze', project, region, create_cluster_op.output,
                                   schema, train_data,
                                   '%s/{{workflow.name}}/analysis' % output)
            transform_op = TransformOp('transform', project, region,
                                       create_cluster_op.output, train_data, eval_data,
                                       target, analyze_op.output,
                                       '%s/{{workflow.name}}/transform' % output)
            train_op = TrainerOp('train', project, region, create_cluster_op.output,
                                 transform_op.outputs['train'], transform_op.outputs['eval'],
                                 target, analyze_op.output, workers, rounds,
                                 '%s/{{workflow.name}}/model' % output)
            predict_op = PredictOp('predict', project, region, create_cluster_op.output,
                                   transform_op.outputs['eval'], train_op.output, target,
                                   analyze_op.output,
                                   '%s/{{workflow.name}}/predict' % output)
            cm_op = ConfusionMatrixOp('confusion-matrix', predict_op.output,
                                      '%s/{{workflow.name}}/confusionmatrix' % output)
            roc_op = RocOp('roc', predict_op.output, true_label,
                           '%s/{{workflow.name}}/roc' % output)
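
To actually run a pipeline like this, the SDK compiles it and submits it through the KFP client. A hedged sketch; the experiment and run names are placeholders, and the client assumes a configured or in-cluster KFP endpoint.

```python
# Compile the pipeline function above and submit a run.
import kfp

kfp.compiler.Compiler().compile(xgb_train_pipeline, "xgb_pipeline.tar.gz")
client = kfp.Client()  # assumes a reachable Kubeflow Pipelines endpoint
exp = client.create_experiment("xgb-demo")
run = client.run_pipeline(exp.id, "xgb-run", "xgb_pipeline.tar.gz")
```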
12. Kale: abstracting away the Kubeflow Pipelines SDK. Kale takes you from (local) Jupyter Notebooks to (remote) Kubeflow Pipelines.
13. Jupyter Notebook to KFP: a local Jupyter notebook becomes a Kubeflow Pipeline running on the cloud. Example: a three-step pipeline with create-matrices, transpose, and matmul.
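
A plausible notebook behind that example: three cells that map one-to-one onto the pipeline steps. The code itself is a sketch, not taken from the deck.

```python
import numpy as np

# --- cell: create-matrices ---
A = np.random.rand(4, 3)
B = np.random.rand(4, 3)

# --- cell: transpose (depends on create-matrices) ---
Bt = B.T

# --- cell: matmul (depends on both previous steps) ---
C = A @ Bt  # (4, 3) @ (3, 4) -> (4, 4)
print(C.shape)
```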
14. Kale main components: nbparser derives the pipeline structure (networkx); static_analyzer identifies dependencies (Pyflakes); marshal injects data objects between steps (Odo, dill); codegen generates & deploys the pipeline (Jinja2), with support for hyperparameter tuning.
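
The derived pipeline structure is naturally a DAG. A sketch of how it can be modelled with networkx, one of the libraries listed above; the graph API shown is networkx's own, not Kale's internals.

```python
# Model the matmul example pipeline as a directed acyclic graph.
import networkx as nx

g = nx.DiGraph()
g.add_edge("create-matrices", "transpose")
g.add_edge("create-matrices", "matmul")
g.add_edge("transpose", "matmul")

# A topological order gives a valid execution order for the steps.
print(list(nx.topological_sort(g)))  # ['create-matrices', 'transpose', 'matmul']
```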
15. static_analysis and marshal at work: Kale analyzes the pipeline steps (create-matrices, transpose, matmul) and injects the marshalling code that moves data objects between them.
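
The marshalling idea in a nutshell: each step serializes the objects later steps need, and each downstream step deserializes what it consumes. A sketch with dill; the directory and helper names are illustrative, not Kale's actual API.

```python
# Pass Python objects between pipeline steps via serialized files.
import dill

MARSHAL_DIR = "/marshal"  # e.g. a volume mounted into every pipeline step

def save(obj, name):
    with open(f"{MARSHAL_DIR}/{name}.dill", "wb") as f:
        dill.dump(obj, f)

def load(name):
    with open(f"{MARSHAL_DIR}/{name}.dill", "rb") as f:
        return dill.load(f)

# Step "create-matrices" would end with:   save(A, "A"); save(B, "B")
# Step "transpose" would then start with:  B = load("B")
```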