
Kubeflow Kale: from Jupyter notebooks to Complex pipelines

`Kubeflow Kale` lets you deploy Jupyter notebooks that run on your laptop to Kubeflow Pipelines, without requiring any of the Kubeflow SDK boilerplate.

You define a pipeline by annotating the notebook's code cells and clicking a deployment button in the Jupyter UI.

Kale converts the notebook into a valid Kubeflow Pipelines deployment, resolving data dependencies and managing the pipeline's lifecycle.
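
For illustration, a sketch of what those cell annotations can look like, written as one script with the cell boundaries and tags shown as comments. The tag syntax (`block:`/`prev:`) is indicative only and varies across Kale versions; it is not a definitive reference.

```python
# Sketch of a Kale-annotated notebook; tag names are illustrative.
import numpy as np

# --- cell tagged `block:load_data` ---
data = np.random.rand(100, 10)

# --- cell tagged `block:preprocess` and `prev:load_data` ---
data = (data - data.mean(axis=0)) / data.std(axis=0)

# --- cell tagged `block:train` and `prev:preprocess` ---
# Each tagged cell becomes one pipeline step; `prev:` tags define the DAG.
weights = np.linalg.lstsq(data[:, :-1], data[:, -1], rcond=None)[0]
```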

Valerio Maggio

September 05, 2019

Transcript

1. A “toy” ML pipeline: Data Loading → Preprocessing → Model Learning → API Interface (Model Serving). Adapted from “What about tests in Machine Learning projects?”, Sarah Diot-Girard, EuroSciPy 2019.
2. A real-world ML pipeline is much broader: Data Analysis, Data Transformation, Data Validation, Data Splitting, Model Building, Model Validation, Training at scale, Roll-out, Serving, Monitoring, and Logging surround the core Data Loading, Preprocessing, Model Learning, and Model Serving steps. Adapted from “Managing Machine Learning in Production with Kubeflow and MLOps”, David Aronchick, KubeCon EU 2019.
3. Cowboys and Rangers can be friends. Adapted from “Managing Machine Learning in Production with Kubeflow and MLOps”, David Aronchick, KubeCon EU 2019.
4. State-of-the-art technologies support Machine Learning workloads in the Cloud, in a cloud-agnostic way: Kubernetes, Kubeflow, Polyaxon.
5. Kubernetes: container orchestration engine, developed internally at Google and open sourced in 2014. The Kubernetes Master hosts the API Server, Scheduler, Controllers, and Cluster State; workloads run on the nodes (Node1, Node2, Node3). Cluster settings are applied via remote configuration against a declarative API.
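
With a declarative API you submit the desired state and the control plane converges the cluster to it. A minimal sketch with the official Kubernetes Python client; the Deployment name and image are placeholders, and a reachable cluster configured in ~/.kube/config is assumed.

```python
# Declare desired state through the API server with the official
# Python client (pip install kubernetes).
from kubernetes import client, config

config.load_kube_config()  # read cluster credentials from local kubeconfig

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="nginx-demo"),
    spec=client.V1DeploymentSpec(
        replicas=3,  # desired state: controllers converge the cluster to it
        selector=client.V1LabelSelector(match_labels={"app": "nginx-demo"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "nginx-demo"}),
            spec=client.V1PodSpec(
                containers=[client.V1Container(name="nginx", image="nginx:1.17")]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```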
6. Extending core components: automate application-specific services and tasks with custom components. A custom component (CustomComp) talks to the Kubernetes Master (API Server, Scheduler, Controllers, Cluster State) via remote configuration. Supported client languages: GoLang, Python, JavaScript, Java, Rust, …
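
A hedged sketch of the custom-component pattern in Python: a tiny controller that watches the API server and reacts to cluster events. Real controllers/operators are usually written in Go, and the reconcile logic here is reduced to a print.

```python
# Minimal event watcher built on the official Kubernetes Python client.
from kubernetes import client, config, watch

config.load_kube_config()
v1 = client.CoreV1Api()

# Stream pod events and apply application-specific logic to each one.
for event in watch.Watch().stream(v1.list_pod_for_all_namespaces, timeout_seconds=60):
    pod = event["object"]
    print(f'{event["type"]}: {pod.metadata.namespace}/{pod.metadata.name}')
```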
7. Polyaxon: a platform for managing the whole lifecycle of large-scale deep learning and machine learning applications. https://github.com/polyaxon/polyaxon
8. Kubeflow: Kubernetes components for Machine Learning, developed internally at Google and open sourced in 2016. It layers Distributed Training, Hyperparameter Tuning, Model Serving, and Jupyter Notebooks on top of the Kubernetes infrastructure, with infrastructure management via the declarative API. https://github.com/kubeflow/kubeflow
9. Distributed DL training: an example of distributed deep learning training on Kubeflow. The user provides the model (model.py); the DL Training Controller deploys it through the Kubernetes Master API Server, and TF distributed training runs across Worker1, Worker2, and a Parameter Server. A similar process applies to other DL frameworks: PyTorch, MXNet, Caffe, …
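
As a sketch of how such a job is submitted, here is a TFJob custom resource created through the generic Kubernetes API. The kubeflow.org group/version and the manifest layout differ across Kubeflow releases, and the container image is a placeholder; treat this as illustrative, not exact.

```python
# Submit a TFJob custom resource for distributed TF training.
from kubernetes import client, config

config.load_kube_config()

tfjob = {
    "apiVersion": "kubeflow.org/v1",
    "kind": "TFJob",
    "metadata": {"name": "mnist-dist"},
    "spec": {
        "tfReplicaSpecs": {
            # One parameter server and two workers, as in the slide.
            "PS": {"replicas": 1, "template": {"spec": {"containers": [
                {"name": "tensorflow", "image": "my-registry/model:latest"}]}}},
            "Worker": {"replicas": 2, "template": {"spec": {"containers": [
                {"name": "tensorflow", "image": "my-registry/model:latest"}]}}},
        }
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="kubeflow.org", version="v1", namespace="kubeflow",
    plural="tfjobs", body=tfjob)
```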
10. Introducing Kubeflow Pipelines, announced by Google Cloud: a platform for building and deploying portable and scalable end-to-end ML workflows, based on containers. The Kubeflow Pipelines platform consists of:
• A user interface for managing and tracking experiments, jobs, and runs
• An engine for scheduling multi-step ML workflows
• An SDK for defining and manipulating pipelines and components
• Jupyter Notebooks for interacting with the system using the SDK
• Integration with the other tools in the Kubeflow toolkit (e.g. tf-operator for distributed training)
https://github.com/kubeflow/pipelines/
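
A minimal sketch of those SDK pieces in the 2019-era KFP API, wiring two container steps into a pipeline; the images and commands are placeholders.

```python
# Two-step pipeline: the data dependency between the ops defines the DAG.
import kfp
from kfp import dsl

@dsl.pipeline(name="hello-pipeline", description="Two-step demo pipeline")
def hello_pipeline():
    step1 = dsl.ContainerOp(
        name="produce",
        image="alpine:3.10",
        command=["sh", "-c", "echo hello > /tmp/out.txt"],
        file_outputs={"msg": "/tmp/out.txt"},  # expose the file as an output
    )
    step2 = dsl.ContainerOp(
        name="consume",
        image="alpine:3.10",
        command=["echo", step1.outputs["msg"]],  # consumes step1's output
    )

# Compile to an Argo workflow archive that the Pipelines UI can run.
kfp.compiler.Compiler().compile(hello_pipeline, "hello_pipeline.tar.gz")
```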
11. A pipeline defined with the KFP SDK:

    @dsl.pipeline(
        name='ML Workflow',
    )
    def xgb_train_pipeline(
        output,
        project,
        region='us-central1',
        train_data='gs://ml-pipeline-playground/sfpd/train.csv',
        …,
    ):
        with dsl.ExitHandler(exit_op=delete_cluster_op):
            create_cluster_op = CreateClusterOp('create-cluster', project, region, output)
            analyze_op = AnalyzeOp('analyze', project, region, create_cluster_op.output,
                                   schema, train_data,
                                   '%s/{{workflow.name}}/analysis' % output)
            transform_op = TransformOp('transform', project, region,
                                       create_cluster_op.output, train_data, eval_data,
                                       target, analyze_op.output,
                                       '%s/{{workflow.name}}/transform' % output)
            train_op = TrainerOp('train', project, region, create_cluster_op.output,
                                 transform_op.outputs['train'], transform_op.outputs['eval'],
                                 target, analyze_op.output, workers, rounds,
                                 '%s/{{workflow.name}}/model' % output)
            predict_op = PredictOp('predict', project, region, create_cluster_op.output,
                                   transform_op.outputs['eval'], train_op.output, target,
                                   analyze_op.output,
                                   '%s/{{workflow.name}}/predict' % output)
            cm_op = ConfusionMatrixOp('confusion-matrix', predict_op.output,
                                      '%s/{{workflow.name}}/confusionmatrix' % output)
            roc_op = RocOp('roc', predict_op.output, true_label,
                           '%s/{{workflow.name}}/roc' % output)
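
To actually run a pipeline like this, the SDK compiles it and submits it through the KFP client. A hedged sketch; the experiment and run names are placeholders, and the client assumes a configured or in-cluster KFP endpoint.

```python
# Compile the pipeline function above and submit a run.
import kfp

kfp.compiler.Compiler().compile(xgb_train_pipeline, "xgb_pipeline.tar.gz")
client = kfp.Client()  # assumes a reachable Kubeflow Pipelines endpoint
exp = client.create_experiment("xgb-demo")
run = client.run_pipeline(exp.id, "xgb-run", "xgb_pipeline.tar.gz")
```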
12. Kale: abstracting away the Kubeflow Pipelines SDK. Kale takes you from (local) Jupyter Notebooks to (remote) Kubeflow Pipelines.
13. Jupyter Notebook to KFP: a local Jupyter notebook becomes a Kubeflow Pipeline running on the cloud. Example: a three-step pipeline with create-matrices, transpose, and matmul.
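
A plausible notebook behind that example: three cells that map one-to-one onto the pipeline steps. The code itself is a sketch, not taken from the deck.

```python
import numpy as np

# --- cell: create-matrices ---
A = np.random.rand(4, 3)
B = np.random.rand(4, 3)

# --- cell: transpose (depends on create-matrices) ---
Bt = B.T

# --- cell: matmul (depends on both previous steps) ---
C = A @ Bt  # (4, 3) @ (3, 4) -> (4, 4)
print(C.shape)
```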
14. Kale main components: nbparser derives the pipeline structure (networkx); static_analyzer identifies dependencies (Pyflakes); marshal injects data objects between steps (Odo, dill); codegen generates & deploys the pipeline (Jinja2), with support for hyperparameter tuning.
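
The derived pipeline structure is naturally a DAG. A sketch of how it can be modelled with networkx, one of the libraries listed above; the graph API shown is networkx's own, not Kale's internals.

```python
# Model the matmul example pipeline as a directed acyclic graph.
import networkx as nx

g = nx.DiGraph()
g.add_edge("create-matrices", "transpose")
g.add_edge("create-matrices", "matmul")
g.add_edge("transpose", "matmul")

# A topological order gives a valid execution order for the steps.
print(list(nx.topological_sort(g)))  # ['create-matrices', 'transpose', 'matmul']
```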
15. static_analysis and marshal at work: Kale analyzes the pipeline steps (create-matrices, transpose, matmul) and injects the marshalling code that moves data objects between them.
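
The marshalling idea in a nutshell: each step serializes the objects later steps need, and each downstream step deserializes what it consumes. A sketch with dill; the directory and helper names are illustrative, not Kale's actual API.

```python
# Pass Python objects between pipeline steps via serialized files.
import dill

MARSHAL_DIR = "/marshal"  # e.g. a volume mounted into every pipeline step

def save(obj, name):
    with open(f"{MARSHAL_DIR}/{name}.dill", "wb") as f:
        dill.dump(obj, f)

def load(name):
    with open(f"{MARSHAL_DIR}/{name}.dill", "rb") as f:
        return dill.load(f)

# Step "create-matrices" would end with:   save(A, "A"); save(B, "B")
# Step "transpose" would then start with:  B = load("B")
```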