Slide 1

Slide 1 text

MLflow and the Machine Learning Lifecycle
Giulia BIANCHI, Fares OUESLATI, Loïc DIVAD
@XebiaFr

Slide 2

Slide 2 text

Giulia BIANCHI, Data Scientist @XebiaFr, @Giuliabianchl
Fares OUESLATI, ML Engineer @XebiaFr, @fares_oueslati
Loïc DIVAD, Software Engineer @XebiaFr, @LoicMDivad

Slide 3

Slide 3 text

3 @Xebiconfr #Xebicon18 @Giuliabianchl

Slide 4

Slide 4 text

Raw data, Data processing, Model training, Model serving

Slide 5

Slide 5 text

tools, tuning, scale, model exchange

Slide 6

Slide 6 text


Slide 7

Slide 7 text


Slide 8

Slide 8 text


Slide 9

Slide 9 text


Slide 10

Slide 10 text

A simple Machine Learning example

Slide 11

Slide 11 text

● Dataset from scikit-learn
● Predict diabetes progression given
  ○ Age
  ○ Sex
  ○ Body Mass Index (BMI)
  ○ Blood Pressure (BP)
  ○ Other blood measurements
Reference: LARS paper

Slide 12

Slide 12 text

tuning

Slide 13

Slide 13 text

Before MLflow

Slide 14

Slide 14 text


Slide 15

Slide 15 text

import mlflow
import mlflow.sklearn

with mlflow.start_run():
    model = Model(param_1, param_2)
    model.fit(train_data, label)
    prediction = model.predict(test_data)
    (rmse, mae, r2) = eval_metrics(test_label, prediction)

    mlflow.log_param("param_1", param_1)
    mlflow.log_param("param_2", param_2)

    mlflow.log_metric("rmse", rmse)
    mlflow.log_metric("r2", r2)
    mlflow.log_metric("mae", mae)

    mlflow.sklearn.log_model(model, "model")
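
The tracking snippet calls eval_metrics without defining it. A minimal sketch of such a helper follows, assuming it takes the true labels and the predictions and returns (rmse, mae, r2) in that order. It uses only the standard library; the handling of constant targets is a choice made for this sketch, not something the slides specify.

```python
import math


def eval_metrics(actual, predicted):
    """Return (rmse, mae, r2) for two equal-length sequences of numbers."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and of equal length")

    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)

    rmse = math.sqrt(ss_res / n)              # root mean squared error
    mae = sum(abs(e) for e in errors) / n     # mean absolute error

    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    # R2 is undefined for constant targets; this sketch reports 0.0 in that case.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return rmse, mae, r2


rmse, mae, r2 = eval_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
print(rmse, mae, r2)
```

The three values can then be passed to mlflow.log_metric as on the slide.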

Slide 16

Slide 16 text

Tracking Server
Local Apps, Notebooks, Cloud Jobs
UI
REST API

Slide 17

Slide 17 text

17 @Xebiconfr #Xebicon18 @fares_oueslati scale

Slide 18

Slide 18 text


Slide 19

Slide 19 text

# MLproject file
name: My Project
conda_env: my_env.yaml
entry_points:
  main:
    parameters:
      data_file:
      regularization: {type: float, default: 0.1}
    command: "python train.py -r {regularization} {data_file}"
  validate:
    parameters:
      data_file:
    command: "python validate.py {data_file}"
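
To illustrate how an entry point's command relates to its parameters, here is a rough sketch that fills the {name} placeholders with user-supplied values and declared defaults. The function build_command and the dictionary layout are hypothetical and only mirror the file above; this is not how MLflow itself resolves parameters.

```python
def build_command(entry_point, user_params):
    """Fill an entry point's command template with user values and defaults."""
    resolved = {}
    for name, spec in entry_point["parameters"].items():
        if name in user_params:
            resolved[name] = user_params[name]
        elif isinstance(spec, dict) and "default" in spec:
            resolved[name] = spec["default"]
        else:
            raise ValueError("missing required parameter: {}".format(name))
    return entry_point["command"].format(**resolved)


# The "main" entry point of the MLproject file, written as a plain dict.
main = {
    "parameters": {
        "data_file": None,
        "regularization": {"type": "float", "default": 0.1},
    },
    "command": "python train.py -r {regularization} {data_file}",
}

print(build_command(main, {"data_file": "data.csv"}))
# prints: python train.py -r 0.1 data.csv
```

A parameter without a default, such as data_file, must be supplied by the caller; regularization falls back to 0.1.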

Slide 20

Slide 20 text

# Remote Run
$ mlflow run git@github.com:mlflow/mlflow-example.git -P alpha=0.5

# Local Run
$ mlflow run . -P alpha=0.5

Slide 21

Slide 21 text

model exchange

Slide 22

Slide 22 text


Slide 23

Slide 23 text

ML Frameworks
Model Format: flavor 1, flavor 2
Inference Code, Batch & Stream Scoring, Cloud Serving Tools

Slide 24

Slide 24 text

# MLmodel file
artifact_path: model
flavors:
  # Usable by any tool that can run Python (Docker, Spark etc.)
  python_function:
    data: model.pkl
    loader_module: mlflow.sklearn
    python_version: 2.7.15
  # Usable by tools that understand the Sklearn model format
  sklearn:
    pickled_model: model.pkl
    sklearn_version: 0.19.2
run_id: 32357c10ae854113b0503e880e7433c1
utc_time_created: '2018-10-29 11:31:01.434417'
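
One way to read the MLmodel file: a consuming tool looks for the first flavor it understands and ignores the rest. The sketch below shows that idea on a plain dictionary holding the same fields; pick_flavor is a hypothetical helper, not part of MLflow.

```python
# The MLmodel file above, written as a plain dict for illustration.
MLMODEL = {
    "artifact_path": "model",
    "flavors": {
        "python_function": {
            "data": "model.pkl",
            "loader_module": "mlflow.sklearn",
            "python_version": "2.7.15",
        },
        "sklearn": {
            "pickled_model": "model.pkl",
            "sklearn_version": "0.19.2",
        },
    },
}


def pick_flavor(mlmodel, supported):
    """Return (name, config) of the first supported flavor the model offers."""
    flavors = mlmodel.get("flavors", {})
    for name in supported:
        if name in flavors:
            return name, flavors[name]
    raise LookupError("no supported flavor among: {}".format(sorted(flavors)))


# A tool that prefers the native sklearn format gets it.
print(pick_flavor(MLMODEL, ["sklearn", "python_function"])[0])
# A tool that only knows generic Python functions falls back to python_function.
print(pick_flavor(MLMODEL, ["keras", "python_function"])[0])
```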

Slide 25

Slide 25 text

MLflow built-in flavors
pyfunc, rfunc, h2o, keras, sklearn, spark, mleap, pytorch, tensorflow

Slide 26

Slide 26 text

pyfunc, rfunc: generic & self-contained flavor that describes how to run the model as a lambda function
h2o, keras, sklearn, spark, mleap, pytorch, tensorflow

Slide 27

Slide 27 text


Slide 28

Slide 28 text

package Azureml
package Sagemaker
package Sparkml

Slide 29

Slide 29 text

MLflow API example: sklearn flavour
mlflow flavor API:
● save_model
● log_model
● load_model
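
To make the save_model / load_model pair concrete, here is a toy flavor written with the standard library only. It is a sketch under loud assumptions: the "toy" flavor name, the MLmodel.json metadata file and the directory layout are invented for illustration and are not MLflow's implementation. log_model, which also attaches the saved directory to the active run, is left out.

```python
import json
import os
import pickle
import tempfile


def save_model(model, path):
    """Pickle the model into a new directory and describe it in a metadata file."""
    os.makedirs(path)  # fails if the directory already exists
    with open(os.path.join(path, "model.pkl"), "wb") as f:
        pickle.dump(model, f)
    metadata = {"flavors": {"toy": {"pickled_model": "model.pkl"}}}
    with open(os.path.join(path, "MLmodel.json"), "w") as f:
        json.dump(metadata, f)


def load_model(path):
    """Read the metadata file, then unpickle the model it points to."""
    with open(os.path.join(path, "MLmodel.json")) as f:
        conf = json.load(f)["flavors"]["toy"]
    with open(os.path.join(path, conf["pickled_model"]), "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    model_dir = os.path.join(tmp, "model")
    save_model({"coef": [0.1, 0.2]}, model_dir)
    restored = load_model(model_dir)

print(restored)
```

The loader never guesses the file name: it reads it from the metadata, which is the role the flavor entry plays in an MLmodel file.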

Slide 30

Slide 30 text

30 @Xebiconfr #Xebicon18 @LoicMDivad

Slide 31

Slide 31 text

The MLflow project structure

$ tree .
├── R
├── azureml
├── entities
├── java
├── models
├── projects
├── protos
├── sagemaker
├── server
├── store
├── tracking
└── utils

Slide 32

Slide 32 text

The MLflow project structure
The three main modules (tracking, projects, models) are materialized by Python packages.

Slide 33

Slide 33 text

The MLflow project structure
Managed solutions from cloud providers (azureml, sagemaker) have their own package.

Slide 34

Slide 34 text

The MLflow project structure
Programming languages other than Python (R, java) have their own subproject.

Slide 35

Slide 35 text


Slide 36

Slide 36 text

Experiment, Run, Param, RunData, Metric, MLflowObject

Slide 37

Slide 37 text

Experiment, Run, Param, RunData, Metric, MLflowObject

artifact_path: model
flavors:
  python_function:
    data: model.pkl
    loader_module: mlflow.sklearn
  sklearn:
    pickled_model: model.pkl
    sklearn_version: 0.19.1
run_id: cf5db2cc7c0d4074bbccd970d912e1c8
utc_time_created: '2018-07-28 15:49:49.055985'

Slide 38

Slide 38 text


Slide 39

Slide 39 text

$ tree .
├── azureml
├── entities
├── ...
├── java
└── store
    ├── abstract_store.py
    ├── artifact_repo.py
    ├── azure_blob_artifact_repo.py
    ├── dbfs_artifact_repo.py
    ├── file_store.py
    ├── gcs_artifact_repo.py
    ├── local_artifact_repo.py
    ├── rest_store.py
    ├── s3_artifact_repo.py
    └── sftp_artifact_repo.py

Slide 40

Slide 40 text

Experiment Metadata
● Runs
● Parameters
● Metrics
● ...
AbstractStore: FileStore, RestStore

Large artifacts
● Datasets
● Models
● Images
● ...
ArtifactRepository: S3, GCS, DBFS, ...

Slide 41

Slide 41 text

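# Module paths follow the mlflow/store package of MLflow v0.8.0
if artifact_uri.startswith("s3:/"):
    from mlflow.store.s3_artifact_repo import S3ArtifactRepository
elif artifact_uri.startswith("gs:/"):
    from mlflow.store.gcs_artifact_repo import GCSArtifactRepository
elif artifact_uri.startswith("wasbs:/"):
    from mlflow.store.azure_blob_artifact_repo import AzureBlobArtifactRepository
elif artifact_uri.startswith("sftp:/"):
    from mlflow.store.sftp_artifact_repo import SFTPArtifactRepository
elif artifact_uri.startswith("dbfs:/"):
    from mlflow.store.dbfs_artifact_repo import DbfsArtifactRepository
else:
    from mlflow.store.local_artifact_repo import LocalArtifactRepository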
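
The same dispatch can be sketched as a lookup table keyed by URI scheme. Note the swap: the slide tests string prefixes, while this sketch parses the scheme with urllib. The repository names are returned as strings so the example runs without any cloud SDK installed; REPOSITORY_BY_SCHEME and repository_name_for are hypothetical names, not MLflow code.

```python
from urllib.parse import urlparse

# Scheme -> repository class name, taken from the branches on the slide.
REPOSITORY_BY_SCHEME = {
    "s3": "S3ArtifactRepository",
    "gs": "GCSArtifactRepository",
    "wasbs": "AzureBlobArtifactRepository",
    "sftp": "SFTPArtifactRepository",
    "dbfs": "DbfsArtifactRepository",
}


def repository_name_for(artifact_uri):
    """Pick a repository by URI scheme; anything unknown is treated as local."""
    scheme = urlparse(artifact_uri).scheme
    return REPOSITORY_BY_SCHEME.get(scheme, "LocalArtifactRepository")


for uri in ("s3://bucket/run/artifacts", "gs://bucket/run", "/tmp/mlruns/0/run/artifacts"):
    print(uri, "->", repository_name_for(uri))
```

Adding a new backend then means adding one table entry rather than one more elif branch.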

Slide 42

Slide 42 text

from google.cloud import storage as gcs_storage
# or
from azure.storage.blob import BlockBlobService

# requires
# GOOGLE_APPLICATION_CREDENTIALS
# AWS_SECRET_ACCESS_KEY
# ETC ...

Slide 43

Slide 43 text


Slide 44

Slide 44 text

Server
Tracking Store

.
└── EXPERIMENT-0
    ├── RUN-4a59d6
    │   ├── artifacts
    │   ├── meta.yaml
    │   ├── metrics
    │   └── params
    └── RUN-99d663
        ├── artifacts
        ├── meta.yaml
        ├── metrics
        └── params
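
The directory layout above can be reproduced with a few lines of standard-library Python. This is only a sketch of a file-based store: write_run is a hypothetical helper, and storing one file per parameter and per metric, as well as the content of meta.yaml, are assumptions made for the example.

```python
import os
import tempfile


def write_run(root, experiment, run, params, metrics):
    """Create <root>/<experiment>/<run>/ with the four entries from the tree."""
    run_dir = os.path.join(root, experiment, run)
    for sub in ("artifacts", "metrics", "params"):
        os.makedirs(os.path.join(run_dir, sub), exist_ok=True)
    # Run-level metadata lives next to the three directories.
    with open(os.path.join(run_dir, "meta.yaml"), "w") as f:
        f.write("name: {}\n".format(run))
    # Assumption: each parameter and metric gets its own small file.
    for kind, values in (("params", params), ("metrics", metrics)):
        for key, value in values.items():
            with open(os.path.join(run_dir, kind, key), "w") as f:
                f.write("{}\n".format(value))
    return run_dir


with tempfile.TemporaryDirectory() as root:
    run_dir = write_run(root, "EXPERIMENT-0", "RUN-4a59d6", {"alpha": 0.5}, {"rmse": 0.79})
    entries = sorted(os.listdir(run_dir))

print(entries)
# prints: ['artifacts', 'meta.yaml', 'metrics', 'params']
```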

Slide 45

Slide 45 text

Server
Tracking Store

$ mlflow server

Launches a Flask server with four workers by default. It receives byte streams from the client API and serializes the result in an artifact repository.

Slide 46

Slide 46 text

1. Gives all JVM users access to MLflow
2. A CRUD interface to MLflow, available as a Maven artifact
3. Comes with the MLeap flavour to save a Spark model in SparkML or MLeap format

MlflowClient client = new MlflowClient();
long expId = client.createExperiment(expName);
// ...
RunInfo runCreated = client.createRun(expId, sourceFile);
String runId = runCreated.getRunUuid();

// Log parameters
client.logParam(runId, "min_samples_leaf", "2");
client.logParam(runId, "max_depth", "3");

// Log metrics
client.logMetric(runId, "auc", 2.12F);
client.logMetric(runId, "accuracy_score", 3.12F);
client.logMetric(runId, "zero_one_loss", 4.12F);

// Finished run
client.setTerminated(runId, RunStatus.FINISHED);

Slide 47

Slide 47 text

MLflow v0.8.0 is still in alpha:
➢ Load data from diverse formats (e.g. CSV vs Parquet)
➢ Database backend tracking store
➢ Common hyperparameter tuning libraries integration
➢ Built-in Spark MLlib & PyTorch integration
➢ Support HDFS Artifact Repository
➢ ...

Slide 48

Slide 48 text

➢ MLflow lets you keep track of results and make them reproducible
  ○ So you iterate faster and run through the machine learning life cycle
➢ The goal of MLflow is to make it easier to switch between tools
➢ MLflow is an open-source, open-interface solution
➢ The Machine Learning platform is a tool to unify Data Science and Engineering

Slide 49

Slide 49 text
