• SQL, SQL, and more SQL: 30%
• Clean, clean, clean, and more cleaning: 20%
• Plan the MLOps pipeline, read GDPR rules, etc.: 40%
• Finally start experimenting with models: 5%
• Choose which experiments go to actual production: 5%
ML Systems
https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
- DevOps is a popular practice in developing and operating large-scale software systems. It involves CI (Continuous Integration) and CD (Continuous Delivery).
- MLOps entails implementing automation and monitoring at all steps of ML system construction, including integration, testing, releasing, deployment, and infrastructure management.
- With ML systems, CT (Continuous Training: automatically retraining and serving the model) is new, and the CI part also covers testing of the data schema, models, etc. on top of the usual unit tests.
[Diagram: MLOps CI/CD pipeline. (1) CI: test, build & package pipeline components. (2) CD: pipeline deployment. (3) Data pipeline (data extraction, data preparation) feeding the ML pipeline (model training, model evaluation). (4) CD: model serving.]
https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
Machine learning models
• ML needs big data (no-brainer)
• Big data processing needs scalable and reliable pipelines (well hello, Data Engineer)
• The MLOps pipeline needs to automate the preprocessing of data for training and at prediction time (ML is magic only if the data is not garbage)
• ML pipelines are needed to keep models up to date and easy to integrate (who needs a stale model?)
Reeeaaaal goooood at SQL
• Develops code that preprocesses data efficiently (Pandas, PySpark, etc.)
• Be familiar with data pipeline products that can scale (Airflow, Apache NiFi, etc.; see the sketch below)
• Set up 24*7 data monitoring
• Know how to set up a DevOps pipeline, or at least be able to work with it.
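As a minimal sketch of what such a scalable preprocessing pipeline can look like in Apache Airflow (2.x style), here is a daily extract-then-clean DAG; the DAG id, task ids, and callables are hypothetical placeholders, not part of the original slides:

# Minimal Airflow 2.x DAG sketch: a daily extract -> clean preprocessing flow.
# All names (dag_id, task ids, callables) are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_raw_data(**context):
    """Placeholder: pull the day's raw data, e.g. via a SQL query."""


def clean_and_store(**context):
    """Placeholder: preprocess with Pandas/PySpark and write the result out."""


with DAG(
    dag_id="daily_preprocessing",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_raw_data)
    clean = PythonOperator(task_id="clean", python_callable=clean_and_store)
    extract >> clean  # run cleaning only after extraction succeeds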
Set up a reliable ML pipeline with training, testing, and prediction automated on changes to code or data
• Know how to use different data versioning tools and machine learning pipeline technologies (SageMaker, DVC, Splitgraph, etc.; see the sketch below)
• Build ML models and know various frameworks (PyTorch, TensorFlow, Keras, etc.)
• Be able to stitch the Data Engineering pipeline together with the Machine Learning pipeline
• Maintain the production model
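To illustrate data versioning, here is a minimal sketch using DVC's Python API to read a pinned snapshot of the training data; the repository URL, file path, and tag are hypothetical placeholders:

# Minimal DVC sketch: read a specific, versioned snapshot of the training data.
# The repo URL, path, and rev/tag below are hypothetical placeholders.
import io

import dvc.api
import pandas as pd

raw_csv = dvc.api.read(
    path="data/train.csv",                          # file tracked by DVC in that repo
    repo="https://github.com/example/ml-project",   # placeholder repository
    rev="v1.2",                                     # git tag/commit pinning the data version
)
dataset = pd.read_csv(io.StringIO(raw_csv))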
Create storage buckets for the Python package containing the training code & for storing the model
• Project setup, for example in GCP:
• Training code is structured as a Python package in the /trainer subdirectory.

import logging
import os

from sklearn import model_selection
from trainer import metadata, utils  # helper modules from the template's trainer package


def _train_and_evaluate(estimator, dataset, output_dir):
    """Runs model training and evaluation.

    Args:
      estimator: (pipeline.Pipeline), Pipeline instance, assemble pre-processing
        steps and model training
      dataset: (pandas.DataFrame), DataFrame containing training data
      output_dir: (string), directory that the trained model will be exported

    Returns:
      None
    """
    x_train, y_train, x_val, y_val = utils.data_train_test_split(dataset)
    estimator.fit(x_train, y_train)

    # Note: for now, use `cross_val_score` defaults (i.e. 3-fold)
    scores = model_selection.cross_val_score(estimator, x_val, y_val, cv=3)

    logging.info(scores)

    # Write model and eval metrics to `output_dir`
    model_output_path = os.path.join(output_dir, 'model', metadata.MODEL_FILE_NAME)
    metric_output_path = os.path.join(output_dir, 'experiment', metadata.METRIC_FILE_NAME)

    utils.dump_object(estimator, model_output_path)
    utils.dump_object(scores, metric_output_path)

Reference: https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/sklearn/sklearn-template/template/trainer
Reference: https://cloud.google.com/ai-platform/docs/getting-started-keras
Train & deploy the model on AI Platform
Reference: https://cloud.google.com/ai-platform/training/docs/overview
• Point AI Platform to the Python package path and assign the module it will run.
• job-dir is the path to Google Cloud Storage where intermediate files and outputs are stored.
• Specify the number of nodes on which the job will run (a sketch of the equivalent API call follows below).
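The annotations above describe the job-submission parameters; below is a minimal sketch of submitting such a training job through the AI Platform REST API from Python (the project, bucket, job id, and package URI are hypothetical placeholders; the same can be done with the gcloud CLI):

# Minimal sketch: submit an AI Platform training job via the Python client.
# Project, bucket, job id, and package URI below are hypothetical placeholders.
import googleapiclient.discovery

project_id = 'my-gcp-project'
job_spec = {
    'jobId': 'sklearn_training_001',
    'trainingInput': {
        'packageUris': ['gs://my-bucket/packages/trainer-0.1.tar.gz'],  # Python package path
        'pythonModule': 'trainer.task',          # module that AI Platform will run
        'jobDir': 'gs://my-bucket/job-output',   # where intermediate files and outputs go
        'region': 'us-central1',
        'scaleTier': 'STANDARD_1',               # controls the number/type of nodes
        'runtimeVersion': '2.1',
        'pythonVersion': '3.7',
    },
}

ml_service = googleapiclient.discovery.build('ml', 'v1')
ml_service.projects().jobs().create(
    parent='projects/{}'.format(project_id), body=job_spec
).execute()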
1. Create a model resource in AI Platform
2. Create a model version and serve it
3. Once the service is available, you can invoke it as shown in the code.
Reference: https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/sklearn/sklearn-template/template/scripts/predict.py

import googleapiclient.discovery


def predict(project, model, data, version=None):
    """Run predictions on a list of instances.

    Args:
      project: (str), project where the Cloud ML Engine Model is deployed.
      model: (str), model name.
      data: ([[any]]), list of input instances, where each input instance is a
        list of attributes.
      version: str, version of the model to target.

    Returns:
      Mapping[str: any]: dictionary of prediction results defined by the model.
    """
    service = googleapiclient.discovery.build('ml', 'v1')
    name = 'projects/{}/models/{}'.format(project, model)

    if version is not None:
        name += '/versions/{}'.format(version)

    response = service.projects().predict(
        name=name,
        body={'instances': data}
    ).execute()

    if 'error' in response:
        raise RuntimeError(response['error'])

    return response['predictions']
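A hypothetical call to the predict() helper above; the project, model, and feature values are placeholders, and each inner list is one input instance:

# Hypothetical usage of the predict() helper; names and values are placeholders.
predictions = predict(
    project='my-gcp-project',
    model='my_sklearn_model',
    data=[[39, 7, 77516, 13]],  # one instance: a list of feature values
    version='v1',
)
print(predictions)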
[Diagram (recap): MLOps CI/CD pipeline. (1) CI: test, build & package pipeline components. (2) CD: pipeline deployment. (3) Data pipeline (data extraction, data preparation) feeding the ML pipeline (model training, model evaluation). (4) CD: model serving.]
https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
• Learn a cloud platform (choose one of your choice; GCP is my personal fav)
• Learn data pipeline tools
• Experiment with data pipeline tools in the cloud
• Use CI/CD (DevOps) pipelines in your projects
• Learn about ML pipeline tools
• Experiment with ML pipeline tools in the cloud
• Learn to integrate the data pipeline with the ML pipeline in the cloud