Slide 1

Slide 1 text

From notebook to production in Vertex AI
Wessel Huising & Daniel van der Ende

Slide 2

Slide 2 text

● Data & AI Consultancy and Training
● We help organizations be successful with Data and AI
● Mix of Data Engineers, Data Scientists, Machine Learning Engineers, Analytics Engineers & Analytics Translators
● Based in Amsterdam & Eindhoven
● Part of Xebia Data & AI

● The best Payment Service Provider out there
● Founded in 2004 by Adriaan Mol
● Mission to simplify financial services by creating world-class products
● Currently active for merchants in European Economic Area (EEA), Switzerland, and the United Kingdom

Slide 3

Slide 3 text

The challenge

Slide 4

Slide 4 text

Getting Machine Learning models to production is hard

Manual predictions are convenient
● Low effort
● Highly dependent on one specific Data Scientist
● Prone to human errors
● Labor intensive

Data Scientists ≠ Software Engineers
● Data Scientists tend not to have a traditional Software Engineering background
● Tend to lack understanding of DevOps

ML models are something else
● Different from regular software artifacts
● There is significant overlap

Slide 5

Slide 5 text

The classic ML routine

Slide 6

Slide 6 text

The classic ML routine
[Diagram: INPUT DATA]

Slide 7

Slide 7 text

The classic ML routine
[Diagram: INPUT DATA]
Lack of a centralized data source

Slide 8

Slide 8 text

The classic ML routine
[Diagram: INPUT DATA]
Code not living in a Python package

Slide 9

Slide 9 text

The classic ML routine
[Diagram: INPUT DATA]
Missing lineage tracking of artifacts

Slide 10

Slide 10 text

The classic ML routine
[Diagram: INPUT DATA]
No version control of the code

Slide 11

Slide 11 text

The classic ML routine
[Diagram: INPUT DATA]
No automatic or scheduled predictions

Slide 12

Slide 12 text

The classic ML routine
[Diagram: INPUT DATA]
Predictions all over the place

Slide 13

Slide 13 text

What is MLOps? And should you want it?

Slide 14

Slide 14 text

MLOps = Data + ML + DevOps
● Bringing a model into a production state
● Reliable & automated service, but it comes with extra costs
● MLOps is an end-to-end process
● Results in great ML software

Slide 15

Slide 15 text

The big decision

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

The big decision
● Open Source solutions
● Managed solutions

Slide 18

Slide 18 text

What is Vertex AI?

Slide 19

Slide 19 text

What does Google tell us?
“Build, deploy, and scale ML models faster, with pre-trained and custom tooling within a unified artificial intelligence platform.”

Slide 20

Slide 20 text

The Buzzword Bingo
Vertex AI has a few (important) components:
● Metadata
● Endpoints
● Models
● Pipelines
● Workbench
● Features
● Datasets
● And more…

Slide 22

Slide 22 text

From notebook to production

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

Goal: Use machine learning to create a model that predicts which passengers survived the Titanic shipwreck

Slide 25

Slide 25 text

Step 1: Let’s explore the problem!
Vertex AI component: Workbench
In a Workbench we can do “traditional” Data Science without any limitations or strings attached.
Each Data Scientist has their own Virtual Machine (VM). Each VM:
● Has access to data
● Is persistent
● Can be configured to work with VSCode or PyCharm on your local machine
● Can have the specs you need/want

Slide 26

Slide 26 text

Step 2: Deploy it as if you’re Google
Vertex AI components: Metadata, Models, Pipelines, Datasets, Endpoints
Time to move our code out of notebooks! In this step, we’ll:
● Use the Pipeline components that Google provides out of the box to set up a training pipeline.
● Train our model and output Datasets and Models.
● Deploy our model to an Endpoint so it is available for consumption by downstream users.

Slide 27

Slide 27 text

Let’s take a step back…

Good
✅ We have an ML Pipeline
✅ It’s all code
✅ It can be scheduled and kicked off automatically
✅ Everything now is traceable

Not so good
❌ Automated Train/Test splits are nice, but also opaque and not very configurable
❌ Where is the model evaluation step?
❌ Everything disappears into one big “train the model” step, including preprocessing.
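
Two of the gaps on this slide, the opaque train/test split and the missing evaluation step, are easy to picture in plain Python. The sketch below is only an illustration under stated assumptions: the function names, the `Row` layout, and the toy rows are hypothetical and made up for this example; they are not real Titanic data and not the code used in the talk.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical row layout: (features, label), where label 1 means "survived".
Row = Tuple[Tuple[float, ...], int]


def train_test_split(
    rows: Sequence[Row], test_fraction: float = 0.2, seed: int = 42
) -> Tuple[List[Row], List[Row]]:
    """Split rows into (train, test) with an explicit fraction and seed."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A dedicated Random instance keeps the split reproducible per seed.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict: Callable[[Tuple[float, ...]], int], rows: Sequence[Row]) -> float:
    """Explicit evaluation step: share of rows the predictor gets right."""
    if not rows:
        raise ValueError("cannot evaluate on an empty set of rows")
    correct = sum(1 for features, label in rows if predict(features) == label)
    return correct / len(rows)


if __name__ == "__main__":
    # Made-up toy rows: features are (is_female, passenger_class).
    toy_rows: List[Row] = [
        ((1.0, 1.0), 1), ((0.0, 3.0), 0), ((1.0, 2.0), 1), ((0.0, 1.0), 1),
        ((0.0, 2.0), 0), ((1.0, 3.0), 0), ((0.0, 3.0), 0), ((1.0, 1.0), 1),
        ((0.0, 2.0), 0), ((1.0, 2.0), 1),
    ]
    train_rows, test_rows = train_test_split(toy_rows, test_fraction=0.2, seed=7)

    def rule_based(features: Tuple[float, ...]) -> int:
        # Stand-in "model": predict survival when is_female == 1.
        return int(features[0] == 1.0)

    print(f"train={len(train_rows)} test={len(test_rows)}")
    print(f"test accuracy={accuracy(rule_based, test_rows):.2f}")
```

Keeping the split and the evaluation as separate, parameterized functions means the fraction, the seed, and the metric are visible and configurable, instead of being hidden inside one training step.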

Slide 28

Slide 28 text

Step 3: Now Mollie-fy it!
Vertex AI components: Metadata, Models, Pipelines, Datasets, Endpoints
To mitigate the downsides while keeping the upsides, we use the Kubeflow API.
It integrates well with Vertex AI and can be easily customized.
Mollievert is our package with customized components to simplify and clarify our ML Pipelines.

Slide 29

Slide 29 text

Wrapping up
Mollie’s ML Platform enables robust MLOps practices by:
● Making it easier to go to production without lowering the bar
● Empowering Data Scientists and ML Engineers with tooling
● Defining a ‘Golden Path’ to production, but allowing customization if desired.

Slide 30

Slide 30 text

No content