GDD X Mollie

Marketing OGZ
September 19, 2022

Transcript

  1. GDD

    • Data & AI Consultancy and Training
    • We help organizations be successful with Data and AI
    • Mix of Data Engineers, Data Scientists, Machine Learning Engineers, Analytics Engineers & Analytics Translators
    • Based in Amsterdam & Eindhoven
    • Part of Xebia Data & AI

    Mollie

    • The best Payment Service Provider out there
    • Founded in 2004 by Adriaan Mol
    • Mission to simplify financial services by creating world-class products
    • Currently active for merchants in the European Economic Area (EEA), Switzerland, and the United Kingdom
  2. Bringing Machine Learning models to production is hard

    Manual predictions are convenient
    • Low effort
    • Highly dependent on one specific Data Scientist
    • Prone to human errors
    • Labor intensive

    Data Scientists ≠ Software Engineers
    • Data Scientists tend not to have a traditional Software Engineering background
    • Tend to lack understanding of DevOps

    ML models are something else
    • Different from regular software artifacts
    • There is significant overlap
  3. MLOps = Data + ML + DevOps

    • Bringing a model into a production state
    • Reliable & automated service, but comes with extra costs
    • MLOps is an end-to-end process
    • Results in great ML software
  4. What does Google tell us?

    “Build, deploy, and scale ML models faster, with pre-trained and custom tooling within a unified artificial intelligence platform.”
  5. The Buzzword Bingo

    Vertex AI has a few (important) components:
    • Metadata
    • Endpoints
    • Models
    • Pipelines
    • Workbench
    • Features
    • Datasets
    • And more…
  7. Goal: Use machine learning to create a model that predicts which passengers survived the Titanic shipwreck
  8. Step 1: Let’s explore the problem!

    Workbench

    In a Workbench we can do “traditional” Data Science without any limitations or strings attached. Each DS has their own Virtual Machine.

    Each VM:
    • Has access to data
    • Is persistent
    • Can be configured to work with VSCode or PyCharm on your local machine
    • Can have the specs you need/want
  9. Step 2: Deploy it as if you’re Google

    Metadata, Models, Pipelines, Datasets, Endpoints

    Time to move our code out of notebooks! In this step, we’ll:
    • Use the Pipeline components that Google provides out of the box to set up a training pipeline.
    • Train our model and output Datasets and Models.
    • Deploy our model to an Endpoint so it is available for consumption by downstream users.
  10. Let’s take a step back…

    Good
    ✅ We have an ML Pipeline
    ✅ It’s all code
    ✅ It can be scheduled and kicked off automatically
    ✅ Everything now is traceable

    Not so good
    ❌ Automated Train/Test splits are nice, but also opaque and not very configurable
    ❌ Where is the model evaluation step?
    ❌ Everything disappears into one big “train the model” step, including preprocessing.
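
To make the two complaints about the split and the missing evaluation concrete, here is a minimal sketch in plain Python of what an explicit, configurable train/test split and a separate evaluation step could look like. It is an illustration only, not Vertex AI or Mollie code: the function names, the `(sex, survived)` schema, and the toy rows are all assumptions made up for this example, and the "model" is a trivial per-group majority vote.

```python
import random
from typing import Dict, List, Sequence, Tuple

Row = Tuple[str, int]  # (sex, survived): hypothetical toy schema

# Made-up rows for illustration only; NOT the real Titanic dataset.
TOY_ROWS: List[Row] = [
    ("female", 1), ("female", 1), ("female", 0), ("female", 1), ("female", 1),
    ("male", 0), ("male", 0), ("male", 1), ("male", 0), ("male", 0),
]


def train_test_split(
    rows: Sequence[Row], test_fraction: float, seed: int
) -> Tuple[List[Row], List[Row]]:
    """Shuffle with a fixed seed, then cut, so the split is explicit and reproducible."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train(rows: Sequence[Row]) -> Dict[str, int]:
    """Toy 'model': predict the majority outcome seen for each sex."""
    counts: Dict[str, Tuple[int, int]] = {}
    for sex, survived in rows:
        yes, total = counts.get(sex, (0, 0))
        counts[sex] = (yes + survived, total + 1)
    return {sex: int(2 * yes >= total) for sex, (yes, total) in counts.items()}


def evaluate(model: Dict[str, int], rows: Sequence[Row], default: int = 0) -> float:
    """Separate evaluation step: accuracy on rows the model has not seen."""
    hits = sum(model.get(sex, default) == survived for sex, survived in rows)
    return hits / len(rows)


if __name__ == "__main__":
    train_rows, test_rows = train_test_split(TOY_ROWS, test_fraction=0.3, seed=42)
    print("held-out accuracy:", evaluate(train(train_rows), test_rows))
```

Because the test fraction and the seed are ordinary arguments, the split can be reviewed, changed, and reproduced, and the accuracy comes from its own step rather than from inside training.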
  11. Step 3: Now Mollie-fy it!

    Metadata, Models, Pipelines, Datasets, Endpoints

    To mitigate the downsides, while keeping the upsides, we use the Kubeflow API. It integrates well with Vertex AI and can be easily customized.

    Mollievert is our package with customized components to simplify and clarify our ML Pipelines.
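
The idea of replacing one big “train the model” step with small, customized components can be sketched in plain Python. This is a toy stand-in and deliberately not the Kubeflow Pipelines API or the Mollievert package, whose interfaces are not shown in these slides: the `Pipeline` class, the step names, and the passenger rows are all hypothetical. It only illustrates the structure: each step is a named unit, and every step's output is recorded so the run stays traceable.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Pipeline:
    """Hypothetical mini-runner: executes named steps in order and keeps a trace."""

    steps: List[Tuple[str, Callable[[Any], Any]]] = field(default_factory=list)

    def step(self, name: str) -> Callable[[Callable[[Any], Any]], Callable[[Any], Any]]:
        """Decorator that registers a function as a named pipeline step."""
        def register(func: Callable[[Any], Any]) -> Callable[[Any], Any]:
            self.steps.append((name, func))
            return func
        return register

    def run(self, payload: Any) -> Dict[str, Any]:
        """Feed each step's output into the next and record it under the step name."""
        trace: Dict[str, Any] = {}
        for name, func in self.steps:
            payload = func(payload)
            trace[name] = payload
        return trace


pipeline = Pipeline()


@pipeline.step("preprocess")
def preprocess(rows):
    """Preprocessing is its own visible step: drop rows without an age."""
    return [row for row in rows if row["age"] is not None]


@pipeline.step("train")
def train(rows):
    """Toy 'model': always predict the majority class; pass rows along."""
    positives = sum(row["survived"] for row in rows)
    return {"prediction": int(2 * positives >= len(rows)), "rows": rows}


@pipeline.step("evaluate")
def evaluate(state):
    """Explicit evaluation step. For brevity it scores the training rows;
    a real pipeline would score held-out data instead."""
    hits = sum(state["prediction"] == row["survived"] for row in state["rows"])
    return {"accuracy": hits / len(state["rows"])}


# Made-up rows for illustration only; NOT the real Titanic dataset.
TOY_PASSENGERS = [
    {"age": 22, "survived": 0},
    {"age": None, "survived": 1},
    {"age": 38, "survived": 1},
    {"age": 26, "survived": 1},
]

if __name__ == "__main__":
    for step_name, output in pipeline.run(TOY_PASSENGERS).items():
        print(step_name, "->", output)
```

In a real Kubeflow pipeline each step would be a containerized component with typed inputs and outputs; the sketch keeps only the shape of that design.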
  12. Wrapping up

    Mollie’s ML Platform enables robust MLOps practices by:
    • Making it easier to go to production without lowering the bar
    • Empowering Data Scientists and ML Engineers with tooling
    • Defining a ‘Golden Path’ to production, but allowing customization if desired.