Slide 2

Slide 2 text

Efficient Model Exploring and Continuous Delivery With Polyaxon + Kubeflow
Shotaro Kohama, Mercari

Slide 3

Slide 3 text

What is Mercari?
● Mercari is a customer-to-customer marketplace where individuals can give new life to items that have fallen out of use by selling them to other customers.
● Our unique challenge is pricing: since items are in various conditions, we need to help customers find the right price for their items, which is where we leverage ML.
● Today, I'm going to talk about the operations side of our ML pipeline, which enables features like our Price Guidance System.

Slide 4

Slide 4 text

Machine Learning at Mercari US
Price Guidance System
● The Price Suggestion feature recommends a viable price range for an item during the listing process.
● The Smart Pricing feature continuously updates the listing price until it hits a user-specified floor price or the item is sold.
● An ML model takes an item's title, description, category, brand, and condition to suggest the listing and floor prices in real time.
Mercari engineering | Price Guidance System leveraging Artificial Intelligence Techniques
https://medium.com/mercari-engineering/price-guidance-system-74358bd96081

Slide 5

Slide 5 text

Agenda
0. Machine Learning Development Lifecycle
1. Model Exploration with Polyaxon
2. Continuous Training with Kubeflow Pipelines
3. What we built to accelerate ML project iterations

Slide 7

Slide 7 text

ML Development Lifecycle
ML projects are highly iterative. Accelerating the iteration is the key to a project's success.
We can accelerate iterations by automating manual processes with open-source MLOps and DevOps tools.
Organizing machine learning projects: project management guidelines.
https://www.jeremyjordan.me/ml-projects-guide/

Slide 8

Slide 8 text

ML Project Lifecycle at Mercari US
Model Exploration with Polyaxon
Polyaxon is an MLOps tool that supports scalable and reproducible model exploration.
Continuous Training with Kubeflow Pipelines
Kubeflow Pipelines is an open-source tool for managing ML training pipelines. On top of it, we set up a scheduled job that builds a Docker image to serve a newly trained model.
Continuous Delivery with Spinnaker
Spinnaker is an open-source Continuous Delivery platform. It can trigger a deployment pipeline when an image is pushed to an image registry.
Organizing machine learning projects: project management guidelines.
https://www.jeremyjordan.me/ml-projects-guide/

Slide 10

Slide 10 text

What is Polyaxon?
Scalable and Reproducible Experiments
● Polyaxon is an MLOps tool that supports scalable and reproducible model exploration.
● Polyaxon provides a YAML specification for running hyperparameter tuning jobs on Kubernetes. The tuning jobs run in parallel and scale with the cluster autoscaler.
● The YAML specification lets other developers reproduce an experiment easily.
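To make the idea of a parallel tuning job concrete, here is a minimal sketch in plain Python. It is not Polyaxon's API or file format: the search space, the concurrency value, and the toy loss function are all assumptions made for illustration. It expands a parameter grid into trials and runs them with a bounded number in flight, which is roughly what a tuning group does across a cluster.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search space; in Polyaxon this would live in the YAML specification.
MATRIX = {
    "learning_rate": [0.01, 0.1, 0.5],
    "batch_size": [32, 64],
}
CONCURRENCY = 2  # maximum number of trials running at once (assumed value)


def expand_grid(matrix):
    """Return one dict of parameters per point of the Cartesian grid."""
    names = sorted(matrix)
    return [
        dict(zip(names, values))
        for values in itertools.product(*(matrix[n] for n in names))
    ]


def run_trial(params):
    """Toy stand-in for a training run: returns a loss for the given parameters."""
    loss = (params["learning_rate"] - 0.1) ** 2 + abs(params["batch_size"] - 64) / 1000
    return {"params": params, "loss": loss}


def run_group(matrix, concurrency):
    """Run every trial of the grid with at most `concurrency` trials in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(run_trial, expand_grid(matrix)))


def best_trial(results):
    """Pick the trial with the lowest loss."""
    return min(results, key=lambda r: r["loss"])
```

Because the whole search is described by `MATRIX` and `CONCURRENCY`, another developer can rerun the same set of trials from the same declaration, which is the reproducibility point the slide makes about the YAML file.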

Slide 11

Slide 11 text

How to run a job on Polyaxon
1. Define a Polyaxonfile to run a parameter tuning job.
2. Create code to train an ML model.
3. Upload the Polyaxonfile and the code with the Polyaxon CLI.
4. Experiments will run on Kubernetes.
My Favorite Point
Polyaxon builds the Docker image that runs the training code, so a developer doesn't have to wait for CI to build an image after every code change. That keeps the development flow from being interrupted.
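Step 2 above asks for training code that a tuning job can call with different parameters. A minimal sketch of such an entry point follows, assuming the parameters arrive as command-line flags; the flag names and the toy objective are hypothetical and stand in for a real model.

```python
import argparse


def parse_args(argv=None):
    """Read hyperparameters from the command line (flag names are assumptions)."""
    parser = argparse.ArgumentParser(description="Toy training entry point")
    parser.add_argument("--learning-rate", type=float, default=0.1)
    parser.add_argument("--epochs", type=int, default=50)
    return parser.parse_args(argv)


def train(learning_rate, epochs):
    """Minimise (w - 3)^2 by gradient descent; return the final weight and loss."""
    w = 0.0
    for _ in range(epochs):
        grad = 2.0 * (w - 3.0)
        w -= learning_rate * grad
    return w, (w - 3.0) ** 2


def main(argv=None):
    """Parse the flags, train, and report the loss so a tuner can compare runs."""
    args = parse_args(argv)
    _, loss = train(args.learning_rate, args.epochs)
    print(f"learning_rate={args.learning_rate} epochs={args.epochs} loss={loss:.6f}")
    return loss
```

Adding the usual `if __name__ == "__main__": main()` guard turns this into a script; each trial of a tuning job would then be one invocation with its own flag values.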

Slide 12

Slide 12 text

Polyaxon at Mercari US
How long we've been using Polyaxon
● For about two years, since February 2019.
How many projects/experiments we've run
● 175 projects
● About 87,000 experiments
What infrastructure we've been using
● Google Kubernetes Engine
● Google Cloud Storage for logs, data, and artifacts
● Regular and preemptible node pools, each in CPU and GPU variants
● Google Filestore as an NFS Persistent Volume

Slide 13

Slide 13 text

What is Kubeflow Pipelines?
Kubeflow Pipelines (KFP) is a Machine Learning Workflow Engine
● Kubeflow Pipelines is an open-source tool for managing end-to-end machine learning pipelines.
● Kubeflow Pipelines has an integrated metadata store. The inputs and outputs of each stage are automatically recorded in it.
● Kubeflow Pipelines lets a developer easily implement reusable components with its Python SDK.
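As an illustration of the last bullet, the sketch below wraps a plain Python function as a component. It assumes the KFP v1 SDK, where `kfp.components.create_component_from_func` is available; the talk does not name an SDK version, and the function `price_range` and the base image are hypothetical.

```python
def price_range(predicted_price: float, margin: float) -> str:
    """Toy component body: format a price range around a predicted price."""
    low = predicted_price * (1.0 - margin)
    high = predicted_price * (1.0 + margin)
    return f"{low:.2f}-{high:.2f}"


def as_kfp_component(func, base_image="python:3.8"):
    """Wrap `func` as a KFP component, or return None if the KFP v1 SDK is missing."""
    try:
        # Present in the KFP v1 SDK (kfp 1.x); the SDK version is an assumption.
        from kfp.components import create_component_from_func
    except ImportError:
        return None
    return create_component_from_func(func, base_image=base_image)
```

Functions wrapped this way need to be self-contained, with type-annotated arguments and any imports placed inside the function body, because only the function's source is shipped into the container.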

Slide 14

Slide 14 text

Kubeflow Pipelines at Mercari US
KFP connects Polyaxon and Spinnaker for Continuous Model Deployment
● The KFP Web UI lets a developer set up a scheduled job.
● A pipeline submits a training job to Polyaxon and builds a Docker image to serve the newly trained model.
● Spinnaker can automatically trigger a deployment pipeline when a new Docker image is pushed to a Docker registry.
Mercari engineering | Continuous delivery and automation pipelines in machine learning with Polyaxon and Kubeflow Pipelines
https://medium.com/mercari-engineering/continuous-delivery-and-automation-pipelines-in-machine-learning-with-polyaxon-and-kubeflow-d6a3668715de

Slide 15

Slide 15 text

What we built to accelerate iterations
Monorepo for Kubeflow Pipelines
We built a monorepo to manage pipeline versions in a git workflow and to share best practices.
Manifests to manage projects on Polyaxon and KFP
We defined a manifest that provisions resources on Kubeflow Pipelines and Polyaxon, in the style of infrastructure as code.
A KFP component to submit a Polyaxon job
We developed a Kubeflow Pipelines component that submits a job from KFP to Polyaxon.

Slide 16

Slide 16 text

Monorepo for Kubeflow Pipelines
KFP + Continuous Integration (CI)
● The monorepo contains KFP components and a Python package for defining lightweight KFP components.
● CI detects modified pipelines, then compiles and uploads them with the version branch_name + commit_hash.
● When a branch is merged into the main branch, CI uploads the updated pipelines to the production cluster.

$ tree mercari-us-kubeflow-pipelines
mercari-us-kubeflow-pipelines
├── components    # directory for KFP components
├── docs          # directory for documents
├── package
│   └── merkfp    # python package for lightweight KFP components
├── pipelines     # directory for each project's pipelines
│   └── mercari-us-ml-price-suggestion
│       └── train_model.py
├── projects      # directory for “project” manifests
│   └── mercari-us-ml-price-suggestion.yml
└── scripts       # directory for scripts on continuous integration
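The CI behaviour described on this slide can be sketched as two small functions. This is a guess at the shape of such a script, not the actual one in `scripts`: the list of changed files is assumed to come from something like `git diff --name-only`, and the hyphen separator, the slash replacement, and the seven-character hash are assumptions, since the slide only says the version is branch_name + commit_hash.

```python
from pathlib import PurePosixPath


def modified_pipelines(changed_files):
    """Map each touched project under pipelines/ to its changed pipeline files."""
    result = {}
    for name in changed_files:
        parts = PurePosixPath(name).parts
        # Expect pipelines/<project>/<file>.py, following the monorepo layout.
        if len(parts) == 3 and parts[0] == "pipelines" and parts[2].endswith(".py"):
            result.setdefault(parts[1], []).append(name)
    return result


def pipeline_version(branch_name, commit_hash, short=7):
    """Build a version label from the branch name and an abbreviated commit hash."""
    safe_branch = branch_name.replace("/", "-")  # assumed: keep the label slash-free
    return f"{safe_branch}-{commit_hash[:short]}"
```

For example, a change to `pipelines/mercari-us-ml-price-suggestion/train_model.py` on branch `feature/new-model` would yield one project to recompile and a label such as `feature-new-model-0123456`; files under `docs` or `components` are ignored by this sketch.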

Slide 17

Slide 17 text

“Project” Manifest for KFP and Polyaxon
Continuous Integration (CI) creates resources like Infrastructure as Code
● CI creates KFP experiments and Polyaxon projects for the development and production environments to keep them consistent.
● CI generates GitHub code owners based on “owners”, which lets each team approve pull requests that modify its project-related code.

---
kind: Project
name: mercari-us-ml-price-suggestion
experiments:
  - name: "Default"
  - name: "Sneakers"
  - name: "Trading Cards"
owners:
  - github: "@kouzoh/mercari-price-suggest-us-prod"

mercari-ml-price-suggestion-us.yml
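One way the code-owners step could work is sketched below. The manifest is written as the dict a YAML parser would return, so the example needs no third-party package. Which paths a project owns is an assumption here (its directory under `pipelines` and its manifest under `projects`); the talk does not spell out the mapping.

```python
def codeowners_lines(manifest):
    """Return CODEOWNERS entries for the paths that belong to one project."""
    name = manifest["name"]
    owners = " ".join(owner["github"] for owner in manifest.get("owners", []))
    if not owners:
        return []
    # Assumed ownership scope: the project's pipelines and its own manifest.
    paths = [f"/pipelines/{name}/", f"/projects/{name}.yml"]
    return [f"{path} {owners}" for path in paths]


# The manifest from this slide, as the dict a YAML parser would produce.
MANIFEST = {
    "kind": "Project",
    "name": "mercari-us-ml-price-suggestion",
    "experiments": [
        {"name": "Default"},
        {"name": "Sneakers"},
        {"name": "Trading Cards"},
    ],
    "owners": [{"github": "@kouzoh/mercari-price-suggest-us-prod"}],
}
```

Each returned line follows the CODEOWNERS pattern of a path followed by one or more owners, so CI could concatenate the lines for all manifests into a single file.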

Slide 18

Slide 18 text

Polyaxon Kubeflow Pipelines Component
1. An init container clones a private repo with a secret.
2. The main container logs a user in to Polyaxon with a secret.
3. The main container submits a training job through the Polyaxon API.
4. The main container tails the logs until the job ends.
5. The component outputs Project, User, Job ID, and Status for the next step.
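Steps 3 to 5 amount to a submit-then-poll loop. The sketch below shows that control flow against a fake client. `FakeTrainingClient`, its methods, the job ID, and the set of terminal statuses are all hypothetical; they are not the Polyaxon API, and a real component would replace them with authenticated calls.

```python
import time
from dataclasses import dataclass, field

DONE_STATUSES = {"succeeded", "failed", "stopped"}  # assumed terminal states


@dataclass
class FakeTrainingClient:
    """Hypothetical stand-in for a training service client; not the Polyaxon API."""

    statuses: list = field(default_factory=lambda: ["running", "running", "succeeded"])
    polls: int = 0

    def submit(self, project, user):
        return 42  # made-up job ID

    def status(self, job_id):
        current = self.statuses[min(self.polls, len(self.statuses) - 1)]
        self.polls += 1
        return current

    def new_logs(self, job_id):
        return [f"log line {self.polls}"]


def run_training_step(client, project, user, poll_seconds=0.0):
    """Submit a job, follow its logs until it finishes, and return the step outputs."""
    job_id = client.submit(project, user)
    while True:
        for line in client.new_logs(job_id):
            print(line)
        status = client.status(job_id)
        if status in DONE_STATUSES:
            break
        time.sleep(poll_seconds)
    return {"project": project, "user": user, "job_id": job_id, "status": status}
```

Returning the project, user, job ID, and status as one record mirrors step 5: a downstream step, such as the image build, can decide whether to proceed based on the status.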

Slide 19

Slide 19 text

Continuous Training with Polyaxon + KFP
Mercari engineering | Continuous delivery and automation pipelines in machine learning with Polyaxon and Kubeflow Pipelines
https://medium.com/mercari-engineering/continuous-delivery-and-automation-pipelines-in-machine-learning-with-polyaxon-and-kubeflow-d6a3668715de

Slide 20

Slide 20 text

Takeaways
1. Polyaxon suits scalable and reproducible model exploration.
2. Monorepo + CI for KFP works well for keeping efficiency and consistency high.
3. A custom KFP component for Polyaxon lets us move forward seamlessly.

Slide 21

Slide 21 text

Thanks!