Efficient Continuous Model Integration with Polyaxon + Kubeflow / Continuous ML Model Integration with Polyaxon and Kubeflow Pipelines

Slides from my talk at the 9th MLOps Study Group Tokyo (Online): https://mlops.connpass.com/event/215133/

Shotaro Kohama

July 13, 2021

Transcript

  1. Confidential & Proprietary 2021. Machine Learning at Mercari US

    • Price Suggestion Feature
    • Smart Pricing Feature
    Mercari engineering | Price Guidance System leveraging Artificial Intelligence Techniques https://medium.com/mercari-engineering/price-guidance-system-74358bd96081
  2. Agenda

    0. Machine Learning Development Lifecycle
    1. Model Exploration with Polyaxon
    2. Continuous Training with Kubeflow Pipelines
    3. What we built to accelerate ML project iterations
  3. Agenda

    0. Machine Learning Development Lifecycle
    1. Model Exploration with Polyaxon
    2. Continuous Training with Kubeflow Pipelines
    3. What we built to accelerate ML project iterations
  4. ML Development Lifecycle

    ML projects are highly iterative, so accelerating iteration is key to a project's success. We accelerate iterations by automating manual processes with open-source MLOps and DevOps tools.
    Organizing machine learning projects: project management guidelines. https://www.jeremyjordan.me/ml-projects-guide/
  5. ML Development Lifecycle at Mercari US

    • Model Exploration with Polyaxon
    • Continuous Training with Kubeflow Pipelines (KFP)
    • Continuous Delivery with Spinnaker
    Organizing machine learning projects: project management guidelines. https://www.jeremyjordan.me/ml-projects-guide/
  6. Agenda

    0. Machine Learning Development Lifecycle
    1. Model Exploration with Polyaxon
    2. Continuous Training with Kubeflow Pipelines
    3. What we built to accelerate ML project iterations
  7. Model Exploration with Polyaxon

    [NOTE] This is the Polyaxon v0.6.1 UI. The UI is different in Polyaxon v1.x.
  8. Model Exploration with Polyaxon

        ---
        version: 1
        kind: group
        hptuning:
          concurrency: 100
          matrix:
            learning_rate:
              linspace: 0.001:0.1:5
            dropout:
              values: [0.25, 0.3]
            activation:
              values: [relu, sigmoid]
        declarations:
          batch_size: 128
          num_steps: 500
          num_epochs: 1
        build:
          image: tensorflow/tensorflow:2.4.2-py3
          build_steps:
            - pip3 install --no-cache-dir -U polyaxon-helper
        run:
          cmd: python3 model.py --batch_size={{ batch_size }} \
                                --num_steps={{ num_steps }} \
                                --learning_rate={{ learning_rate }} \
                                --dropout={{ dropout }} \
                                --num_epochs={{ num_epochs }} \
                                --activation={{ activation }}

        $ polyaxon run -u -f polyaxon_gridsearch.yml
        ...
        Creating an experiment group with the following definition:
        ----------------  -----------------
        Search algorithm  grid
        Concurrency       5 concurrent runs
        Early stopping    deactivated
        ----------------  -----------------
        Experiment group 1 was created

    [NOTE] This is the Polyaxon v0.6.1 specification. The specification is different in Polyaxon v1.x.
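To make the size of this search concrete, the sketch below enumerates the grid described by the `matrix` section in plain Python. It assumes that `linspace: 0.001:0.1:5` means five evenly spaced values from 0.001 to 0.1 inclusive (the NumPy `linspace` convention). The helper names are hypothetical, and this is not Polyaxon's own implementation.

```python
from itertools import product


def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop, inclusive."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def expand_grid(matrix):
    """Yield one dict of hyperparameters per point of the grid."""
    names = list(matrix)
    for combo in product(*(matrix[name] for name in names)):
        yield dict(zip(names, combo))


# Assumed reading of the `matrix` section of the Polyaxonfile.
matrix = {
    "learning_rate": linspace(0.001, 0.1, 5),
    "dropout": [0.25, 0.3],
    "activation": ["relu", "sigmoid"],
}

experiments = list(expand_grid(matrix))
print(len(experiments))  # 5 * 2 * 2 combinations
```

Under these assumptions the group contains 20 experiments, one per combination of learning rate, dropout, and activation.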
  9. How to run an experiment on Polyaxon

    1. Prepare the code for model training
    2. Define the hyperparameter tuning job in a Polyaxonfile
    3. Upload the Polyaxonfile and the code with the Polyaxon CLI
    4. Polyaxon runs each job on Kubernetes
    5. Visualize the experiment results in the UI
    My Favorite Point: because the code is uploaded together with the Polyaxonfile, a change can be run interactively right away, which is convenient.
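The training code from step 1 has to accept the flags that the Polyaxonfile's `run.cmd` passes in. A minimal sketch of such an entry point follows. The flag names come from the Polyaxonfile on the previous slide, while the defaults, the types, and the `parse_args` helper are assumptions, and the training logic itself is omitted.

```python
import argparse


def parse_args(argv=None):
    """Parse the hyperparameters passed on the command line by `run.cmd`."""
    parser = argparse.ArgumentParser(description="Hypothetical model.py entry point")
    parser.add_argument("--batch_size", type=int, default=128)
    parser.add_argument("--num_steps", type=int, default=500)
    parser.add_argument("--num_epochs", type=int, default=1)
    parser.add_argument("--learning_rate", type=float, default=0.001)
    parser.add_argument("--dropout", type=float, default=0.25)
    parser.add_argument("--activation", choices=["relu", "sigmoid"], default="relu")
    return parser.parse_args(argv)


# Simulate one experiment of the group; a real script would call parse_args()
# with no arguments and then start training with these values.
args = parse_args(["--learning_rate=0.05", "--dropout=0.3", "--activation=sigmoid"])
print(vars(args))
```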
  10. Polyaxon at Mercari US

    Period of use
    • In use since around February 2019, roughly two and a half years
    Projects and experiments (as of May 2021)
    • 175 Projects
    • About 870,000 Experiments
    Infrastructure
    • Google Kubernetes Engine
    • Google Cloud Storage for logs, data, and artifacts
    • Regular, Preemptible x CPU, GPU node-pools
    • Google Filestore as NFS Persistent Volume
  11. Continuous Training with Kubeflow Pipelines

    Kubeflow Pipelines (KFP) is a Machine Learning Workflow Engine
    • KFP is a container-based workflow engine that runs on Kubernetes
    • KFP has a metadata store that saves the inputs and outputs of each step
    • Pipelines can be written in a DSL using the Python SDK
  12. Continuous Model Delivery with Spinnaker

    KFP connects Polyaxon and Spinnaker for CD
    • KFP runs Polyaxon jobs on a schedule and builds a Docker image for serving the new model
    • Spinnaker automatically runs a deployment when a new Docker image appears in the registry
    Mercari engineering | Continuous delivery and automation pipelines in machine learning with Polyaxon and Kubeflow Pipelines https://medium.com/mercari-engineering/continuous-delivery-and-automation-pipelines-in-machine-learning-with-polyaxon-and-kubeflow-d6a3668715de
  13. Agenda

    0. Machine Learning Development Lifecycle
    1. Model Exploration with Polyaxon
    2. Continuous Training with Kubeflow Pipelines
    3. What we built to accelerate ML project iterations
  14. What we built to accelerate Iterations

    • Monorepo for Kubeflow Pipelines: a monorepo lets us manage pipeline versions in CI and share best practices.
    • Manifests to manage projects on Polyaxon and KFP: KFP and Polyaxon resources are defined in YAML so that they can be managed like Infrastructure as Code.
    • A KFP component to submit a Polyaxon Job: the component makes it easy to submit a Polyaxon job from KFP.
  15. Monorepo for Kubeflow Pipelines

        $ tree mercari-us-kubeflow-pipelines
        mercari-us-kubeflow-pipelines
        ├── components    # directory for KFP components
        ├── docs          # directory for documents
        ├── package
        │   └── merkfp    # python package for lightweight KFP components
        ├── pipelines     # directory for each project pipelines
        │   └── mercari-us-ml-price-suggestion
        │       └── train_model.py
        ├── projects      # directory for “project” manifests
        │   └── mercari-us-ml-price-suggestion.yml
        └── scripts       # directory for scripts on continuous integration

    KFP + Continuous Integration
    • A Python package defines KFP lightweight components and constants such as Secret names
    • Pipeline versions are managed as branch_name + commit_hash
    • CI compiles and uploads only the pipelines that were modified
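The last two CI rules can be sketched as follows. This assumes that the version label simply joins a sanitized branch name and a short commit hash, and that a pipeline counts as modified when a changed file lives under `pipelines/<project>/`. Both rules, and the function names, are illustrative assumptions rather than the actual CI scripts.

```python
import re
from pathlib import PurePosixPath


def pipeline_version(branch_name, commit_hash, hash_length=7):
    """Build a version label from the branch name and a short commit hash."""
    safe_branch = re.sub(r"[^A-Za-z0-9._-]+", "-", branch_name).strip("-")
    return f"{safe_branch}-{commit_hash[:hash_length]}"


def modified_pipelines(changed_files, root="pipelines"):
    """Return the sorted project directories under `root` touched by a change."""
    projects = set()
    for name in changed_files:
        parts = PurePosixPath(name).parts
        # Expect at least pipelines/<project>/<file>.
        if len(parts) >= 3 and parts[0] == root:
            projects.add(parts[1])
    return sorted(projects)


# Example input, such as the output of `git diff --name-only` in CI.
changed = [
    "pipelines/mercari-us-ml-price-suggestion/train_model.py",
    "docs/README.md",
]
version = pipeline_version("feature/new-model", "0123456789abcdef")
to_upload = modified_pipelines(changed)
print(version, to_upload)
```

Only the projects returned by `modified_pipelines` would then need to be compiled and uploaded under the new version label.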
  16. “Project” Manifest for KFP and Polyaxon

    CI creates resources like Infrastructure as Code
    • KFP Experiments and Polyaxon Projects can be defined in YAML
    • CI creates them from the YAML to keep Dev and Prod consistent
    • GitHub CODEOWNERS entries are generated as well

    mercari-ml-price-suggestion-us.yml

        ---
        kind: Project
        name: mercari-us-ml-price-suggestion
        experiments:
          - name: "Default"
          - name: "Sneakers"
          - name: "Trading Cards"
        owners:
          - github: "@kouzoh/mercari-price-suggest-us-prod"
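How a manifest could drive CODEOWNERS generation is sketched below. The manifest is written as an already-parsed dictionary to keep the example dependency-free. The mapping from a project name to the `/pipelines/<name>/` path and the function name are assumptions, not the actual CI code.

```python
def codeowners_line(manifest, root="/pipelines"):
    """Render one CODEOWNERS entry for a parsed "Project" manifest."""
    if manifest.get("kind") != "Project":
        raise ValueError("expected a manifest with kind: Project")
    owners = [owner["github"] for owner in manifest.get("owners", [])]
    if not owners:
        raise ValueError("a project needs at least one owner")
    return f"{root}/{manifest['name']}/ {' '.join(owners)}"


# Parsed form of the manifest from the slide.
manifest = {
    "kind": "Project",
    "name": "mercari-us-ml-price-suggestion",
    "experiments": [
        {"name": "Default"},
        {"name": "Sneakers"},
        {"name": "Trading Cards"},
    ],
    "owners": [{"github": "@kouzoh/mercari-price-suggest-us-prod"}],
}

line = codeowners_line(manifest)
print(line)
```

Generating the file from the same manifest that creates the KFP and Polyaxon resources keeps ownership and resources in one reviewed place.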
  17. Polyaxon Kubeflow Pipelines Component

    1. An init container clones the private repo using a secret
    2. The main container logs in to Polyaxon using a secret
    3. The training job is submitted through the Polyaxon API
    4. The component tails the logs while waiting for the job to finish
    5. The Project, Job ID, Status, and so on are output to the next step
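Steps 3 to 5 amount to a submit-then-poll loop. The sketch below shows that control flow against a stand-in client. `FakePolyaxonClient`, its methods, and the status strings are hypothetical placeholders for illustration and do not reflect the real Polyaxon API; log tailing is left out.

```python
import time


class FakePolyaxonClient:
    """Stand-in for a real client; reports a scripted sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = list(statuses)

    def submit(self, project):
        return {"project": project, "job_id": 1}

    def get_status(self, job_id):
        # Consume statuses until one remains, then keep returning the last one.
        if len(self._statuses) > 1:
            return self._statuses.pop(0)
        return self._statuses[0]


def run_and_wait(client, project, poll_seconds=0.0, max_polls=100):
    """Submit a job, poll until it reaches a final status, return the outputs."""
    job = client.submit(project)
    final_statuses = {"succeeded", "failed", "stopped"}
    for _ in range(max_polls):
        status = client.get_status(job["job_id"])
        if status in final_statuses:
            # These values are what the next pipeline step would receive.
            return {**job, "status": status}
        time.sleep(poll_seconds)
    raise TimeoutError("job did not finish within the polling budget")


client = FakePolyaxonClient(["scheduled", "running", "succeeded"])
result = run_and_wait(client, "mercari-us-ml-price-suggestion")
print(result)
```

Bounding the loop with `max_polls` is a design choice of this sketch so that a stuck job fails the step instead of blocking the pipeline indefinitely.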
  18. Takeaways

    1. Polyaxon helps us achieve scalable and reproducible model exploration
    2. Monorepo + CI for KFP works to keep consistency and to spread best practices
    3. A custom KFP component for Polyaxon enables us to move forward seamlessly