ML Job Management with Argo Workflow

MACHINE LEARNING Meetup KANSAI #4
2019/3/27

Livesense Inc.

March 27, 2019

Transcript

  1. ML Job Management with Argo Workflow
    Shotaro Tanaka / @yubessy / Livesense (Kyoto office)
    MACHINE LEARNING Meetup KANSAI #4 LT
  2. This talk introduces the following project:

  3. https://argoproj.github.io/

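  4. What it can do
    "Container native workflow engine for Kubernetes"
    • Define workflows that run multiple containers in series or in parallel
    • Intended for data pipelines, CI/CD, and similar uses
    • Newer versions also support DAGs
    • Various products are built on Argo
      • Argo CD: CD based on GitOps
      • Argo Events: triggers for workflows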
  5. apiVersion: argoproj.io/v1alpha1
    kind: Workflow
    metadata:
      generateName: ml-workflow-
    spec:
      entrypoint: main
      templates:
      - name: main
        steps:
        - - name: load-dataset
            template: load-dataset
        - - name: train-model-1
            template: train-model
            arguments:
              parameters: [{name: model, value: model1}]
          - name: train-model-2
            template: train-model
            arguments:
              parameters: [{name: model, value: model2}]
      ...
  6. ...
      - name: load-dataset
        container:
          image: postgres:latest
          command: [sh, -c]
          args: ["psql db -c 'SELECT * FROM dataset' -A -F, > dataset.csv"]
      - name: train-model
        inputs:
          parameters: [{name: model}]
        container:
          image: train-model
          command: [sh, -c]
          args: ["python train_model.py --model={{inputs.parameters.model}}"]
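Slide 4 mentions that newer versions also support DAGs. As a rough sketch (not taken from the slides), the steps-based workflow on slides 5 and 6 could be written with a `dag` template instead, where each task lists its `dependencies` explicitly. The `generateName` value is an assumption; the container templates are copied from slide 6.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: ml-workflow-dag-   # hypothetical name
spec:
  entrypoint: main
  templates:
  - name: main
    dag:
      tasks:
      - name: load-dataset
        template: load-dataset
      # Both training tasks depend only on load-dataset,
      # so they can start in parallel once it finishes.
      - name: train-model-1
        dependencies: [load-dataset]
        template: train-model
        arguments:
          parameters: [{name: model, value: model1}]
      - name: train-model-2
        dependencies: [load-dataset]
        template: train-model
        arguments:
          parameters: [{name: model, value: model2}]
  - name: load-dataset
    container:
      image: postgres:latest
      command: [sh, -c]
      args: ["psql db -c 'SELECT * FROM dataset' -A -F, > dataset.csv"]
  - name: train-model
    inputs:
      parameters: [{name: model}]
    container:
      image: train-model
      command: [sh, -c]
      args: ["python train_model.py --model={{inputs.parameters.model}}"]
```

With `steps`, ordering comes from the nesting of the list; with `dag`, it comes from `dependencies`, which tends to be easier to read once the graph is no longer a simple sequence.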
  7. None
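  8. Why use it
    "The model is ready, so I want to get it into production quickly"
    • ML model developers
      • Fetch data with SQL, then write the model and predictions to files
      • Make it runnable in Docker
    • ML system developers
      • Implement DB I/O and the delivery of models and prediction results
      • Build a workflow in Argo that combines everything
    → Responsibilities are divided at the container level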
  9. Use cases at Livesense
    • Separating the DB write step from model output
    • Continuous Delivery of models
    • Parallel processing

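  10. Separating the DB write step from model output
    • Prediction model for controlling search ranking on a job-listing site
    • Trains and predicts in batch, then writes the output to the DB
    • Model developers implement everything up to CSV output and Dockerize it
    • System developers implement the write step and credential management
    steps:
    - - name: train-model    # implemented by ML model developers
    - - name: predict-rates  # implemented by ML model developers (output is CSV)
    - - name: import-to-db   # implemented by ML system developers
    # Note: output files are handed over through a shared volume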
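The slide says only that output files are handed over through a shared volume. One way this might look, as a sketch under assumptions, is a `volumeClaimTemplates` entry mounted by both steps. The volume name `workdir`, the `/work` path, the `predict-rates` image, the `predict_rates.py` script, and the `rates` table are all hypothetical, and the training step is omitted for brevity.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: predict-and-import-   # hypothetical name
spec:
  entrypoint: main
  # A PersistentVolumeClaim is created for each workflow run.
  volumeClaimTemplates:
  - metadata:
      name: workdir
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
  templates:
  - name: main
    steps:
    - - name: predict-rates
        template: predict-rates
    - - name: import-to-db
        template: import-to-db
  # Owned by the model developer: only writes a CSV file.
  - name: predict-rates
    container:
      image: predict-rates            # hypothetical image
      command: [sh, -c]
      args: ["python predict_rates.py > /work/rates.csv"]
      volumeMounts:
      - name: workdir
        mountPath: /work
  # Owned by the system developer: reads the CSV and loads it into the DB.
  - name: import-to-db
    container:
      image: postgres:latest
      command: [sh, -c]
      args: ["psql db -c '\\copy rates FROM /work/rates.csv WITH (FORMAT csv)'"]
      volumeMounts:
      - name: workdir
        mountPath: /work
```

Because the model container only touches the mounted directory, database credentials would need to be supplied only to the `import-to-db` step.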
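  11. Continuous Delivery of models
    • Model that estimates the effect of job postings, used for sales strategy and ad placement
    • A viewer for marketing staff is developed and operated with R-Shiny
    • Each time estimation finishes, the viewer is redeployed to update the model
    steps:
    - - name: estimate       # estimation
    - - name: upload-model   # save the generated model to storage
    - - name: update-viewer  # redeploy the viewer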
  12. Continuous Delivery of models (cont.)
    • The viewer also runs as a Deployment in the same Kubernetes cluster
    • Updating the Deployment with kubectl set env makes it load the new model
    • Rolling Update also allows the model to be updated with no downtime
    - name: update-viewer
      container:
        image: kubectl
        command: ["sh", "-c"]
        args: ["kubectl set env deployment/viewer-app MODEL={{workflow.parameters.model}}"]
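The template above reads `{{workflow.parameters.model}}`, which has to be declared at the workflow level. A minimal sketch of the surrounding manifest follows; the default value, the `generateName`, and the service account name are assumptions, and the account is assumed to be allowed to update the `viewer-app` Deployment. The added `kubectl rollout status` call is not on the slide; it only makes the step wait until the rolling update has finished.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: update-viewer-          # hypothetical name
spec:
  entrypoint: update-viewer
  serviceAccountName: viewer-deployer   # hypothetical; needs rights on the Deployment
  arguments:
    parameters:
    - name: model
      value: model-2019-03-27           # hypothetical default
  templates:
  - name: update-viewer
    container:
      image: kubectl                    # placeholder image name, as on the slide
      command: [sh, -c]
      # Changing an env var alters the pod template, which triggers a rolling update.
      args: ["kubectl set env deployment/viewer-app MODEL={{workflow.parameters.model}} && kubectl rollout status deployment/viewer-app"]
```

The parameter can be overridden at submission time, for example with `argo submit -p model=<name>`.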
  13. Parallel processing
    • Weight-update job for multi-armed bandit optimization of web tests
    • Several tests run at once, and the estimation for each test should run in parallel
    steps:
    - - name: list-experiments
        # List the tests that need estimation
    - - name: calc-weights
        # Run this step in parallel, once per listed test.
        # Passing a list as an output parameter starts one container per item.
        # The list is JSON such as [{"experimentId": 1}, {"experimentId": 2}]
        withParam: "{{steps.list-experiments.outputs.parameters.experiments}}"
        # Extract a parameter from each list item and pass it on
        arguments:
          parameters: [{name: experimentId, value: "{{item.experimentId}}"}]
  14. Parallel processing (cont.)
    templates:
    - name: list-experiments
      container: ...
      outputs:
        parameters:
        - name: experiments
          # Read the output parameter list from a file
          valueFrom: {path: /output/experiments.json}
    - name: calc-weights
      container: ...
      inputs:
        parameters:
        # Receive the parameter as an input value
        - name: experimentId
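Putting slides 13 and 14 together, a self-contained fan-out might look like the sketch below. The containers are stand-ins (the slides elide them with `...`): here `list-experiments` just writes a fixed JSON list and `calc-weights` just echoes its input. Note that Argo's field for looping over a JSON parameter is `withParam`.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: calc-weights-   # hypothetical name
spec:
  entrypoint: main
  templates:
  - name: main
    steps:
    - - name: list-experiments
        template: list-experiments
    # One calc-weights pod is started per element of the JSON list.
    - - name: calc-weights
        template: calc-weights
        withParam: "{{steps.list-experiments.outputs.parameters.experiments}}"
        arguments:
          parameters: [{name: experimentId, value: "{{item.experimentId}}"}]
  - name: list-experiments
    container:
      image: alpine:3.9           # stand-in image
      command: [sh, -c]
      # Stand-in for the real query: write a fixed list of experiments.
      args: ["mkdir -p /output && echo '[{\"experimentId\": 1}, {\"experimentId\": 2}]' > /output/experiments.json"]
    outputs:
      parameters:
      - name: experiments
        valueFrom: {path: /output/experiments.json}
  - name: calc-weights
    inputs:
      parameters:
      - name: experimentId
    container:
      image: alpine:3.9           # stand-in image
      command: [sh, -c]
      args: ["echo calculating weights for experiment {{inputs.parameters.experimentId}}"]
```

The number of parallel pods is decided at run time by the contents of `experiments.json`, so the workflow definition does not change when tests are added or removed.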
  15. None
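  16. Summary
    • Workflows made up of multiple containers are easy to assemble
    • Handy when you want to get an ML model you built into production quickly
    There is also an article: "Argo によるコンテナネイティブなデータパイプラインのワークフロー管理" (Workflow management for container-native data pipelines with Argo)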