mlct.pdf

By Hirofumi Nakagawa/中河 宏文

Transcript

  1. Mercari's ML Infrastructure
    MLCT vol.5

    hnakagawa

  2. About Me
    • Hirofumi Nakagawa (hnakagawa)
    • Joined in July 2017
    • Member of the SRE team
    • A jack-of-all-trades, from device driver development to front-end development
    • NOT a data scientist
    • https://github.com/hnakagawa

  3. My Work
    • ML Platform development
    • Bridging the skill gap between data scientists and SREs
    • ML Reliability, SysML?, MLOps?
    • Automating ML systems from an SRE's standpoint

  4. ML Platform
    • An in-house ML platform
    • Kubernetes-based
    • Provides an environment for easy Training/Serving using existing ML frameworks

  5. Planned for release as OSS eventually (probably)

  6. ML Use Cases at Mercari
    • Kando listing (感動出品)
    • Detection of violating listings
    • Price suggestion
    • Weight suggestion

    and more…
    Tens of millions of predictions are made per day

  7. ML Platform Architecture
    [Diagram: Kubernetes; Controller; CLI; Cluster Workflow; Dashboard; Storage Gateway; Metrics; Runner Component; Mercari ML Component; External Middleware]

  8. Model Training & Serving Workflow

  9. Workflow for Production
    [Diagram: CI; ML Platform training cluster (Job, Job, …); Model Registry; ML Platform serving cluster for test (REST API, Streaming, TF Serving, …)]

  10. Training Workflow
    [Diagram: CI; ML Platform training cluster (Job, Job, …); Model Registry]
    1. A push to GitHub triggers training
    2. The trained model is uploaded to the Model Registry

  11. Serving Workflow
    [Diagram: Model Registry; ML Platform serving cluster for test (REST API, Streaming, TF Serving, …)]
    1. The Model Registry is monitored and new models are served automatically
    2. When Serving & Test succeed, a k8s manifest for production is output
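The two serving-workflow steps above can be sketched as a single polling pass. This is only an illustration under stated assumptions: the registry is modeled as a plain dict, and `process_new_models`, `serve_and_test`, `production_manifest`, the `MODEL_URI` variable, and the image name are all hypothetical names, not the platform's actual API.

```python
def serve_and_test(model_uri, smoke_test):
    """Hypothetical stand-in for serving a model on the test cluster and testing it."""
    return bool(smoke_test(model_uri))


def production_manifest(model_name, model_uri, image, replicas=2):
    """Build a minimal Kubernetes Deployment manifest as a plain dict."""
    labels = {"app": model_name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": model_name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "serving",
                            "image": image,
                            # MODEL_URI is an assumed convention for locating the model.
                            "env": [{"name": "MODEL_URI", "value": model_uri}],
                        }
                    ]
                },
            },
        },
    }


def process_new_models(registry, seen, smoke_test, image):
    """One polling pass: serve and test each unseen model, emit a manifest on success."""
    manifests = []
    for name, uri in sorted(registry.items()):
        if uri in seen:
            continue
        seen.add(uri)
        if serve_and_test(uri, smoke_test):
            manifests.append(production_manifest(name, uri, image))
    return manifests
```

Calling `process_new_models` repeatedly with the same `seen` set only produces manifests for models that appeared since the previous pass.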

  12. Container Workflow
    [Diagram: Data Source Image; Text Preprocessing Image; Picture Preprocessing Image; Estimator Image; PV ×3]
    Our own implementation

  13. Example Model Serving API Configuration
    [Diagram: TensorFlow Serving; TF Model ×2; Flask; SK Model ×3; gRPC; Mercari API; REST]
    Preprocessing is done in Flask, which then sends requests to the TensorFlow Serving instance behind it
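A rough sketch of the "preprocess, then forward to TensorFlow Serving" step follows. The diagram labels the link to TensorFlow Serving as gRPC; to stay within the standard library, this sketch swaps in TensorFlow Serving's REST predict endpoint (`/v1/models/<name>:predict` with an `instances` body) instead. The `preprocess` logic, the `tokens` input key, and the host and model names are assumptions for illustration, and the Flask layer itself is omitted.

```python
import json
import urllib.request


def preprocess(text):
    """Hypothetical preprocessing: lowercase the text and split on whitespace."""
    return text.lower().split()


def build_predict_request(host, model_name, texts):
    """Build (but do not send) a TensorFlow Serving REST predict request."""
    url = f"http://{host}/v1/models/{model_name}:predict"
    # "tokens" is an assumed input name; a real model defines its own signature.
    body = json.dumps({"instances": [{"tokens": preprocess(t)} for t in texts]})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the request handler in front (Flask in the slide) would then relay the prediction to the caller.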

  14. Model Serving API: Example Configuration (Streaming Version)
    [Diagram: TensorFlow Serving; TF Model ×2; ML Platform Framework or Apache Beam; SK Model ×3; gRPC; PubSub]

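  15. Models and Container Images
    • Whether or not to include a huge ML model in the container image
    • If it is not included, where to place it
    • A trade-off between portability and load time
    • If you have a good idea, please let me know…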

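  16. Different Characteristics from a Typical API
    • The resources handled, model size in particular, are often large (hundreds of MB to several GB)
    • Heavy consumption of CPU and memory resources
    • In some cases GPUs are used as well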

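  17. The Memory Consumption Problem
    • The Python part of the violation detection system consumes about 2 GB of memory at runtime → and this is expected to grow further
    • The preprocessing part written in scikit-learn tends to become large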

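  18. Python and Parallelism
    • Naturally, threads do not help with CPU-bound parallelism (because of the GIL)
    • Loading the model in every process increases the memory required → this becomes an obstacle to Blue-Green Deploy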
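The arithmetic behind this concern can be made concrete with a small estimate. The 2 GB figure comes from the previous slide; the worker count of 4 is an assumption, as is the simplification that processes share no memory and that a blue-green switch briefly runs both versions side by side.

```python
def serving_memory_gb(model_gb, workers, blue_green=True):
    """Estimate peak memory when every worker process loads its own model copy."""
    per_version = model_gb * workers
    # During a blue-green switch the old and new versions run at the same time.
    return per_version * 2 if blue_green else per_version


# 2 GB per process, 4 worker processes (assumed), both versions live during the switch.
peak = serving_memory_gb(2, 4)
```

With these assumed numbers the peak is 16 GB for a single service, which is why per-process model loading gets in the way of blue-green deployment.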

  19. Honestly, serving in Python is often painful from an infrastructure standpoint…

  20. Using Memory Wisely
    • Load the model before forking so that Copy on Write takes effect
    • We deliberately break the k8s "one process per container" rule of thumb

  21. Copy On Write Refresher
    [Diagram: Memory; Parent process; Child process; Page A]
    1. allocation
    2. fork
    Both processes refer to the same region

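  22. When a Process Modifies the Memory Contents…
    [Diagram: Memory; Parent process; Child process; Page A; Page B]
    The OS allocates a separate region and copies the original data into it
    The process then refers to the separate region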
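A minimal, POSIX-only sketch of the "load before fork" idea is shown below. It is not the talk's implementation: `load_model`, `predict`, and `run_workers` are hypothetical stand-ins, and a real server would keep its workers alive rather than handle one input each. One caveat worth stating: CPython updates reference counts inside objects, so even read-only access writes to some shared pages, and sharing is only partial. `gc.freeze()` (Python 3.7+) reduces the writes caused by the garbage collector.

```python
import gc
import os


def load_model():
    """Hypothetical stand-in for an expensive model load."""
    return {"weights": list(range(1000))}


def predict(model, x):
    """Toy prediction that only reads from the model."""
    weights = model["weights"]
    return weights[x % len(weights)] * 2


def run_workers(inputs):
    """Load the model once, then fork one child per input and collect the results."""
    model = load_model()  # loaded before fork, so children share these pages
    gc.freeze()  # move existing objects to a permanent generation the GC ignores
    children = []
    for x in inputs:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:  # child process: reads the inherited model, never reloads it
            os.close(read_fd)
            try:
                os.write(write_fd, str(predict(model, x)).encode())
            finally:
                os._exit(0)  # skip the parent's cleanup handlers
        os.close(write_fd)
        children.append((pid, read_fd))
    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            results.append(int(pipe.read()))
        os.waitpid(pid, 0)
    return results
```

Running several worker processes inside one container in this way is what breaks the "one process per container" rule mentioned earlier.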

  23. Current Issues

  24. Advanced, Continuous Maintenance Is Required
    • ML features must keep being adapted as data trends change and unexpected problems occur
    ML features continue to incur large costs even after release

  25. Extensive automation is essential

  26. In Progress

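  27. Advanced Automation
    • Turn the implementations that do Feature Extraction from in-house data into components
    • Automate, to some extent, the building of models that solve specific problems
    • Automate post-release Re-Training, Hyper parameter optimization, Deploy, etc.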

  28. AutoFlow
    [Diagram: Feature Extraction Components; Classification Components; Concatenation Components; Model Builder Components; Registry]
    Performs semi-automatic model building and automatic hyperparameter tuning on the cluster
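The slide does not say which tuning method AutoFlow uses. As a generic illustration of automatic hyperparameter tuning, here is an exhaustive grid search; the `grid_search` name, the `train_and_score` callback, and the parameter names in the usage example are all hypothetical.

```python
import itertools


def grid_search(train_and_score, grid):
    """Try every parameter combination and return the best (params, score) pair."""
    best_params, best_score = None, float("-inf")
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = train_and_score(params)  # higher is better
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy objective standing in for "train a model and return its validation score".
def toy_objective(params):
    return -((params["lr"] - 0.1) ** 2) - (params["depth"] - 4) ** 2


best_params, best_score = grid_search(
    toy_objective, {"lr": [0.01, 0.1, 1.0], "depth": [2, 4, 8]}
)
```

On a cluster, each `train_and_score` call could run as a separate job, since the combinations are independent of one another.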

  29. AutoServing
    [Diagram: Deploy; Monitoring; Evaluation; Hyper parameter optimization; Re-Training]
    Automatically performs post-release accuracy monitoring, Re-Training, Re-Deploy, etc.
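One way to picture the monitor, retrain, redeploy cycle is the sketch below. The trigger rule (mean of the last few accuracy readings falling below a baseline minus a tolerance) and every name in it are assumptions made for illustration; the slide does not describe AutoServing's actual criteria.

```python
def needs_retraining(accuracies, baseline, tolerance=0.05, window=3):
    """True when the mean of the last `window` readings drops below baseline - tolerance."""
    if len(accuracies) < window:
        return False
    recent = accuracies[-window:]
    return sum(recent) / window < baseline - tolerance


def auto_serving_step(accuracies, baseline, retrain, evaluate, deploy):
    """One monitoring cycle: retrain on degradation, redeploy only if the candidate is better."""
    if not needs_retraining(accuracies, baseline):
        return "keep"
    candidate = retrain()
    if evaluate(candidate) > accuracies[-1]:
        deploy(candidate)
        return "redeployed"
    return "rejected"
```

The `retrain`, `evaluate`, and `deploy` callbacks are where hyperparameter optimization and the deployment tooling would plug in.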

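  30. Summary
    • ML requires infrastructure that differs somewhat from the usual
      → best practices are not yet known
    • Running ML features in production in earnest does not go well without extensive automation and systematization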

  31. Thank you for your attention!!