Build Image Classification service with Amazon ECS and GPU instances


Yuichiro Someya

November 22, 2016

Transcript

  1. 2.

    echo `whoami`

    • Yuichiro Someya • Master's degree, Department of Computer Science, Graduate School of Tokyo Institute of Technology • Joined Cookpad as a new graduate in 2016 • github.com/ayemos • twitter.com/kumasan_com
  2. 7.

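    CookpadNet

    • A model fine-tuned from CaffeNet to classify photos as food or non-food • The model trained with Caffe [1] is loaded through Chainer's Caffe emulator
 ref: http://docs.chainer.org/en/stable/reference/caffe.html

    • The classification categories were changed to food / non-food, and the model was trained on food photos posted to Cookpad

    [1] http://caffe.berkeleyvision.org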
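The loading step on this slide could look roughly like the following sketch. It assumes Chainer is installed and exposes `chainer.links.caffe.CaffeFunction` as described on the reference page above. The layer name `fc8`, the class order (`food_index`), and the threshold are hypothetical, since the slides do not state them. Depending on the Chainer version, inference mode may also have to be selected separately.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def is_food(scores, food_index=0, threshold=0.5):
    """Return True when the food class probability reaches the threshold.

    food_index and threshold are hypothetical values for illustration.
    """
    return softmax(scores)[food_index] >= threshold


def classify(model_path, image_batch, output_layer="fc8"):
    """Run a Caffe-trained model through Chainer's Caffe emulator.

    Assumes image_batch is a float32 array already preprocessed to the
    network's input shape, and that the final layer is named "fc8"
    (a fine-tuned model may well use a different name).
    """
    from chainer.links.caffe import CaffeFunction  # imported lazily

    func = CaffeFunction(model_path)
    (y,) = func(inputs={"data": image_batch}, outputs=[output_layer])
    return [is_food(row.tolist()) for row in y.data]
```

Only `classify` needs Chainer; `softmax` and `is_food` are plain Python and can be checked on their own.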
  3. 19.

    Client (Android, iOS) • API Server (ruby) • Classification Worker (python, chainer) • Amazon S3 (Storage) • Amazon SQS (Queue) • DB

    [Upload photo to classify] • enqueue {key_on_s3: string}
  4. 20.

    Client (Android, iOS) • API Server (ruby) • Classification Worker (python, chainer) • Amazon S3 (Storage) • Amazon SQS (Queue) • DB

    [Upload photo to classify] • enqueue {key_on_s3: string} • dequeue {key_on_s3: string}
  5. 21.

    Client (Android, iOS) • API Server (ruby) • Classification Worker (python, chainer) • Amazon S3 (Storage) • Amazon SQS (Queue) • DB

    [Upload photo to classify] • enqueue {key_on_s3: string} • dequeue {key_on_s3: string} • [Download Image]
  6. 22.

    Client (Android, iOS) • API Server (ruby) • Classification Worker (python, chainer) • Amazon S3 (Storage) • Amazon SQS (Queue) • DB

    [Upload photo to classify] • enqueue {key_on_s3: string} • dequeue {key_on_s3: string} • [Download Image] • POST result {key_on_s3: string, result: {is_food: bool}}
  7. 23.

    Client (Android, iOS) • API Server (ruby) • Classification Worker (python, chainer) • Amazon S3 (Storage) • Amazon SQS (Queue) • DB

    [Upload photo to classify] • POST is_photo {key_on_s3: string} • enqueue {key_on_s3: string} • dequeue {key_on_s3: string} • [Download Image] • POST result {key_on_s3: string, result: {is_food: bool}}
  8. 24.

    Client (Android, iOS) • API Server (ruby) • Classification Worker (python, chainer) • Amazon S3 (Storage) • Amazon SQS (Queue) • DB

    [Upload photo to classify] • POST is_photo {key_on_s3: string} • enqueue {key_on_s3: string} • dequeue {key_on_s3: string} • [Download Image] • POST result {key_on_s3: string, result: {is_food: bool}} • result: {is_food: bool}
  9. 25.

    Client (Android, iOS) • API Server (ruby) • Classification Worker (python, chainer) • Amazon S3 (Storage) • Amazon SQS (Queue)

    [Upload photo to classify] • POST is_photo {key_on_s3: string} • enqueue {key_on_s3: string} • dequeue {key_on_s3: string} • [Download Image] • POST result {key_on_s3: string, result: {is_food: bool}} • result: {is_food: bool}

    Classification is processed asynchronously
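The asynchronous flow in this diagram can be sketched end to end with in-memory stand-ins. This is only an illustration under stated assumptions: a `queue.Queue` plays the role of Amazon SQS, a dict plays Amazon S3, another dict plays the DB, and the `classify` placeholder replaces the Chainer model. All function names are hypothetical; only the message shapes (`key_on_s3`, `result`, `is_food`) come from the slide.

```python
import json
import queue

storage = {}               # stand-in for Amazon S3: key -> image bytes
job_queue = queue.Queue()  # stand-in for Amazon SQS
results = {}               # stand-in for the API server's DB


def api_enqueue(key_on_s3):
    """API server side: enqueue a classification job and return at once."""
    job_queue.put(json.dumps({"key_on_s3": key_on_s3}))


def api_post_result(payload):
    """API server side: store the result that the worker posts back."""
    results[payload["key_on_s3"]] = payload["result"]["is_food"]


def classify(image_bytes):
    """Hypothetical placeholder for the real model."""
    return image_bytes.startswith(b"food")


def worker_step():
    """Worker side: dequeue, download the image, classify, post the result."""
    message = json.loads(job_queue.get())
    key = message["key_on_s3"]
    image = storage[key]  # corresponds to [Download Image]
    api_post_result({"key_on_s3": key, "result": {"is_food": classify(image)}})
    job_queue.task_done()


# The client uploads a photo, the API server enqueues, the worker handles it.
storage["photos/1.jpg"] = b"food-bytes"
api_enqueue("photos/1.jpg")
worker_step()
print(results)
```

Because the API server only enqueues a key and returns, slow GPU inference never blocks the request that triggered it; the result arrives later through the worker's POST.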
  10. 27.

    ECS, GPU, Docker, and…

    • ECS: Amazon EC2 Container Service • Places Docker containers (Tasks) on a cluster made up of EC2 instances • github.com/eagletmt/hako • Manages the ECS configuration in YAML files
  11. 28.

    # cookpadnet-worker.yml
    scheduler:
      type: ecs
      region: ap-northeast-1
      cluster: hako-production-g2
      desired_count: 1
    app:
      image: cookpadnet-worker-gpu
      cpu: 128
      memory: 3072
      memory_reservation: 2048
      env:
        AWS_REGION: ap-northeast-1
        COOKPADNET_ENV: production
        ...

    AWS VPC • Docker Registry • Developer • docker push • hako deploy • ECS • docker pull • Task • cookpadnet-worker
  12. 29.

    # cookpadnet-worker.yml
    scheduler:
      type: ecs
      region: ap-northeast-1
      cluster: hako-production-g2
      desired_count: 1
    app:
      image: cookpadnet-worker-gpu
      cpu: 128
      memory: 3072
      memory_reservation: 2048
      env:
        AWS_REGION: ap-northeast-1
        COOKPADNET_ENV: production
        ...

    AWS VPC • Docker Registry • Developer • docker push • hako deploy • ECS • docker pull • Task • cookpadnet-worker

    The Dockerized worker is deployed, and its configuration managed, with hako
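To make the resource numbers in the YAML concrete, here is a small sketch that interprets them the way ECS container definitions do, assuming hako passes `cpu`, `memory`, and `memory_reservation` through to the ECS `cpu`, `memory`, and `memoryReservation` fields. The `summarize` helper is hypothetical and exists only for this illustration.

```python
CPU_UNITS_PER_VCPU = 1024  # ECS counts 1024 CPU units per vCPU

# Values copied from the app section of cookpadnet-worker.yml.
app = {"cpu": 128, "memory": 3072, "memory_reservation": 2048}


def summarize(app_config):
    """Interpret the settings as ECS does: memory is the hard limit in MiB,
    memory_reservation is the soft limit, and the hard limit must be larger.
    """
    hard = app_config["memory"]
    soft = app_config["memory_reservation"]
    if soft >= hard:
        raise ValueError("memory must be greater than memory_reservation")
    return {
        "vcpu_share": app_config["cpu"] / CPU_UNITS_PER_VCPU,
        "hard_limit_mib": hard,
        "soft_limit_mib": soft,
        "burst_headroom_mib": hard - soft,
    }


summary = summarize(app)
print(summary)
```

Under these assumptions the worker reserves 2048 MiB on the instance for scheduling purposes and may grow to 3072 MiB before the container is killed.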
  13. 31.

    Virtualization vs. kernel

    • A driver is required • The nvidia-driver kernel module • User-level drivers of the same version • To operate GPU devices from a Docker container, the container needs appropriate Linux capability settings
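The "same version" requirement can be checked mechanically. The sketch below assumes the kernel module reports its version in `/proc/driver/nvidia/version` and that the user-level library inside the container is a file named like `libcuda.so.<version>`. The sample text and the version numbers are made up for illustration, and the helper names are hypothetical.

```python
import re


def kernel_module_version(proc_text):
    """Extract the version from /proc/driver/nvidia/version style text."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", proc_text)
    return match.group(1) if match else None


def library_version(filename):
    """Extract the version suffix from a name such as libcuda.so.<version>."""
    match = re.fullmatch(r"libcuda\.so\.(\d+(?:\.\d+)+)", filename)
    return match.group(1) if match else None


def drivers_match(proc_text, filename):
    """True when the host kernel module and the user-level library agree."""
    kernel = kernel_module_version(proc_text)
    return kernel is not None and kernel == library_version(filename)


# Illustrative sample only; real hosts will report their own version.
sample = "NVRM version: NVIDIA UNIX x86_64 Kernel Module  367.57  Mon Oct  3 20:37:01 PDT 2016"
print(drivers_match(sample, "libcuda.so.367.57"))
```

A check like this could run when the container starts, so that a mismatch between the host's module and the image's libraries fails early instead of surfacing as an obscure CUDA error.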
  14. 40.

    Virtualization vs. kernel

    • A driver is required • The nvidia-driver kernel module • User-level drivers of the same version • To operate GPU devices from a Docker container, the container needs appropriate Linux capability settings
  15. 45.

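    Virtualization vs. kernel

    docker run --privileged gpu-worker • Opens up every capability • You hold root inside a container on a dockerd that itself runs as root, so all sorts of things become possible

    docker run --privileged alpine:latest date -s • GPU devices exist as special files • Accessing them requires specific capability settings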
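Since the GPU shows up as special files, one narrower alternative to `--privileged` is to pass only those device nodes with Docker's `--device` flag. The sketch below is an assumption about how that could be done, not necessarily the approach the talk settles on. The candidate paths are the device nodes the NVIDIA driver typically creates, and the image name `gpu-worker` is taken from the slide.

```python
import os
import stat

# Device nodes the NVIDIA driver typically creates; the exact set on a
# given host is an assumption here.
CANDIDATES = ["/dev/nvidiactl", "/dev/nvidia-uvm", "/dev/nvidia0"]


def is_char_device(path):
    """True when path exists and is a character special file."""
    try:
        return stat.S_ISCHR(os.stat(path).st_mode)
    except OSError:
        return False


def docker_run_args(image, device_paths):
    """Build a docker run command that exposes only the given devices."""
    args = ["docker", "run"]
    for path in device_paths:
        args += ["--device", path]
    return args + [image]


# Keep only the device files that actually exist on this host.
present = [p for p in CANDIDATES if is_char_device(p)]
print(" ".join(docker_run_args("gpu-worker", present)))
```

On a host without an NVIDIA driver the list is empty and the command degrades to a plain `docker run gpu-worker`, which makes the missing devices easy to spot.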