Build Image Classification service with Amazon ECS and GPU instances
Yuichiro Someya
November 22, 2016
Transcript
Build Image Classification service with AWS ECS and GPU instances
Yuichiro Someya @ Cookpad
echo `whoami`
• Yuichiro Someya
• Master's degree, Computer Science, Graduate School of the University of Tokyo
• Joined Cookpad as a new graduate in 2016
• github.com/ayemos
• twitter.com/kumasan_com
Agenda
We operate an automatic food-photo collection service using:
• a food image recognition model built on CaffeNet [1],
• a data flow built on Amazon SQS/S3, and
• Amazon ECS (GPU instances).
[1] https://github.com/BVLC/caffe
Cookpad
• Recipes: more than 2.5 million
• Monthly users: more than 60 million
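Ryouri Kiroku (cooking log)
• Automatically collects only the food photos from the photos on a smartphone
• Currently in limited release to a subset of users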
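CookpadNet
• A model obtained by fine-tuning CaffeNet for food / non-food classification
• The model is trained with Caffe [1] and loaded through Chainer's Caffe emulator (ref: http://docs.chainer.org/en/stable/reference/caffe.html)
• The output categories were changed to food / non-food, and the model was trained on food photos from Cookpad
[1] http://caffe.berkeleyvision.org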
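As a rough illustration of the food / non-food output, the final two-class decision can be sketched in plain Python. The class order (`FOOD_INDEX`), the 0.5 threshold, and the function names are assumptions made for this sketch and do not come from the talk; the real model runs in Chainer.

```python
import math

FOOD_INDEX = 0  # assumed class order: [food, non-food]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def is_food(scores, threshold=0.5):
    """Return True when the food probability exceeds the (assumed) threshold."""
    return softmax(scores)[FOOD_INDEX] > threshold
```

A caller would pass the two raw scores from the network's last layer, for example `is_food([2.0, 0.0])`.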
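Data flow / workflow
• Where should CookpadNet run the classification, and where and how should the result be delivered?
• Option: put the classification model on the client
  • The model is large (100 MB or more), so this is not realistic
  • (A smaller model is under research)
• Option: put the classification component outside the client
  • An HTTP server?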
Synchronous design (diagram)
• Client (Android / iOS) → API Server (ruby): POST /classify {photo: binary}
• API Server (ruby) → Classification Server (python, chainer): POST /classify {photo: binary}
• Classification Server → API Server → Client: result {is_food: bool}
• Image upload, image processing, and classification: 300~500 ms
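Data flow / workflow
• Image processing and model inference take a fair amount of time (300~500 ms)
• The API server cannot call the classifier synchronously (the Unicorn workers would be exhausted)
• Hence an asynchronous classification workflow built on Amazon S3 and SQS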
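A back-of-the-envelope calculation shows why blocking calls are a problem. The sketch below assumes each worker process handles one request at a time, as Unicorn workers do; the worker count of 8 is a hypothetical figure, not one given in the talk.

```python
def max_sync_requests_per_second(workers, latency_seconds):
    """Upper bound on throughput when every worker blocks for the full latency."""
    if workers <= 0 or latency_seconds <= 0:
        raise ValueError("workers and latency_seconds must be positive")
    return workers / latency_seconds


# Hypothetical pool of 8 workers, each blocked for 0.5 s per classification.
print(max_sync_requests_per_second(8, 0.5))
```

Under these assumptions the whole pool tops out at 16 classification requests per second, and while it is busy it cannot serve any other API traffic.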
Asynchronous classification (diagram)
Components: Client (Android / iOS), API Server (ruby), DB, Amazon S3 (Storage), Amazon SQS (Queue), Classification Worker (python, chainer)
• [Upload photo to classify] to Amazon S3
• enqueue {key_on_s3: string} to Amazon SQS
• Classification Worker: dequeue {key_on_s3: string}
• Classification Worker: [Download Image] from Amazon S3
• Classification Worker: POST /result {key_on_s3: string, result: {is_food: bool}}
• Client: POST /is_photo {key_on_s3: string}, answered with result {is_food: bool}
• Classification runs asynchronously
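The worker side of this flow can be sketched with in-memory stand-ins. Here a `dict` plays the role of Amazon S3, `queue.Queue` plays the role of Amazon SQS, and `classify` is a hypothetical placeholder rule; none of these names come from the talk, and a real worker would use the AWS SDK and the Chainer model.

```python
import queue


def classify(image_bytes):
    """Placeholder classifier (hypothetical rule standing in for the model)."""
    return image_bytes.startswith(b"food")


def run_worker(storage, jobs, results):
    """Drain the queue: fetch each image by key, classify it, record the result."""
    while True:
        try:
            message = jobs.get_nowait()
        except queue.Empty:
            break
        key = message["key_on_s3"]
        image = storage[key]  # stands in for the S3 download
        results[key] = {"is_food": classify(image)}  # stands in for POSTing the result


storage = {"photos/1.jpg": b"food-bytes", "photos/2.jpg": b"cat-bytes"}
jobs = queue.Queue()
for key in storage:
    jobs.put({"key_on_s3": key})  # same message shape as in the diagram

results = {}
run_worker(storage, jobs, results)
```

Because the queue decouples the two sides, the component that enqueues a message returns immediately, and the slow classification happens in the worker's own time.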
ECS, GPU, Docker, and …
• ECS: Amazon EC2 Container Service
• Docker containers are placed (as Tasks) on a cluster made up of EC2 instances
• github.com/eagletmt/hako
• The ECS configuration is managed in a YAML file

Deployment (diagram, inside the AWS VPC): the developer runs docker push to the Docker Registry and hako deploy against ECS; ECS runs docker pull and starts the Task cookpadnet-worker.

# cookpadnet-worker.yml
scheduler:
  type: ecs
  region: ap-northeast-1
  cluster: hako-production-g2
  desired_count: 1
app:
  image: cookpadnet-worker-gpu
  cpu: 128
  memory: 3072
  memory_reservation: 2048
  env:
    AWS_REGION: ap-northeast-1
    COOKPADNET_ENV: production
    ...

The Dockerized worker is deployed, and its configuration managed, with hako.
GPU
• The worker uses a GPU
• A …-fold performance difference compared with CPU instances in the same price range
• Docker × GPU
Virtualization vs. the kernel
• Drivers are required
  • the nvidia-driver kernel module
  • user-level drivers of the same version
• To operate GPU devices from a Docker container, the container needs appropriate Linux capability settings
(Diagram) ubuntu host, docker container, nvidia GPU, user-level driver, kernel modules. The driver path differs depending on the OS.
NVIDIA Docker
• A thin wrapper around the Docker CLI
• Automatically mounts the volumes needed at `docker run` time
• Not supported on Amazon ECS
(Diagram) ubuntu host, docker container, nvidia GPU: with the user-level driver and the kernel modules at the same version, the driver problem is mostly solved.
Virtualization vs. the kernel
• GPU devices exist as special files
• Accessing them requires specific capability settings
docker run --device /dev/nvidia…:/dev/nvidia… --device /dev/nvidia-uvm:/dev/nvidia-uvm gpu-worker
• ECS Task definitions do not support the device option
Virtualization vs. the kernel
docker run --privileged gpu-worker
• This opens up every capability
• The process is root inside a container that runs on a dockerd which itself runs as root, so it can do all sorts of things, for example:
docker run --privileged alpine:latest date -s …
• So the worker is run as a user other than root
• In the Dockerfile: `USER runner`
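As a complement to `USER runner`, a worker could also refuse to start when it finds itself running as root. This is only a sketch of one possible safeguard; the function names are hypothetical and the talk does not describe such a check.

```python
import os


def running_as_root(euid=None):
    """Return True when the effective user ID is 0 (root)."""
    if euid is None:
        euid = os.geteuid()  # available on POSIX systems only
    return euid == 0


def ensure_non_root(euid=None):
    """Raise instead of continuing when the process is root."""
    if running_as_root(euid):
        raise RuntimeError("refusing to run as root; set a non-root USER in the Dockerfile")
```

Calling `ensure_non_root()` at the top of the worker's entry point would turn a missing `USER` line into an immediate startup failure rather than a silently over-privileged process.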