mlct.pdf
Hirofumi Nakagawa/中河 宏文
July 23, 2018
Programming
Transcript
Mercari's ML Platform
MLCT vol.5
hnakagawa
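Self-introduction
• Hirofumi Nakagawa (hnakagawa)
• Joined the company in July 2017
• Member of the SRE team
• A generalist who does everything from device-driver development to front-end development
• NOT a data scientist
• https://github.com/hnakagawa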
My work
• ML Platform development
• Bridging the skill gap between data scientists and SREs
• ML Reliability, SysML?, MLOps?
• Automating ML systems from an SRE standpoint
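ML Platform
• An in-house ML platform
• Kubernetes-based
• Builds on existing ML frameworks to provide an environment where training and serving are easy to run
• Planned for release as OSS at some point (probably)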
ML use cases at Mercari
• Kando listing (感動出品)
• Detection of prohibited listings
• Price suggestion
• Weight suggestion
• and more…
Running on the order of tens of millions of predictions per day
ML Platform Architecture
[Diagram labels: Kubernetes, Controller, CLI, Cluster Workflow, Dashboard, Storage Gateway, Metrics, Runner Component, Mercari ML Component, External Middleware]
Model Training & Serving Workflow
Workflow for Production
[Diagram labels: CI; ML Platform training cluster (Job, Job, …); Model Registry; ML Platform serving cluster for test (REST API, Streaming, TF Serving, …)]
Training Workflow
[Diagram labels: CI; ML Platform training cluster (Job, Job, …); Model Registry]
1. A push to GitHub triggers training.
2. The trained model is uploaded to the Model Registry.
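The two steps above can be sketched as a function that turns a GitHub push event into a Kubernetes Job manifest. This is only an illustration under stated assumptions, not the platform's actual implementation: the trainer image, the `MODEL_REGISTRY_URL` variable, and the job naming scheme are hypothetical. Only the `after` and `repository.full_name` fields of the push payload and the `batch/v1` Job layout are standard.

```python
def training_job_from_push(event,
                           image="example.invalid/trainer:latest",
                           registry_url="http://model-registry.example.invalid"):
    """Build a Kubernetes Job manifest (as a dict) for one GitHub push event."""
    sha = event["after"]                      # commit SHA the branch points to after the push
    repo = event["repository"]["full_name"]   # "owner/name"
    short = sha[:7].lower()                   # Job names must be lowercase DNS labels
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": "train-" + short, "labels": {"commit": short}},
        "spec": {
            "backoffLimit": 0,                # do not retry a failed training run
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "trainer",
                        "image": image,       # hypothetical trainer image
                        "env": [
                            {"name": "GIT_REPO", "value": repo},
                            {"name": "GIT_SHA", "value": sha},
                            # Hypothetical: where the job uploads the trained model.
                            {"name": "MODEL_REGISTRY_URL", "value": registry_url},
                        ],
                    }],
                }
            },
        },
    }
```

A CI step could serialize the returned dict to JSON and submit it to the training cluster. Tying the job name to the commit keeps each trained model traceable to the code that produced it.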
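Serving Workflow
[Diagram labels: Model Registry; ML Platform serving cluster for test (REST API, Streaming, TF Serving, …)]
1. The platform watches the Model Registry and serves new models automatically.
2. When serving and testing succeed, it outputs a Kubernetes manifest for production.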
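One polling pass of such a watcher might look like the sketch below. The slides do not describe the Model Registry's API, so `list_versions` and `serve_and_test` are hypothetical hooks, and the Deployment is a bare-minimum illustration (the public `tensorflow/serving` image and its `MODEL_NAME` variable are used as an example).

```python
import json

def render_deployment(model_name, version, image="tensorflow/serving"):
    """Return a minimal Kubernetes Deployment manifest (dict) for one model version."""
    app = f"{model_name}-v{version}"          # must be a valid lowercase DNS label
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": app},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": app}},
            "template": {
                "metadata": {"labels": {"app": app}},
                "spec": {"containers": [{
                    "name": "serving",
                    "image": image,
                    "env": [{"name": "MODEL_NAME", "value": model_name}],
                }]},
            },
        },
    }

def process_registry(list_versions, serve_and_test, seen):
    """Serve every unseen model version on the test cluster; emit a manifest on success."""
    manifests = []
    for model_name, version in list_versions():       # hypothetical registry listing
        if (model_name, version) in seen:
            continue
        seen.add((model_name, version))
        if serve_and_test(model_name, version):       # hypothetical: deploy to test cluster, run tests
            manifests.append(json.dumps(render_deployment(model_name, version)))
    return manifests
```

Calling `process_registry` on a timer with a persistent `seen` set gives the "watch, serve, test, emit manifest" loop. Versions that fail the test are recorded as seen and never produce a production manifest.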
Container Workflow
[Diagram labels: DataSource, Image, Text, Preprocessing Image, Estimator Image, PV, PV, Picture, Preprocessing Image, PV]
Its own implementation
Example Model Serving API configuration
[Diagram labels: Mercari API, REST, Flask, SK Model ×3, gRPC, TensorFlow Serving, TF Model ×2]
Flask performs the preprocessing and passes requests on to the TensorFlow Serving instance behind it.
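The "preprocess, then forward" pattern can be sketched with the standard library alone. Note the substitutions: the slide uses Flask and labels a gRPC link, whereas this sketch uses plain functions and TensorFlow Serving's REST predict endpoint to stay dependency-free. The `preprocess` features, the model name, and the host are hypothetical.

```python
import json
import urllib.request

def preprocess(item):
    """Hypothetical preprocessing: turn a raw listing into a numeric feature vector."""
    title = item.get("title", "")
    return [float(len(title)), float(item.get("price", 0))]

def build_predict_request(items, model_name, host="http://localhost:8501"):
    """Build the URL and JSON body for TensorFlow Serving's REST predict endpoint."""
    url = f"{host}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": [preprocess(i) for i in items]}).encode("utf-8")
    return url, body

def predict(items, model_name, host="http://localhost:8501", timeout=1.0):
    """Preprocess the items, forward them to TensorFlow Serving, return its predictions."""
    url, body = build_predict_request(items, model_name, host)
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["predictions"]
```

In a Flask application, `predict` would be called from a route handler. Keeping the preprocessing in the front process is what makes the Python tier memory-hungry, which the later slides discuss.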
Example Model Serving API configuration (streaming version)
[Diagram labels: PubSub, ML Platform Framework or Apache Beam, SK Model ×3, gRPC, TensorFlow Serving, TF Model ×2]
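Models and container images
• Should a huge ML model be included in the container image or not?
• If not, where should it be placed?
• It is a trade-off between portability and load time
• If you have a good idea, please let me know…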
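Characteristics differ from an ordinary API
• The resources handled and the model size are often large (hundreds of MB to several GB)
• CPU and memory consumption is heavy
• In some cases GPUs are used as well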
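Memory consumption
• The Python part of the violation-detection system consumes 2 GB of memory at runtime → this is expected to grow further
• Preprocessing written in scikit-learn tends to become large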
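Python and parallelism
• Threads are, of course, not an option (because of the GIL)
• Loading the model in every process increases the memory required → this gets in the way of blue-green deployment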
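A back-of-the-envelope calculation shows why per-process loading hurts. The 2 GB figure is the one quoted for the violation-detection system; the worker and pod counts are hypothetical.

```python
def peak_memory_gb(model_gb, workers_per_pod, pods, blue_green=True):
    """Peak memory when every worker process holds its own copy of the model."""
    steady = model_gb * workers_per_pod * pods
    # During a blue-green switch the old and new environments run side by side.
    return steady * 2 if blue_green else steady

# Hypothetical sizing: 2 GB model, 4 workers per pod, 3 pods.
print(peak_memory_gb(2, 4, 3, blue_green=False))  # 24
print(peak_memory_gb(2, 4, 3))                    # 48
```

The doubling during the switch is what turns a large but tolerable footprint into a deployment obstacle.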
Honestly, serving from Python is painful in many ways on the infrastructure side…
Using memory wisely
• Load the model before forking so that copy-on-write takes effect
• This deliberately breaks the Kubernetes "one process per container" rule of thumb
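Review of copy-on-write
[Diagram labels: memory, process, child process, Page A; 1. allocation; 2. fork; both processes reference the same region]
When a process rewrites the contents of memory…
[Diagram labels: memory, process, child process, Page A, Page B]
The OS allocates a separate region and copies the original data into it; the writing process then references the separate region.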
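The load-before-fork idea can be sketched with `os.fork` on a POSIX system. This is a minimal illustration, not the platform's server: `load_model` and `predict` are hypothetical stand-ins, and a real deployment would more likely rely on a pre-forking server than on hand-written pipes.

```python
import os

def load_model():
    """Hypothetical stand-in for an expensive model load: a large read-only table."""
    return list(range(200_000))

def predict(model, x):
    return model[x % len(model)] * 2

def serve_with_prefork(inputs, n_workers=2):
    model = load_model()                  # loaded once in the parent, BEFORE fork
    children = []
    for worker_id in range(n_workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:                      # child: shares the parent's pages until it writes
            os.close(read_fd)
            results = [predict(model, x) for x in inputs[worker_id::n_workers]]
            os.write(write_fd, ",".join(map(str, results)).encode())
            os.close(write_fd)
            os._exit(0)                   # exit without running the parent's cleanup
        os.close(write_fd)
        children.append((pid, read_fd))
    collected = []
    for pid, read_fd in children:
        with os.fdopen(read_fd) as f:
            data = f.read()
        os.waitpid(pid, 0)
        collected.extend(int(v) for v in data.split(",") if v)
    return sorted(collected)

if __name__ == "__main__":
    print(serve_with_prefork([1, 2, 3, 4]))  # [2, 4, 6, 8]
```

One caveat specific to CPython: merely reading an object updates its reference count, which writes to the page holding the object header and triggers a copy. Sharing is therefore most effective for large contiguous buffers such as NumPy arrays, and calling `gc.freeze()` (Python 3.7+) before forking reduces copies caused by the garbage collector.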
Current Issues
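Advanced, continuous maintenance is required
• With ML features, the characteristics of the data change and unexpected problems occur, and these have to be dealt with on an ongoing basis
ML features keep incurring significant costs even after release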
Extensive automation is essential
In Progress
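Advanced automation
• Turn the implementations that perform feature extraction on the company's data into components
• Automate, to a certain degree, the building of models that solve specific problems
• Automate post-release re-training, hyperparameter optimization, and deployment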
AutoFlow
[Diagram labels: Feature Extraction Components, Classification Components, Concatenation Components, Model Builder Components, Registry]
Builds models automatically and tunes hyperparameters automatically on the cluster
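The slides do not say which search strategy AutoFlow uses, so the sketch below substitutes plain random search as the simplest stand-in for automatic hyperparameter tuning. `train_and_score` is a hypothetical hook that trains a model with the given parameters and returns a validation score (higher is better).

```python
import random

def random_search(train_and_score, space, n_trials=20, seed=0):
    """Try n_trials random configurations and return (best_params, best_score)."""
    rng = random.Random(seed)             # seeded so that runs are reproducible
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one value per hyperparameter from its list of candidates.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = train_and_score(params)   # hypothetical: train + validate one model
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

On a cluster, each trial would typically run as its own job. The loop body is independent per trial, so it parallelizes without changing the selection logic.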
AutoServing
[Diagram labels: Deploy, Monitoring, Evaluation, Hyper parameter optimization, Re-Training]
Automatically performs post-release accuracy monitoring, re-training, and re-deployment
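One pass of the monitor, re-train, re-deploy cycle could be sketched as follows. The accuracy threshold and the `evaluate`, `retrain`, and `deploy` hooks are hypothetical. The rule of deploying only when the candidate beats the current model is an assumption, not something the slide states.

```python
def auto_serving_step(current, evaluate, retrain, deploy, min_accuracy=0.9):
    """One monitoring pass: re-train and re-deploy when live accuracy is too low."""
    accuracy = evaluate(current)
    if accuracy >= min_accuracy:
        return current                     # still healthy, nothing to do
    candidate = retrain()                  # would include hyperparameter optimization
    if evaluate(candidate) > accuracy:     # assumed rule: replace only if better
        deploy(candidate)
        return candidate
    return current
```

Running this step on a schedule, with `current` carried over between calls, closes the loop drawn in the diagram.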
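Summary
• ML needs infrastructure that is somewhat different from the usual → best practices are not yet known
• To begin with, running ML features in earnest does not go well unless extensive automation and systematization are pushed forward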
Thank you for your attention!!