The ML Platform Supporting Mercari's Marketplace Health Initiatives
Hirofumi Nakagawa/中河 宏文
May 23, 2018
Transcript
The ML Platform Supporting Mercari's Marketplace Health Initiatives
Mercari ML Ops Night Vol.1, hnakagawa
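Self-introduction
• Hirofumi Nakagawa (hnakagawa)
• Joined the company in July 2017
• Member of the SRE team
• A jack-of-all-trades who has done everything from device driver development to front-end development
• NOT an ML engineer
• https://github.com/hnakagawa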
My work
• Developing the ML Platform
• Bridging the skill gap between ML engineers and SREs
• ML Reliability, SysML?, MLOps?
• Automating ML systems from an SRE standpoint
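ML Platform
• An in-house ML platform
• Kubernetes-based
• Abstracts away the differences between the local environment and the cluster environment
• A set of convenience APIs
• Builds on existing ML frameworks to provide an environment where Training/Serving is easy
• Planned to be released as OSS at some point (probably)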
Case study: real-time monitoring system
• Nicknamed Lovemachine
• Implemented on top of the ML Platform
[Diagram labels: ML Platform training cluster (Lovemachine), GCS, GKE, PubSub, ML Platform serving cluster (Lovemachine)]
Model Training & Serving Workflow
Workflow for Production
[Diagram labels: CI, ML Platform training cluster (Job, Job, ...), Model Registry, ML Platform serving cluster for test (REST API, Streaming, TF Serving, ...)]
Training Workflow
[Diagram labels: CI, ML Platform training cluster (Job, Job, ...), Model Registry]
1. A push to GitHub triggers training
2. The trained model is uploaded to the Model Registry
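The two training steps above can be sketched in a few lines. This is a minimal illustration, not the platform's actual code: the "registry" is assumed to be a plain directory keyed by commit SHA, and `train`, `publish`, and `on_push` are hypothetical names introduced only for this example.

```python
import hashlib
import json
import pickle
from pathlib import Path


def train(samples):
    """Toy 'training': learn the mean label (placeholder for a real model)."""
    labels = [label for _, label in samples]
    return {"bias": sum(labels) / len(labels)}


def publish(model, registry_dir, commit_sha):
    """Write the model and its metadata into a per-commit registry directory."""
    target = Path(registry_dir) / commit_sha
    target.mkdir(parents=True, exist_ok=False)  # never overwrite a published version
    blob = pickle.dumps(model)
    (target / "model.pkl").write_bytes(blob)
    meta = {"commit": commit_sha, "sha256": hashlib.sha256(blob).hexdigest()}
    (target / "meta.json").write_text(json.dumps(meta))
    return target


def on_push(commit_sha, samples, registry_dir):
    """What a CI job triggered by a push might do: train, then publish."""
    return publish(train(samples), registry_dir, commit_sha)
```

Keying each version by commit SHA is one possible design choice; it keeps every published model traceable to the code that produced it.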
Serving Workflow
[Diagram labels: Model Registry, ML Platform serving cluster for test (REST API, Streaming, TF Serving, ...)]
1. The Model Registry is watched and new models are served automatically
2. When Serving & Test succeed, a k8s manifest for production is generated
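A rough sketch of the two serving steps, under stated assumptions: the registry is again a directory with one sub-directory per model version, the test step is an injected `smoke_test` callable, and the manifest is a bare-bones Kubernetes Deployment built as a dict. The function names and the image string are hypothetical; the real platform's watcher and manifest template are not shown in the slides.

```python
from pathlib import Path


def new_versions(registry_dir, already_served):
    """Return registry versions (sub-directories) that have not been served yet."""
    found = sorted(p.name for p in Path(registry_dir).iterdir() if p.is_dir())
    return [v for v in found if v not in already_served]


def render_manifest(model_name, version, image):
    """Build a minimal Kubernetes Deployment manifest as a plain dict."""
    labels = {"app": model_name, "model-version": version}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"{model_name}-{version}", "labels": labels},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": "serving", "image": image}]},
            },
        },
    }


def promote(registry_dir, already_served, smoke_test, model_name, image):
    """Serve-and-test each new version; emit a manifest only when the test passes."""
    manifests = []
    for version in new_versions(registry_dir, already_served):
        already_served.add(version)
        if smoke_test(version):
            manifests.append(render_manifest(model_name, version, image))
    return manifests
```

Calling `promote` periodically gives a simple polling watcher; versions whose test fails are remembered and never reach the production manifest.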
Example architecture of the Model Serving API
[Diagram labels: Mercari API, REST, Flask (SK Model x3), gRPC, TensorFlow Serving (TF Model x2)]
Flask handles preprocessing and forwards requests to the TensorFlow Serving instance behind it
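The split between a preprocessing front end and a model server behind it can be illustrated without any web framework. In this sketch plain functions stand in for both pieces: `handle` plays the role of the Flask route, and the injected `predict` callable stands in for the gRPC call to TensorFlow Serving. The bag-of-words preprocessing and all names are assumptions made for the example, not the deck's actual code.

```python
import re
from typing import Callable, Dict, List

TOKEN = re.compile(r"[a-z0-9]+")


def preprocess(text: str, vocabulary: Dict[str, int]) -> List[float]:
    """Turn raw text into a fixed-length bag-of-words count vector."""
    vector = [0.0] * len(vocabulary)
    for token in TOKEN.findall(text.lower()):
        index = vocabulary.get(token)
        if index is not None:
            vector[index] += 1.0
    return vector


def make_handler(vocabulary: Dict[str, int], predict: Callable[[List[float]], float]):
    """Build the request handler; `predict` stands in for the call to the model server."""
    def handle(payload: dict) -> dict:
        if "text" not in payload:
            return {"status": 400, "error": "missing 'text'"}
        score = predict(preprocess(payload["text"], vocabulary))
        return {"status": 200, "score": score}
    return handle
```

Injecting `predict` keeps the preprocessing testable on its own and lets the backend be swapped without touching the handler.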
Example architecture of the Model Serving API, streaming version
[Diagram labels: PubSub, ML Platform Framework or Apache Beam (SK Model x3), gRPC, TensorFlow Serving (TF Model x2)]
TensorFlow Serving
• The serving environment provided by the TensorFlow project
• Can serve TF models without going through the Python runtime
• The standard implementation exposes its API over gRPC
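Models and container images
• Whether or not to include a huge ML model in the container image
• If it is not included, where should it be placed?
• A trade-off between portability and load time
• If you have a good idea, please let me know…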
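How this differs from an ordinary API
• The resources handled, in particular the model size, are often large (hundreds of MB to several GB)
• CPU and memory consumption is heavy
• In some cases GPUs are used as well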
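Memory consumption
• The Python implementation of Lovemachine consumes 2 GB of memory at runtime → this is expected to grow further
• The TF-IDF preprocessing written with scikit-learn often becomes large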
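Python and parallelism
• Naturally, threads cannot be used (because of the GIL)
• Loading the model in every process increases the memory required → this becomes an obstacle to Blue-Green Deploy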
Honestly, serving with Python is often painful from an infrastructure point of view…
Using memory wisely
• Load the model before fork so that Copy on Write takes effect
• This deliberately breaks the k8s one-process-per-container convention
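The load-before-fork idea can be sketched with the standard library alone. This example assumes a POSIX system (`os.fork` is unavailable on Windows); `load_model` is a hypothetical stand-in for an expensive model load, and the workers run one after another only to keep the sketch short.

```python
import os
import pickle


def load_model():
    """Stand-in for an expensive model load; returns a large read-only structure."""
    return {"weights": list(range(1_000_000))}


def serve_with_workers(num_workers, requests):
    """Load once in the parent, then fork workers that share the loaded pages."""
    model = load_model()  # loaded BEFORE fork, so children inherit it copy-on-write
    results = []
    for worker_id in range(num_workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:  # child: use the inherited model without reloading it
            try:
                os.close(read_fd)
                answer = [model["weights"][i] for i in requests]
                with os.fdopen(write_fd, "wb") as out:
                    pickle.dump((worker_id, answer), out)
            finally:
                os._exit(0)  # never fall through into the parent's code path
        os.close(write_fd)  # parent: read the child's answer, then reap it
        with os.fdopen(read_fd, "rb") as inp:
            results.append(pickle.load(inp))
        os.waitpid(pid, 0)
    return results
```

One caveat worth keeping in mind: CPython updates reference counts inside object headers, so pages holding objects that a worker touches may still be copied, and the sharing is usually partial rather than complete.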
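Copy On Write refresher
[Diagram: 1. the parent process allocates Page A in memory; 2. after fork, the parent and child processes reference the same region]
When a process rewrites the contents of memory…
[Diagram: the OS allocates a separate region (Page B) and copies the original data into it; the writing process then references the separate region]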
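Current Issues
• Because we are dealing with human behavior, data trends shift easily and unexpected problems occur, so we have to keep responding
→ The burden on ML model authors never ends
→ As SREs, we want to solve this with mechanisms that include automation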
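In Progress
• Turning the implementation that builds embeddings from our own data into components
• Automating, to some extent, the construction of models that solve specific problems
→ Something like a dedicated AutoML specialized for solving our own problems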
AutoFlow (tentative name)
[Diagram labels: Feature Extraction Components, Classification Components, Concatenation Components, Model Builder Components, Registry]
Automatically builds models and tunes hyperparameters on the cluster
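The slides do not show how AutoFlow works internally, so the following is only a toy illustration of the general idea: feature-extraction components whose outputs are concatenated, a classification component, and an exhaustive grid search over hyperparameters. Every name here is hypothetical, and for brevity the score is computed on the same data used for selection, with no held-out split.

```python
from itertools import product


def build_pipeline(extractors, classifier):
    """Concatenate the outputs of several feature extractors, then classify."""
    def predict(item):
        features = []
        for extract in extractors:
            features.extend(extract(item))
        return classifier(features)
    return predict


def accuracy(predict, dataset):
    """Fraction of (item, label) pairs the pipeline labels correctly."""
    hits = sum(1 for item, label in dataset if predict(item) == label)
    return hits / len(dataset)


def grid_search(make_classifier, extractors, grid, dataset):
    """Try every hyperparameter combination and keep the best-scoring one."""
    best_params, best_score = None, -1.0
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        pipeline = build_pipeline(extractors, make_classifier(**params))
        score = accuracy(pipeline, dataset)
        if score > best_score:  # ties keep the earliest combination
            best_params, best_score = params, score
    return best_params, best_score
```

On a cluster, each hyperparameter combination could run as its own job; the loop above is the sequential version of that search.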
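Summary
• ML needs infrastructure that is a little different from the usual → best practices are not yet known
• To begin with, running ML features in production in earnest does not go well without substantial automation and systematization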
Thank you for your attention!!
We are Hiring!!
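SRE ML Reliability
• SysML? MLOps? A new job description
• SRE skills + basic ML knowledge
• People who will push forward the automation and systematization of ML infrastructure
• Of course, we are also actively hiring for other roles!!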
Details here: https://careers.mercari.com/