Yusuke Ueno
April 24, 2019
MLOps on ABEJA Platform: LINE×ABEJA MLOps Study @FUKUOKA
Transcript
ML Ops on ABEJA Platform
Yusuke Ueno, Software Engineer at ABEJA
Today's topics
• What is ABEJA Platform?
• Experiment management in machine learning
• Experiment management on ABEJA Platform and how it is implemented
What is ABEJA Platform?
Copyright © 2019 ABEJA, Inc. All rights reserved.
A sense of scale
What is ML Ops?
My impression of DevOps:
• Process improvement between Development and Operations
• The cultural philosophy, practices, and tools that raise an organization's ability to deliver applications
ML Ops, as I will define it here
• Process improvement between ML Engineers and Development
• The cultural philosophy, practices, and tools that raise the ability to deliver models accurate enough to be applied to the business
Today's focus: the training part
Training is iterative work
• Writing and revising training code
• Trying different optimizers
• Tuning hyperparameters
• Revising the sampling method
• Using different library versions
• Changing the random seed
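Managing experiments is important
Unless the conditions and results of every single experiment are recorded, the experiment that produced good accuracy cannot be reproduced later. Experiments have to be recorded and kept browsable.
Record
• Dataset
• Code
• Parameters
• Runtime environment
• Experiment results (evaluation metrics)
• Weight parameters
• Logs
• Execution time
Browse
• Comparison of experiment results
• Display of detailed information
• Experiment conditions
• Experiment results
• Visualization (images, etc.)
• Sharing among team members
• Search over past experiments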
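The items to record can be pictured as one structured record per training run. The following is a minimal sketch, not the ABEJA Platform data model: the class name `ExperimentRecord`, its field names, and the sample values are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ExperimentRecord:
    """One training run: its conditions and its outputs (hypothetical schema)."""
    dataset: str                 # identifier of the dataset that was used
    code_version: str            # e.g. a commit hash of the training code
    parameters: dict             # hyperparameters, random seed, ...
    environment: str             # e.g. a Docker image tag
    metrics: dict = field(default_factory=dict)  # evaluation metrics
    weights_path: str = ""       # where the weight file was stored
    log_path: str = ""           # where the logs were stored
    elapsed_seconds: float = 0.0  # execution time


def best_run(records, metric):
    """Compare runs and return the one with the highest value of `metric`."""
    return max(records, key=lambda r: r.metrics[metric])


# Two illustrative runs that differ only in learning rate.
runs = [
    ExperimentRecord("cats-v1", "a1b2c3", {"lr": 0.1, "seed": 0}, "py36:1",
                     metrics={"accuracy": 0.81}),
    ExperimentRecord("cats-v1", "a1b2c3", {"lr": 0.01, "seed": 0}, "py36:1",
                     metrics={"accuracy": 0.88}),
]
best = best_run(runs, "accuracy")

# Serializing the record makes it easy to store, search, and share.
print(json.dumps(asdict(best), sort_keys=True))
```

Because the conditions are stored next to the results, the best run can be looked up later and rerun with the same dataset, code version, parameters, and environment.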
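Overall picture of experiment management
[Diagram: a training job takes a dataset, training code, parameters, and a runtime environment as inputs and produces evaluation results, weight files, logs, and execution time. Surrounding concerns: version management, visualization, comparison between training jobs, sharing among team members.]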
Dataset version management
Two components are provided:
• Datalake: object storage
• Datasets: references to Datalake objects, plus metadata
Dataset version management
• The Annotation Tool annotates data in the Datalake and outputs the result as Datasets
[Diagram: Datalake → Datasets]
Dataset version management
When data is added, it can be created as a separate dataset.
[Diagram: files and annotations grouped into successive versions of datasets]
Dataset version management
Tags split a dataset logically so that only specific elements are extracted.
[Diagram: a dataset whose items are labeled tag A or tag B]
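Logical splitting by tag can be sketched as a filter over dataset items. This is only an illustration under assumptions: `DatasetItem`, its fields, and the `datalake://` reference strings are hypothetical and do not describe the actual Datasets schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetItem:
    file_ref: str    # reference to an object in the data lake (made-up URI form)
    annotation: str  # annotation result attached to that object
    tag: str         # logical partition the item belongs to


def select_by_tag(items, tag):
    """Extract only the items carrying the given tag; the dataset is not copied or modified."""
    return [item for item in items if item.tag == tag]


dataset = [
    DatasetItem("datalake://images/001", "cat", "A"),
    DatasetItem("datalake://images/002", "dog", "A"),
    DatasetItem("datalake://images/003", "cat", "A"),
    DatasetItem("datalake://images/004", "dog", "B"),
    DatasetItem("datalake://images/005", "cat", "B"),
]

tag_a = select_by_tag(dataset, "A")
tag_b = select_by_tag(dataset, "B")
print(len(tag_a), len(tag_b))
```

One common use of such a split is keeping, for example, training and validation items in the same dataset version while still selecting them separately.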
Dataset visualization
The dataset itself can be inspected.
Runtime environment
The Platform provides Docker images that bundle the Python runtime, the major frameworks, and libraries.
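Training code and parameters
• Python code that runs the training
• Implement the function that is called on the Platform
• If a required Python library is missing from the Docker image, add it to requirements.txt
• Parameters passed to the job are available to the code as environment variables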
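Reading job parameters from environment variables could look like the sketch below. The variable names (`LEARNING_RATE`, `EPOCHS`, `OPTIMIZER`), the defaults, and the helper functions are assumptions for illustration; the slides do not say which names the Platform uses.

```python
import os


def read_param(name, default, cast=str, env=None):
    """Read one parameter from the environment, falling back to a default.

    Environment variables are always strings, so `cast` converts the raw value.
    """
    env = os.environ if env is None else env
    raw = env.get(name)
    return default if raw is None else cast(raw)


def load_params(env=None):
    """Collect the hyperparameters the training code needs (hypothetical names)."""
    return {
        "learning_rate": read_param("LEARNING_RATE", 0.001, float, env),
        "epochs": read_param("EPOCHS", 10, int, env),
        "optimizer": read_param("OPTIMIZER", "adam", str, env),
    }


# A plain dict stands in for os.environ so the example is reproducible.
params = load_params({"LEARNING_RATE": "0.01", "EPOCHS": "3"})
print(params)
```

Keeping the parsing in one place means the training function itself never touches `os.environ`, which makes it easier to rerun with the recorded parameters.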
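Training job definition version
The dataset to use, the training code, the parameters, and the runtime environment are bundled together and versioned in a state that can be executed.
[Diagram: training job definition version = training code + parameters + dataset + runtime environment]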
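Running a training job
A training job is run by specifying a training job definition version, parameters, and an instance type.
[Diagram: training job definition version (training code, parameters, dataset, runtime environment) + override parameters + instance type → training job, which is recorded]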
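The relationship between a job definition version and the override parameters given at run time can be sketched as follows. `JobDefinitionVersion`, `resolve_job`, and every value used here are hypothetical; the merge rule (overrides win over the versioned defaults) is an assumption inferred from the term "override parameters".

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class JobDefinitionVersion:
    """A versioned, runnable bundle of training inputs (hypothetical shape)."""
    version: int
    code_ref: str
    dataset_ref: str
    environment: str
    parameters: dict = field(default_factory=dict)


def resolve_job(definition, overrides, instance_type):
    """Build the record of one job run.

    Override parameters take precedence over the versioned defaults;
    the definition itself is left untouched.
    """
    return {
        "definition_version": definition.version,
        "instance_type": instance_type,
        "parameters": {**definition.parameters, **overrides},
    }


definition = JobDefinitionVersion(
    version=3,
    code_ref="train.py@a1b2c3",
    dataset_ref="cats-v1",
    environment="py36:1",
    parameters={"lr": 0.001, "epochs": 10},
)
job = resolve_job(definition, {"lr": 0.01}, instance_type="gpu-small")
print(job)
```

Recording the resolved parameters with each run keeps the run reproducible even when the same definition version is launched many times with different overrides.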
Running training jobs and managing their results
• Uses Kubernetes (EKS)
• Previously Kubernetes on EC2
• Uses nvidia-device-plugin so that GPUs are recognized
• Cluster autoscaling with Spotinst
• Makes it easy to scale across multiple instances
• p2 and p3 family instances
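Training job
• Runs as a k8s Job, with the parameters passed to the training code
• Uses the SDK to update the accuracy at every epoch
[Diagram: training code + parameters run on an instance type in the runtime environment; the SDK saves the accuracy as evaluation results]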
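Management containers
Placed on the same node as the training job, they save whatever the job outputs.
[Diagram: shared file storage on EFS, mounted by the training job. Agent: monitors status and saves output files. TensorBoard: mounts the storage and is exposed. Fluentd: collects logs and saves them.]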
Fluentd container
Saves the standard output produced by the training job.
• Containers are placed with a k8s DaemonSet
• One Fluentd container runs on every node
• It basically watches /var/log/containers/*.log and saves those logs to external storage
• When a Pod disappears, its logs disappear with it
Fluentd container
• Depending on how RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR is set, the container either becomes a noisy neighbor or is killed by the OOM killer because of its resource limit
TensorBoard container
Visualizes the event logs written by the training job.
• Placed on the same node as the Job by using Inter-Pod Affinity
• Mounts the same file system as the Job, then reads and displays the logs
• Exposed internally through a NodePort on a k8s Service; a Gateway then publishes it with authentication
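Agent container
Monitors the status of the training job and records its start and end times.
• Polls the Job status and records it
• Mounts the same file system as the Job and saves the output files when the training job finishes
[Diagram: Agent ↔ training job: status monitoring and updates, output file saving]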
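The Agent's behavior (poll the status, record changes, save outputs once the job ends) can be sketched as a small loop. This is an assumption-laden illustration, not the real Agent: the status strings, `watch_job`, and the `get_status` and `save_outputs` callables are hypothetical stand-ins for the Kubernetes API call and the file upload.

```python
import time

# Statuses after which the job will not change any more (assumed names).
TERMINAL_STATUSES = {"complete", "failed"}


def watch_job(get_status, save_outputs, interval=5.0,
              sleep=time.sleep, clock=time.time):
    """Poll a job until it reaches a terminal status, then save its outputs.

    Returns the sequence of distinct statuses seen plus start and end times.
    """
    history = []
    last = None
    started_at = clock()
    while True:
        status = get_status()
        if status != last:          # record only status changes
            history.append(status)
            last = status
        if status in TERMINAL_STATUSES:
            break
        sleep(interval)
    save_outputs()                  # runs once, after the job has ended
    return {"history": history, "started_at": started_at, "finished_at": clock()}


# Simulated job: each poll returns the next status in the sequence.
statuses = iter(["pending", "running", "running", "complete"])
saved = []
result = watch_job(lambda: next(statuses),
                   lambda: saved.append("outputs"),
                   interval=0)
print(result["history"], saved)
```

Passing `sleep` and `clock` as arguments is a design choice for the sketch only; it keeps the loop testable without waiting in real time.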
ML Ops
• Process improvement between ML Engineers and Development
• The cultural philosophy, practices, and tools that raise the ability to deliver models accurate enough to be applied to the business
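Process improvement between ML Engineers and Development
Once a model that meets the requirements exists, it is published so that other services can use it.
• Who writes the production code?
• Code written by a Data Scientist has to be rewritten
• Once rewritten, the accuracy is not reproduced…
• Models are updated too often → the service has to be updated more often
• And more…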
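Process improvement between ML Engineers and Development
The Development side can combine a training result with inference code and put the combination under version management.
[Diagram: Job 1 and Job 2 each produce a training result (weight files, runtime environment, evaluation results); a model = inference code + weight files + runtime environment]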
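Process improvement between ML Engineers and Development
A model can be published as a Web API as it is. When the model is updated, the Web API can be updated safely.
[Diagram: Model 1 and Model 2 (inference code, weight files, runtime environment) are each deployed as a Web API; the endpoint can be switched between them]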
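Overall picture of model management on the Platform
[Diagram: training job (dataset, training code, parameters, runtime environment → evaluation results, weight files, logs, execution time) linked to a model (inference code, weight files, runtime environment)]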
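Summary
• Experiment management is tedious, but skipping it causes trouble later
• The inputs to training (the dataset to use, the training code, the runtime environment, and so on) are bundled and versioned together
• Saving the outputs is implemented on the Platform side, in a way that puts as little burden on developers as possible
• Models turned into services are linked to the results of their training jobs to guarantee traceability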