Slide 1

Slide 1 text

The Technology Behind Machine-Learning-Based Marketplace Health Measures
Mercari Server Side Tech Talk Vol.2 ~CRE Night~
 
 hnakagawa


Slide 2

Slide 2 text

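Self-introduction
• Hirofumi Nakagawa (hnakagawa)
• Joined the company in July 2017
• Member of the SRE team
• A jack-of-all-trades who does everything from device driver development to frontend development
• NOT an ML engineer
• https://github.com/hnakagawa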

Slide 3

Slide 3 text

My work
• ML Platform development
• Bridging the skill gap between ML engineers and SREs
• ML Reliability, SysML?, MLOps?
• Helping the CRE team from an SRE standpoint

Slide 4

Slide 4 text

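ML Platform
• An in-house ML Platform
• Kubernetes-based
• Abstracts away the differences between the local environment and the cluster environment
• A set of convenience APIs
• Provides an environment for running Training/Serving easily with existing ML frameworks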

Slide 5

Slide 5 text

Planned to be released as OSS eventually (probably)

Slide 6

Slide 6 text

Today's agenda:
 The real-time item monitoring system

Slide 7

Slide 7 text

Real-time item monitoring system
• Known internally as Lovemachine
• Implemented on top of the ML Platform
[Diagram labels: ML Platform training cluster (Lovemachine), GCS, GKE, PubSub, ML Platform serving cluster (Lovemachine)]

Slide 8

Slide 8 text

Serving ML Models...?

Slide 9

Slide 9 text

Example architecture of the Model Serving API
[Diagram labels: TensorFlow Serving, TF Model (x2), Flask, SK Model (x3), gRPC, Mercari API, REST]
Flask performs the preprocessing and forwards the request to TensorFlow Serving behind it
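
The two-stage flow on this slide (preprocess in the Python layer, then hand the features to a model server) can be sketched as follows. This is a minimal illustration with only the standard library, not the actual Lovemachine code: the term-count `preprocess` is a stand-in for the scikit-learn preprocessing, and `fake_backend` is a hypothetical placeholder for the gRPC call to TensorFlow Serving. The function names and the payload shape are assumptions.

```python
from collections import Counter


def preprocess(text, vocabulary):
    """Turn raw text into a term-count vector over a fixed vocabulary.

    Assumption: a simplified stand-in for the TF-IDF style preprocessing
    that the talk says is written with scikit-learn.
    """
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocabulary]


def handle_request(payload, vocabulary, backend_predict):
    """REST-side handler: preprocess first, then delegate to the model server."""
    features = preprocess(payload["text"], vocabulary)
    score = backend_predict(features)
    return {"score": score}


def fake_backend(features):
    """Hypothetical placeholder for the gRPC call to the model server."""
    return sum(features) / (len(features) or 1)


if __name__ == "__main__":
    vocab = ["bag", "new", "used"]
    print(handle_request({"text": "Brand new bag bag"}, vocab, fake_backend))
```

Passing the backend in as a callable keeps the preprocessing testable without a running model server; in a real deployment that callable would wrap the gRPC client.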

Slide 10

Slide 10 text

Example architecture of the Model Serving API, streaming version
[Diagram labels: TensorFlow Serving, TF Model (x2), ML Platform Framework or Apache Beam, SK Model (x3), gRPC, PubSub]

Slide 11

Slide 11 text

TensorFlow Serving
• The serving environment provided by the TensorFlow project
• Can serve TF models without going through the Python runtime
• The standard implementation exposes its API over gRPC

Slide 12

Slide 12 text

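Models and container images
• Whether or not to include a huge ML Model in the container image
• If it is not included, where to put it
• A trade-off between portability and load time
• If you have a good idea, please let me know...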

Slide 13

Slide 13 text

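Different from an ordinary API
• The resources handled, especially the Model size, are often large (hundreds of MB to several GB)
• Heavy consumption of CPU and memory resources
• In some cases GPUs are used as well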

Slide 14

Slide 14 text

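The memory consumption problem
• The Python part of Lovemachine consumes about 2 GB of memory at runtime → and this is expected to grow further
• Preprocessing parts written with scikit-learn, such as TF-IDF, often become large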

Slide 15

Slide 15 text

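Python and parallelism
• Threads are, naturally, not an option for CPU-bound parallelism (because of the GIL)
• Loading the Model in every process inflates the required memory size → this becomes an obstacle to Blue-Green Deploy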
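
To make the memory pressure concrete, here is a rough back-of-the-envelope sketch. The roughly 2 GB per-process figure comes from the talk; the worker count of 4 is purely an assumption for illustration, and the function name is hypothetical. The doubling reflects that a blue-green deployment keeps the old and new environments running side by side during the switch.

```python
def required_memory_gb(model_gb, workers, blue_green=True):
    """Estimate memory when every worker process loads its own model copy."""
    per_deployment = model_gb * workers
    # Blue-green keeps two full deployments alive at the same time.
    return per_deployment * 2 if blue_green else per_deployment


if __name__ == "__main__":
    # Assumed: 4 worker processes, each holding a ~2 GB model.
    print(required_memory_gb(2, 4, blue_green=False))  # one deployment
    print(required_memory_gb(2, 4))                    # during a blue-green switch
```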

Slide 16

Slide 16 text

Honestly, Serving with Python
 is often painful from an infrastructure point of view...

Slide 17

Slide 17 text

Using memory wisely
• Load the model before calling fork so that Copy on Write takes effect
• The k8s "one process per container" convention is deliberately broken
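
A minimal sketch of the load-then-fork idea, assuming a Unix-like OS where `os.fork` is available. The dictionary "model", the `predict` function, and the one-request-per-worker loop are hypothetical simplifications, not the actual Lovemachine server; a real pre-fork server would keep the workers alive and run them concurrently.

```python
import os


def load_model():
    """Stand-in for an expensive model load (e.g. a large vocabulary)."""
    return {"vocab": {f"token{i}": i for i in range(100_000)}}


def predict(model, text):
    """Count how many tokens of the request are known to the model."""
    return sum(1 for tok in text.split() if tok in model["vocab"])


def serve_prefork(num_workers, requests):
    """Load the model once, then fork workers that reuse the parent's pages."""
    model = load_model()  # loaded BEFORE fork, so children share the memory
    results = []
    for worker_id in range(num_workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: no reload needed, the model is already mapped.
            os.close(read_fd)
            answer = predict(model, requests[worker_id])
            os.write(write_fd, str(answer).encode())
            os.close(write_fd)
            os._exit(0)
        # Parent: collect the child's answer and reap it.
        os.close(write_fd)
        with os.fdopen(read_fd) as reader:
            results.append(int(reader.read()))
        os.waitpid(pid, 0)
    return results


if __name__ == "__main__":
    print(serve_prefork(2, ["token1 token5 unknown", "nothing here"]))
```

One caveat worth keeping in mind: CPython updates reference counts inside object headers, so even read-only access can dirty pages and gradually reduce the sharing. Calling `gc.freeze()` (Python 3.7+) before forking may help with the garbage collector's share of those writes, though how much it helps depends on the workload.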

Slide 18

Slide 18 text

Copy On Write refresher
[Diagram labels: Memory, Page A, Parent process, Child process]
1. allocation
2. fork
Both processes refer to the same region

Slide 19

Slide 19 text

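When a process rewrites
 the contents of memory...
[Diagram labels: Memory, Page A, Page B, Parent process, Child process]
The OS allocates a separate region and copies the original data into it
The writing process now refers to the separate region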
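
The observable effect of Copy on Write can be checked with a short sketch, again assuming a Unix-like OS with `os.fork`. The function name and the 4096-byte buffer are illustrative choices. Note that this shows the isolation that results from the copy (the child's write is not visible to the parent); it does not inspect the page tables themselves.

```python
import os


def child_write_is_private():
    """Fork, let the child modify a buffer, and compare both views."""
    data = bytearray(b"A" * 4096)  # "Page A", allocated in the parent
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write makes the OS give the child its own copy.
        os.close(read_fd)
        data[0:1] = b"B"
        os.write(write_fd, bytes(data[:1]))
        os.close(write_fd)
        os._exit(0)
    # Parent: read what the child saw, then look at our own buffer.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        child_view = reader.read()
    os.waitpid(pid, 0)
    return bytes(data[:1]), child_view


if __name__ == "__main__":
    parent_view, child_view = child_write_is_private()
    print(parent_view, child_view)
```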

Slide 20

Slide 20 text

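Current Issues
• Struggled with the integration with the Mercari API
 → Once it has been built end to end, it should be reusable
• Because we are dealing with human behavior, data trends shift easily and unexpected problems arise, so we have to keep responding
 → The burden on ML Model authors never goes away
 → As SREs, we want to solve this with mechanisms that include automation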

Slide 21

Slide 21 text

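Future Plans
• A general-purpose mechanism that extracts features from in-house data and embeds them
 → Combined with a suitable classifier, could anyone build a reasonably good classification model?
 → Something like FBLearner Flow?
• Some kind of dedicated AutoML specialized for solving in-house problems?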

Slide 22

Slide 22 text

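Summary
• Serving ML Models requires infrastructure that differs somewhat from the usual
 → The best practices are not yet clear
• Dealing with human behavior is hard
• Running ML features in production at full scale does not go well without substantial automation and systematization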

Slide 23

Slide 23 text

Thank you for your attention!!

Slide 24

Slide 24 text

We are Hiring!!

Slide 25

Slide 25 text

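SRE ML Reliability
• SysML? MLOps? A new job description
• SRE skills + basic knowledge of the ML field
• People who will drive the automation and systematization of ML infrastructure
• Of course, we are also actively hiring for other roles!!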

Slide 26

Slide 26 text

Details here:
 https://careers.mercari.com/