
機械学習とのつきあいかた / How to get involved with Machine Learning @ Wantedly


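These are the slides I used for the 2019 new-graduate training.

I covered the topics below. Overall, the talk is a "poem" (personal musings).

1. Machine learning at Wantedly
2. ML and organization
3. ML and data
4. How to run ML projects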

Makoto Tanji

April 24, 2019

Transcript

 1. How to Get Involved with Machine Learning
  @Wantedly
  New Grad Training 2019
  April 24, 2019 - Makoto Tanji (@tan-z-tan)

 2. ©2019 Wantedly, Inc.
  Self-introduction: Who am I?
  Makoto Tanji
  • ML Engineer at Wantedly, Inc.
  • Wantedly Visit 2015 - 2016
  • Client Growth, Scout
  • Wantedly People 2016 - current
  • Server side
  • Machine Learning
  • Data Analysis

 3. What I will talk about today

 4. Contents
  1. Theme
  2. Machine learning at Wantedly
  3. Take home messages
    1. ML and organization
    2. ML and data
    3. How to run ML projects
  4. Summary

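 5. Theme
  Cutting-edge machine learning that makes the news looks glamorous, but running it in production brings real difficulties: it may need enormous amounts of data, fail to reach the required accuracy, or be hard to operate stably.
  ML engineers at Wantedly tackle these problems together with web engineers, data scientists, and infrastructure engineers.
  In the future, web engineers and infrastructure engineers will also be involved with machine learning in some form more often than they are today.
  Let's think about how to get involved with machine learning while looking at the machine learning that actually runs at Wantedly.

 6. Overall, this talk is a "poem" (personal musings) about machine learning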

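 7. Machine learning use cases at Wantedly
  Machine Learnings working at Wantedly

 8. Machine learning use cases at Wantedly

 9. Machine learning use cases at Wantedly
  Visit
  • Job posting recommendation
  • Personalization of push notifications
  • Improvement of the scout filter
  • …
  People
  • OCR and classification of business card images
  • People recommendation
  • Article recommendation, push open-rate prediction
  • Various kinds of named entity extraction
  • (Machine learning repositories are tagged "ML")
  • https://github.com/wantedly?utf8=%E2%9C%93&q=ml&type=&language=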

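 10. Why introduce machine learning?

 11. Machine learning use cases at Wantedly
  Because there is value we want to deliver
  • We want to deliver the experience of a card turning into data within seconds of scanning
  • Copying business cards into your contacts one by one is painful → That is not a job for humans. Take a photo and they are digitized in one go
  • We want to deliver the experience of meeting work that makes your heart dance (ココロオドルシゴト)
  • "Isn't there a good job out there?" → We automatically recommend job postings you are likely to be interested in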

 12. Example architectures of ML running in-house

 13. Example architectures of ML running in-house
  Pre-trained API pattern
  Recommendation pattern
  Others

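 14. Pre-trained API pattern
  Provided as the API of a microservice server
  1. Many examples: OCR (character recognition) API, text extraction API, identity matching API, and so on
  2. Runs a model trained in advance and returns the result
  3. e.g. Most of the business card recognition APIs in People follow this pattern
  Things to think about
  • Has accuracy dropped without anyone noticing?
  • Collect metrics every day. Put this on the checklist whenever a new API is built
  • Cost
  • These APIs are often CPU-bound, so check that they are not consuming large amounts of Kubernetes resources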
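The daily metric check described on this slide can be sketched as follows. This is a minimal illustration, not Wantedly's actual monitoring: the function names, the `AccuracyAlert` type, and the 0.02 tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AccuracyAlert:
    today: float
    baseline: float
    dropped: bool


def daily_accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of predictions that match the labelled answers."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and of equal length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def check_accuracy(today: float, history: Sequence[float],
                   tolerance: float = 0.02) -> AccuracyAlert:
    """Flag a drop when today's accuracy is below the recent mean by more than tolerance."""
    baseline = sum(history) / len(history)
    return AccuracyAlert(today=today, baseline=baseline,
                         dropped=today < baseline - tolerance)


# Hypothetical numbers: three previous days and one labelled sample for today.
history = [0.94, 0.95, 0.93]
today = daily_accuracy(["a", "b", "c", "x"], ["a", "b", "c", "d"])
alert = check_accuracy(today, history)
print(alert)
```

Comparing against a recent mean rather than a fixed threshold is only one possible design; it keeps the check usable for APIs whose normal accuracy levels differ.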

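 15. Recommendation pattern
  Compute recommendation scores in advance and serve them
  1. We want to show recommended job postings instantly when a user opens the page
  2. Compute them beforehand from the user's past behavior
  3. Save the user × job posting results in a datastore
  Things to think about
  • A datastore that can hold "user × job posting"
  • Has the daily training job stopped?
  • This gets especially hard to see once jobs are chained in multiple stages → a topic for data pipelines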
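A rough sketch of the precompute-then-look-up flow on this slide, assuming a toy scoring rule (count how often a user viewed postings with the same tag) and an in-memory dict standing in for the datastore. None of these names or rules come from the slides.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical stand-in for the real datastore: user_id -> ranked job ids.
RecommendationStore = Dict[str, List[str]]


def build_recommendations(views: Iterable[Tuple[str, str]],
                          job_tags: Dict[str, str],
                          top_n: int = 2) -> RecommendationStore:
    """Batch job: score each unseen job by how often the user viewed jobs with its tag."""
    tag_counts = defaultdict(Counter)
    seen = defaultdict(set)
    for user_id, job_id in views:
        tag_counts[user_id][job_tags[job_id]] += 1
        seen[user_id].add(job_id)

    store: RecommendationStore = {}
    for user_id, counts in tag_counts.items():
        scored = [(counts[tag], job_id) for job_id, tag in job_tags.items()
                  if job_id not in seen[user_id] and counts[tag] > 0]
        # Highest score first; the job id breaks ties so the output is deterministic.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        store[user_id] = [job_id for _, job_id in scored[:top_n]]
    return store


def recommend(store: RecommendationStore, user_id: str) -> List[str]:
    """Request time: a plain lookup, with no model inference on the hot path."""
    return store.get(user_id, [])


job_tags = {"j1": "ml", "j2": "ml", "j3": "web", "j4": "ml", "j5": "web"}
views = [("u1", "j1"), ("u1", "j3"), ("u1", "j2")]
store = build_recommendations(views, job_tags)
print(recommend(store, "u1"))
```

Because the page only reads from the store, a stopped batch job shows up as stale recommendations rather than as errors, which is why the slide asks whether the daily job is still running.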

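 16. Others
  Data normalization
  1. Normalize miscellaneous information and judge things such as identity
  2. The work of building dictionaries
  3. Invisible to users, but the system is getting smarter internally
  BigQuery ML
  1. Used in internal logic
  2. It really is in use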

 17. Today's Take-Home Messages

 18. Today's Take-Home Messages
  1. Organization
  2. Data
  3. How to run a project

 19. Organization

 20. "Organizational structure determines strategy" (Igor Ansoff)

  (I just wanted to quote something that sounds cool)

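 21. Organization
  At Wantedly, machine learning engineers belong to the product teams
  [Diagram labels: ML engineers (3), ML engineers (2), Infrastructure, Visit, People]

 22. Organization
  Close to the user
  1. The improvements we make reach users directly
  Easier to make improvements that are valuable to the product
  1. We share the direction the product is heading, so misunderstandings are less likely
  2. If necessary, we can change the product's business logic itself
  New features get discussed naturally, not only improvements
  1. As members of the development team, we can discuss where the product should go, not only numerical improvements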

 23. Organization
  Other ways to organize
  [Diagram labels: data scientist, engineer, designer, manager; company-wide ML team; product team; company-wide data analysis team; ML engineer; request; result]

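 24. Organization
  How do we work with the other development teams?
  1. Development style: define the API, then develop the web side and the ML side in parallel
    1. e.g. We want to return several articles to a user. The web side fetches the list of articles, then calls the ML API to decide the order
  2. How we interact
    1. Web side: if only the API spec is agreed in advance, the web side can stub it and develop without waiting for the ML side.
    2. Infrastructure: before service-in, agree on the load metrics and on whether the API can be allowed to die in the worst case or is critical
  3. Communication
    1. It helps to agree on the expected accuracy in advance.
    2. A super-smart API almost never gets built quickly. Share whether the requirement really needs 99.9% accuracy, and how the results will be presented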
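The "agree on the API spec, then stub it" style can be sketched like this. The spec (article ids in, the same ids back in display order) and every name below are assumptions for illustration; the slides do not show the real interface.

```python
from typing import Callable, List, Sequence

# Assumed spec: given article ids, return the same ids in display order.
Ranker = Callable[[Sequence[str]], List[str]]


def stub_ranker(article_ids: Sequence[str]) -> List[str]:
    """Stub the web side can use before the ML API exists: keeps the incoming order."""
    return list(article_ids)


def broken_ranker(article_ids: Sequence[str]) -> List[str]:
    """Simulates the ML API being down."""
    raise RuntimeError("ranking service unavailable")


def order_articles(article_ids: Sequence[str], ranker: Ranker,
                   critical: bool = False) -> List[str]:
    """Web side: fetch the list itself, then ask the ranker only for the ordering."""
    try:
        ranked = ranker(article_ids)
    except Exception:
        if critical:
            raise  # a critical API should fail loudly
        return list(article_ids)  # non-critical: fall back to the original order
    # Guard against a ranker that drops or invents ids.
    if sorted(ranked) != sorted(article_ids):
        return list(article_ids)
    return ranked


ids = ["a1", "a2", "a3"]
print(order_articles(ids, stub_ranker))
print(order_articles(ids, broken_ranker))
```

The `critical` flag mirrors the infrastructure question on the slide: an API that may die in the worst case degrades to the unranked list, while a critical one propagates the error.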

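 25. Organization
  At Wantedly, ML engineers work as members of the product teams. The advantage is that they are close to users and can work in a product-driven way.
  Within a product team, development proceeds without confusion when each side's scope of responsibility and the expected accuracy are made clear in advance.

 26. We run a Deep Learning reading group
  Every Wednesday @Wantedly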

 27. Data

 28. Data
  An interesting analogy: "data is crude oil"
  It cannot be used as it is. It needs refining
  Once processed, it can be used for many purposes
  The whole world is competing for data
  Source: unknown

 29. Interesting data at Wantedly
  • User profiles: on the order of millions
  • The network of user connections: edges on the order of hundreds of millions
  • Scanned business cards: around the 100-million scale

 30. Data lake
  There is a mechanism for aggregating the data we need into BigQuery
  Analytics

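 31. Things to watch out for
  • Data problems that actually happened
  1. Use of RPA produced a large volume of meaningless access, so the apparent average usage went up. Logs without conversions increased and the numbers shifted
  2. A new feature release started writing a new value into a table column. We detected it beforehand and fixed it before the release, but had we not known, the model might have broken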
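The second incident, a column that starts receiving a value the model has never seen, is the kind of thing a small pre-release check can catch. The sketch below is an assumption about how such a check could look, not the detection that was actually used; the column values are made up.

```python
from collections import Counter
from typing import Dict, Iterable


def unseen_value_counts(known_values: Iterable[str],
                        incoming: Iterable[str]) -> Dict[str, int]:
    """Count incoming column values that were absent from the training data."""
    known = set(known_values)
    return dict(Counter(v for v in incoming if v not in known))


# Hypothetical column: the client type recorded with each log row.
known = ["ios", "android", "web"]
incoming = ["ios", "web", "desktop", "desktop"]

unexpected = unseen_value_counts(known, incoming)
if unexpected:
    print(f"New values the model has never seen: {unexpected}")
```

Run against staging data before a release, a non-empty result is a prompt to talk to the ML side before the change ships.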

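 32. Things to watch out for
  • Data privacy
  1. Comply with the rules (compliance)

 33. Data
  Wantedly has reached the stage where large-scale user-related data is aggregated into BigQuery, so using the data is itself now easy.
  Going forward, the next challenge is making the data easier to put to use.
  Data and logs do not end with the server implementation: they also affect the metrics and the ML models that consume them. Share server changes with the people around you so that they do not break ML models. (Ideally we would protect against this systematically, for example with anomaly detection.)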

 34. How to run a project

 35. How does a machine learning project start?

 36. It depends on the case and is hard to generalize
  Small projects often seem to start from casual conversation

 37. Let's look for the world's best practices

 38. Machine Learning Best Practice
  • Rule 1: Consider whether you can start without machine learning
  • Rule 2: Design and build metrics (understand the current system first, then build the metrics)
  • Rule 3: Choose machine learning once the logic has become complex
  ref: https://developers.google.com/machine-learning/guides/rules-of-ml/

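 39. How to run a project
  Example: the "Do you know this person?" feature

 40. How to run a project
  Exploration phase
  • Recommend people with many mutual acquaintances
  • Metrics = number of requests and approval rate
  Rule-growth phase
  • People with many mutual acquaintances
  • People at the same company
  • People who hold many of the same business cards
  • People scanned at the same time who share a connection
  • Combinations of these
  • ...
  Machine learning phase
  • Deep Learning based recommendation
  • + A/B testing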
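To make the rule-growth phase concrete, here is a sketch of combining the listed rules into one hand-tuned score, plus the approval-rate metric from the exploration phase. The weights, field names, and sample candidates are invented for illustration and are not Wantedly's rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    user_id: str
    common_connections: int   # mutual acquaintances
    same_company: bool
    common_cards: int         # business cards both users hold
    scanned_together: bool    # scanned at the same time and sharing a connection


def rule_score(c: Candidate) -> float:
    """Hand-tuned combination of the rules (hypothetical weights)."""
    return (1.0 * c.common_connections
            + 3.0 * c.same_company
            + 0.5 * c.common_cards
            + 2.0 * c.scanned_together)


def rank(candidates: List[Candidate]) -> List[str]:
    """Highest rule score first."""
    return [c.user_id for c in sorted(candidates, key=rule_score, reverse=True)]


def approval_rate(requests: int, approvals: int) -> float:
    """Metric from the exploration phase: approved connection requests / requests sent."""
    return approvals / requests if requests else 0.0


a = Candidate("a", common_connections=4, same_company=False,
              common_cards=2, scanned_together=False)
b = Candidate("b", common_connections=1, same_company=True,
              common_cards=0, scanned_together=True)
print(rank([a, b]), approval_rate(200, 50))
```

Each new rule adds another weight to tune by hand, which is the point at which the slide's third phase (a learned model evaluated with A/B tests against the same metrics) starts to pay off.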

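 41. How to run a project
  When introducing machine learning
  Choose a problem that is sufficiently complex and where the current metrics look improvable

 42. Summary

 43. Summary
  Machine learning products are built together with web engineers and the infrastructure team
  Because ML engineers sit inside the product teams, we have an environment where we can discuss and develop together
  I hope you will broaden your interest to finding the right problems and delivering value with ML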

 44. References
  • Rules of Machine Learning: Best Practices for ML Engineering
  • https://developers.google.com/machine-learning/guides/rules-of-ml/
  • The machine learning platform that supports ML product development at Wantedly / #rejectcon2018
  • https://speakerdeck.com/south37/number-rejectcon2018
  Photo Credit
  • https://unsplash.com/photos/PMwu9gfCSbw