Software Engineers in Machine Learning Teams: Roles and Careers /devsum-2018-summer
Takahiko Ito
July 27, 2018
8
11k
https://event.shoeisha.jp/devsumi/20180727
Transcript
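Software Engineers in Machine Learning Teams: Roles and Careers
Takahiko Ito

Self-introduction
• Software engineer working at Cookpad Inc.
• Ph.D. (Engineering)
• Twitter account: takahi_i
• Open source: RedPen

Career summary
• Researcher: KDD, PKDD
• Software engineer
  • Distributed frameworks: Hadoop
  • Search engines: Solr, ES, Sedue, FAST ESP
  • Research-related areas: machine learning, recommendation, NLP
• Recently: machine learning group
• Timeline shown on the slide: 2007 to 2017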

Today's topics
• The role of software engineers in machine learning teams
• Building a machine learning career at web companies
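
Preliminaries: characteristics of machine learning projects
• Differences from ordinary software development
  • The code alone does not tell the whole story
  • Reading the code does not reveal the behavior (the behavior is hidden inside the model)
  • Results depend on the input data (and the input itself can change over time)
• It is hard to secure safety by sharing knowledge (that is, by having several members who understand the implementation)
  • Especially when the algorithm itself is difficult
In short, these are projects with many points that need care during implementation.

Recent trends
• The difficulty of introducing and managing machine learning is being addressed with engineering
• Teams are organized to support data scientists
  • ML engineers, data engineers, etc.
• More software engineers are active in AI and machine learning
• So what kind of contribution should a software engineer make in a machine learning team?
• Looking at the project life cycle makes the role visible

Preliminaries: the cycle of a machine learning project
The cycle consists of three stages:
1. Experiment: exploratory trial and error using Jupyter Notebook
2. Code cleanup: refactoring, turning code into libraries, CI
3. Deployment: turning the result into a service, CD, monitoring

Preliminaries: the cycle of a machine learning project (continued)
• Experiment, code cleanup, and deployment go round and round
• Each turn of the cycle makes the system better: higher accuracy, a more robust system

A bad project: the cycle does not turn and the model stays fixed
When the cycle does not turn:
• Accuracy does not improve
• Deployment cost is high

What is required of engineers in a machine learning team
• Building an environment in which the project cycle can turn quickly and safely
• Solving the problems of each stage with engineering
The following slides explain the problems in each stage and how to solve them.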

Problems in the experiment stage
There are two:
1. Data acquisition
2. Compute resources
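
An anti-pattern related to data acquisition
Hiring a large number of researchers and data scientists before building an environment in which data can be obtained easily.
Caution:
• A data scientist's output depends on the cost of obtaining data
• In an exhausting environment where obtaining data is expensive, data scientists cannot produce sufficient results

What is a data scientist who cannot access data?
So to speak, a "surfer on dry land". We need to bring water to the surfers!

Data acquisition
• Kinds of data:
  1. Database tables (users)
    • Size: megabytes to gigabytes
  2. Logs
    • Size: gigabytes to petabytes
• Data scientists need to be able to access both kinds of data by themselves

Data analysis platform
• A platform that holds huge amounts of data and makes extraction easy
• There are many options:
  • Self-hosted: Hive (Hadoop), Spark, Presto
  • Hosted services: BigQuery, Redshift, TreasureData

Load all of the company's data into the analysis platform
• Once the analysis platform has been selected and built, load the data into it
• Use log collection tools to set up a mechanism that loads data automatically
• Manual data loading

Case study at our company: analysis platform
• Owned by the DWH team (which exists separately from the machine learning team)
• They maintain an environment where the data we need can be obtained easily with SQL
• Details: https://www.slideshare.net/mineroaoki/cookpad-techconf-2016-dwh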
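
Problems in the experiment stage
There are two:
1. Data acquisition
2. Compute resources

Exhaustion of compute resources
• Several people log in to a single server to work
• Experiments do not progress because resources run short...

Securing sufficient compute resources
• Prepare an environment in which every member of the machine learning team can experiment comfortably
• For example, not having enough GPUs

Case study at our company: compute resource management
• A Slack-based tool for managing compute resources, by @ayemos_y
• Asking for a machine on Slack creates an EC2 instance
• When a machine is no longer used, it is shut down automatically (saving money)
• Details: https://techlife.cookpad.com/entry/2017/10/26/174345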

Problems in the code cleanup stage
There are two problems:
1. The code cannot be understood
2. There is no portability
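
Experiment scripts that cannot be understood
• Situation: it seems to work somehow, but the code that generates the model cannot be understood
  • Example: a script copied and pasted straight from a Jupyter Notebook
• Machine learning algorithms are already difficult, so unorganized code makes things even harder
• The code needs to be cleaned up
First, turn the notebook into scripts.

Turning notebooks into scripts
• Move the processing written in the Jupyter Notebook from the experiment stage into Python scripts
• Tasks:
  • Structuring the experiment flow: extracting functions and classes from flat, top-to-bottom code
  • Securing robustness at the same time: refactoring, adding tests

Refactoring
• Reorganizing the internal structure of source code without changing the program's externally visible behavior (from Wikipedia)
• Impression: few of the machine learning projects published on GitHub or Qiita are well organized (compared with other code)

Refactoring checklist
Even elementary cleanup improves readability (and lays the foundation for CI and CD).
• Length of functions
• Scope of variables
• Number of arguments a function takes
• Replacing magic numbers with constants
• Gathering identical processing in one place
• Extracting deeply nested parts as functions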
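
As a minimal sketch of a few items on the checklist above (a named constant instead of a magic number, one shared helper instead of repeated logic, and a nested condition extracted into a function), consider the following Python. The function names, the threshold value, and the input format are hypothetical and do not come from the talk.

```python
from typing import Iterable, List

# Named constant replacing a magic number (the value is an arbitrary example).
MIN_TOKEN_LENGTH = 2


def normalize(text: str) -> str:
    """Single place for normalization that might otherwise be copy-pasted."""
    return text.strip().lower()


def is_valid_token(token: str) -> bool:
    """Condition extracted from what would otherwise be a nested if-block."""
    return len(token) >= MIN_TOKEN_LENGTH and token.isalpha()


def tokenize(texts: Iterable[str]) -> List[List[str]]:
    """Short top-level function: the flow is readable without the details."""
    return [
        [tok for tok in normalize(text).split() if is_valid_token(tok)]
        for text in texts
    ]


if __name__ == "__main__":
    print(tokenize(["  Fresh Tomato 2 a "]))
```

Each function is short, takes one argument, and keeps its variables local, so it can be tested and reused from a CI job without running a whole notebook.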

Automated tests
• A test is code that verifies that a given input produces the expected output
• At a minimum, write tests for preprocessing and an end-to-end test
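
A minimal sketch of what such tests could look like, assuming a hypothetical preprocessing function `scale_features` and a toy stand-in for a trained model; neither comes from the talk. The tests are plain assert-based functions, which a runner such as pytest can collect, and they can also be run directly.

```python
from typing import List


def scale_features(values: List[float]) -> List[float]:
    """Hypothetical preprocessing step: min-max scale values into [0, 1]."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]


def predict(features: List[float], threshold: float = 0.5) -> List[int]:
    """Stand-in for a trained model: label 1 when a feature exceeds threshold."""
    return [1 if f > threshold else 0 for f in features]


def pipeline(raw: List[float]) -> List[int]:
    """End-to-end path: preprocessing followed by prediction."""
    return predict(scale_features(raw))


def test_scale_features_range() -> None:
    assert scale_features([10.0, 20.0, 30.0]) == [0.0, 0.5, 1.0]


def test_scale_features_edge_cases() -> None:
    assert scale_features([]) == []
    assert scale_features([7.0, 7.0]) == [0.0, 0.0]


def test_pipeline_end_to_end() -> None:
    assert pipeline([10.0, 20.0, 30.0]) == [0, 0, 1]


if __name__ == "__main__":
    test_scale_features_range()
    test_scale_features_edge_cases()
    test_pipeline_end_to_end()
    print("all tests passed")
```

The preprocessing tests pin down edge cases such as empty or constant input, while the end-to-end test checks only that raw input still flows through to a prediction.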
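
Benefits of tests
• Tests = specification
  • Written documentation drifts away from the code over time
  • Tests that run in CI do not drift
• Writing them helps whoever takes over the project understand it
• Modifying machine learning code that has no tests is frightening

Other work in the code cleanup stage
• Introducing workflow tools (make, Luigi)
• Adding loggers
• Strengthening documentation
• Setting up a continuous integration environment

Anti-pattern: division of labor in the code cleanup stage
(Diagram: exploratory experiments, code cleanup, model deployment)
• Danger: researchers verify the experiment, then engineers clean it up and deploy it
• Machine learning code costs more to hand over than ordinary programs
• There is a risk that nobody understands the project any more

Case study at our company: tackling code cleanup in pairs
• Form a researcher and engineer pair for code cleanup
• Work while sharing knowledge about the code
• Level out the coding ability of team members
(Diagram: exploratory experiments, code cleanup, model deployment)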
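
Problems in code cleanup
There are two problems:
1. The code cannot be understood
2. There is no portability

The environment for running the scripts cannot be built
• Machine learning scripts depend on many libraries
• They also depend on tools written in languages other than Python (MeCab, for example)
• Each stage (experiment, cleanup, deployment) runs in a different environment (machine), so a non-portable runtime environment is painful...
  • Example: a script that worked fine locally does not run on the production server

Solution: introduce Docker
• A lightweight virtualization environment
• Dependencies other than Python libraries can also be described in the Dockerfile
• Project portability improves

Virtualizing the environment with Docker
• Ideal: work in Docker from the experiment stage onward
• Even when the stage changes, you get an environment in which the scripts reliably run
• As a result, the project cycle turns more easily

However, Docker is a little cumbersome
• The commands are long.... (TдT)
• For each project you have to remember the port forwarding, image name, and container name (TдT)

Example: Docker commands
• Building the Docker image
  • docker build -t ml-image -f ./docker/Dockerfile .
• Creating the Docker container
  • docker run -it -v `pwd`:/work -p 8888:8888 --name ml-container ml-image
• The same commands have to be typed every time a container is removed and recreated…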
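
Case study at our company: Cookiecutter Docker Science
• We created a Cookiecutter template that supports everything from experiments to deployment in a Docker environment
• An open source project (we would be happy if you tried it)
• URL: https://docker-science.github.io/
• Source: https://github.com/docker-science/cookiecutter-docker-science
• Cookiecutter: a tool that generates projects from templates

Features: Cookiecutter Docker Science
• Makes Docker easy to handle even for members without strong engineering skills
  • Hides the Docker commands behind make targets
  • Image name, port, file mount settings, recreating containers, etc.
• Outputs a directory structure designed with experiment, cleanup, and deployment in mind
  • A common directory structure makes projects easier to find your way around
  • Based on the structure of Cookiecutter Data Science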
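
The template itself hides the Docker commands behind make targets. To illustrate the same idea in Python, here is a rough sketch that keeps per-project settings in one place and derives the `docker build` and `docker run` argument lists from them. The `DockerProject` class, its field names, and its defaults are assumptions made for this sketch and are not part of Cookiecutter Docker Science.

```python
import os
import shlex
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DockerProject:
    """Per-project settings that otherwise have to be remembered by hand."""
    image_name: str
    container_name: str
    dockerfile: str = "./docker/Dockerfile"
    host_port: int = 8888
    container_port: int = 8888
    mount_target: str = "/work"


def build_command(project: DockerProject) -> List[str]:
    """Argument list for building the image from the project Dockerfile."""
    return ["docker", "build", "-t", project.image_name,
            "-f", project.dockerfile, "."]


def run_command(project: DockerProject, workdir: str) -> List[str]:
    """Argument list for starting a container with the mount and port set."""
    return [
        "docker", "run", "-it",
        "-v", f"{workdir}:{project.mount_target}",
        "-p", f"{project.host_port}:{project.container_port}",
        "--name", project.container_name,
        project.image_name,
    ]


if __name__ == "__main__":
    project = DockerProject(image_name="ml-image", container_name="ml-container")
    # Only print the commands; running them (for example with subprocess.run)
    # is left to the caller and requires Docker to be installed.
    print(shlex.join(build_command(project)))
    print(shlex.join(run_command(project, os.getcwd())))
```

Returning argument lists instead of shell strings avoids quoting problems if the working directory contains spaces.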

Unified file and directory structure
(Annotations on the directory tree shown on the slide)
• make init downloads the data from S3
• Holds the models output by the training scripts
• Holds the notebooks used for experiments
• Holds the methods and classes created during code cleanup
• Describes the project workflow

How to use Cookiecutter Docker Science (project generation)
$ cookiecutter [email protected]:docker-science/cookiecutter-docker-science.git
project_name [project_name]: image-classification
project_slug [image_classification]:
jupyter_host_port [8888]:
description [Please Input a short description]: Classify images into several categories
data_source [Please Input data source in S3]: s3://research-data/food-images

Demo: Cookiecutter Docker Science
• Generating a project
  • https://asciinema.org/a/6XV9dNixtzfUwWdoqLj7HG7A2
• Creating the Docker image and container
  • https://asciinema.org/a/06CcXPubAj3RSiMSTy3CZDrfG
• Launching Jupyter Notebook
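
Problems in the deployment stage
• Machine learning results have to be deployed by building servers or writing batch scripts
• Deploying machine learning results smoothly requires a good platform (infrastructure)

The cost of deploying machine learning results
• The productivity of a machine learning team depends on the organization's infrastructure technology
• If building servers in the production environment is expensive, the results of machine learning cannot be reflected in the service
• Baseline: members of the machine learning team can build servers in the production environment themselves by sending pull requests on GitHub or GHE

Case study at our company: making deployment efficient
• Server management
  • Use of ECS (work on Kubernetes has started)
  • Maintained by the infrastructure department
• The machine learning team is given an environment in which it can build production servers itself
  • Required work: just add a configuration file to the repository (contents: authentication, server performance, etc.)
• chie8842 is leading the work of standardizing configuration across machine learning projects
  • Small projects use the common configuration
    • Authentication, intermediate data storage, etc.
  • Simplifying the deployment flow
    • Reducing the number of settings that have to be written

Case study at our company: Kelner
• An OSS project developed by @_lunardog_
• A tool for deploying deep learning models extremely quickly
• URL: https://github.com/lunardog/kelner

Summary: the role of engineers in a machine learning team
• Using machine learning in a service involves many things that need care
  • Algorithms, data acquisition, code quality, deployment, etc.
• The experiment stage demands research ability; code cleanup and later stages demand engineering ability
• Because a strict division of labor is hard to establish, it is important to increase the overlap between members' abilities (reviews, sharing knowledge through pair programming, and so on)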
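
Today's topics
• The role of software engineers in machine learning teams
• Building a machine learning career at web companies

Careers differ with team size
• Even among ML teams, career development differs depending on team size
• It is best to choose a team size that matches the kind of engineer you want to become

Members of a small ML team
• One person handles a wide variety of tasks
• Contributes to every part of the machine learning project cycle
  • Data collection, code cleanup, server construction, monitoring
• Becomes multi-stack
• Not much time is available for digging deep into machine learning algorithms themselves

Members of a large ML team
• A division of labor is in place
• You can build experience specialized in your field
• Examples:
  • Researcher: proposing new algorithms, papers
  • Infrastructure engineer: introducing and extending an elegant ML deployment framework

However, in either case
There is a minimum technology stack every member should have:
• Slightly advanced Git usage
  • rebase, stash, combining commits (squash), bisect, etc.
• Virtualization: Docker
• Caring for code quality: test-driven development
• Issue management: Jira, Redmine
Projects are easier to manage when every member can do these things.

Case study at our company: members of the machine learning team
• A medium-sized team (8 people in total)
• Broadly, there are two kinds of roles:
  1. Machine learning engineer: machine learning plus service integration. Individual strengths vary, some leaning toward research and some toward engineering
  2. Infrastructure engineer: builds the platform for machine learning
• In addition, the analysis platform team and the company-wide infrastructure team outside our team give us a lot of support

Case study at our company: roles of machine learning team members
• At our size, the organization becomes inefficient if everyone does only the work that matches their role
• Roles are shared quite fluidly; members actively take on work outside their main duties
  • Example: a member whose main job is infrastructure takes charge of data analysis
• Intent: widen the range in which each member can contribute to the organization

The difficulty of building a career in machine learning (R&D)
• How it differs from infrastructure and service development
  • A service can exist without machine learning
  • Machine learning is a technology that makes a service better
• Unless the team consciously produces results that match the company's needs, it easily becomes a burden on the organization → the team is dissolved ＼(^o^)／
• Demand fluctuates greatly with circumstances, which makes a consistent career hard to build...

Survival strategy for machine learning team members
• The organization is unstable, so it is worth being conscious of a survival strategy to some degree
• Prepare so that you do not lose your value even if ML/AI demand declines (temporarily)
• There are two directions: becoming multi-stack, or breaking through on a single point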
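
Survival strategy 1: acquire related technologies
• Introducing machine learning smoothly requires a variety of related technologies
• Learn technologies related to machine learning and become multi-stack
• It is good to have strengths in related fields besides machine learning
  • Example: takahi-i, search engines (Solr, Elasticsearch) [the rest of this bullet is illegible in the source]
• Be able to contribute to the company outside machine learning as well

Some of the technologies used around machine learning
(Diagram with "machine learning" at the center)

Storage technologies
• The results output by a learner are held in storage and then used by the service
• Search engines: Solr, Elasticsearch
• Databases: MySQL, PostgreSQL
• Start by becoming able to design schemas and tables

Infrastructure technologies
• A variety of infrastructure technologies can make introducing machine learning into a service more efficient
  • Workflow management: Airflow
  • Configuration management: Ansible, Chef
  • Server management: Docker, Kubernetes, ECS
• When the infrastructure is managed as code, machine learning engineers can do this work themselves

Technologies used in service development
• After the annotation results of machine learning are stored in a database, integration work is needed to use them in the service
• MVC frameworks (Rails, for example)
  • Start by creating models and controllers yourself
• Front-end adjustments: JavaScript, CSS
  • These evolve quickly and are hard to keep up with, but learning them makes it easier to introduce results into the service
  • Examples: ES6, TypeScript, React, Vue, webpack, etc.

Technologies used in the analysis platform
• Beginner: obtain data for machine learning from the analysis platform (SQL)
• Intermediate: use a variety of big data frameworks
  • Gain experience writing code not only in Python and SQL but also on distributed frameworks (Hadoop, Spark)
  • Implement and publish algorithms as a hobby to deepen understanding
    • Something I built a long time ago (an LSH implementation): https://github.com/takahi-i/likelike
• Ideal: contribute to improving the analysis platform

Survival strategy 2: go deep into machine learning
Caution: people with master's degrees and doctorates in mathematics and physics are moving into data science jobs. It is hard to survive on half-hearted understanding and output.
• Even after the boom settles down, demand for top-level people will always exist
• Show that you are first-rate in machine learning
  • Have papers accepted at top conferences regularly (once every few years)
    • NIPS, ICML, ICCV, CVPR, COLT, etc.
  • Kaggle grand master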
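
Summary
• Explained the role of engineers in a machine learning team
  • Solving the problems of each project stage (experiment, cleanup, deployment)
  • Related teams play an important role: analysis platform, infrastructure
  • Introduced several case studies from our company
• Introduced career building in machine learning at web companies

Hiring: application engineer in the R&D department
• We are hiring
• The position bridges research results and Cookpad's production services
• Development of real services and applications, not PoCs
• Fields: machine learning, smart kitchen
• For details, please read: https://cookpad.wd3.myworkdayjobs.com/en-US/jobs/job/Tokyo--Japan/--_R-001087-31

Thank you for your attention.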