Using SageMaker at kurashiru
RytaroTsuji
October 15, 2018
AWS Loft ML Night, 2018/10/9
Transcript
Amazon SageMaker use cases
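Company and service introduction
• dely, Inc.
• Founded April 2014
• 70 employees, 130 staff in total
• kurashiru
• February 2016: service launched
• May 2016: app released
• April 2017: nationwide TV commercials started
• December 2017: passed 10 million cumulative downloads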
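About me
• RytaroTsuji (@kametaro) github/twitter
• dely, Inc.
• Engineer in the development department, in charge of machine learning
• Hobbies
• Mathematics (currently studying elliptic curves and automorphic forms)
• Background
• Until last year I worked mainly as an app and server-side engineer. As a machine learning engineer I am still very much a beginner.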
The problem we had with recipe suggestions
• Ranks 1 to 6: the same set of recipes was shown to every user
• We could not suggest recipes that matched each person's tastes
The ideal recipe suggestion
• Personalized suggestions based on each person's tastes (a separate top-3 ranking for each user)
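Deciding to adopt Amazon SageMaker
• Machine learning is essential to realize the ideal recipe suggestion
• There is only one machine learning engineer, yet we want to release as fast as possible
• SageMaker is a fully managed machine learning service
• It covers model building, training, and deployment end to end
• We got it into the Production environment 1.5 months after starting development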
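Implementation 1: Clustering
User features
• Number of favorites / number of searches
• Number of views / viewing time
• Logged in or not
• App launches on weekdays / holidays
• App launches in the morning, at noon, and at night, etc.
Recipe features
• Category; Japanese, Western, or Chinese
• Seasonal ingredients
• Cooking time, number of ingredients
• Calories, salt content
• Spicy or sweet, etc.
Extract features for users and recipes, then cluster them.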
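As a rough illustration of the clustering step, the sketch below standardizes a few made-up user feature vectors and groups them with a plain k-means loop. The feature columns, the values, and the `kmeans` helper are all hypothetical; the talk itself maps this step to SageMaker's built-in k-means, whose code is not part of the transcript.

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    # Start from the first row, then repeatedly add the row farthest from the chosen centers.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assign each row to its nearest center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members (keep it if the cluster is empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical user features: [favorites, searches, views, weekday launches]
users = np.array([
    [30, 40, 200, 25],
    [28, 35, 180, 22],
    [2, 3, 10, 1],
    [1, 5, 12, 2],
    [10, 80, 60, 5],
    [12, 75, 55, 6],
], dtype=float)

# Standardize so that no single feature dominates the distance.
scaled = (users - users.mean(axis=0)) / users.std(axis=0)
labels, centers = kmeans(scaled, k=3)
print(labels)
```

Standardizing first matters here because raw view counts would otherwise swamp the smaller-valued features.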
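Implementation 2: Collaborative filtering
Collaborative filtering
1. The tastes of people similar to me should be similar to my own!
2. A recipe liked by people similar to me is probably one I will like too, even if I have not seen it yet!
For each user cluster, obtain ratings by inference for the recipes the cluster is likely to enjoy, and store them in a content pool.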
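The two intuitions above can be sketched with a small neighborhood-based collaborative filter. This is a substitute technique for illustration only: the talk uses SageMaker's Factorization Machines for recommendation, and the cluster names, recipes, and ratings below are invented.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse rating dicts {recipe_id: rating}."""
    common = set(a) & set(b)
    dot = sum(a[r] * b[r] for r in common)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recommend(target, ratings, top_n=3):
    """Rank recipes the target has not seen by similarity-weighted ratings of others."""
    scores, weights = {}, {}
    for other, their in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], their)
        if sim <= 0:
            continue
        for recipe, rating in their.items():
            if recipe in ratings[target]:
                continue  # only score recipes the target has not rated yet
            scores[recipe] = scores.get(recipe, 0.0) + sim * rating
            weights[recipe] = weights.get(recipe, 0.0) + sim
    predicted = {r: scores[r] / weights[r] for r in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

# Hypothetical ratings per user cluster.
ratings = {
    "cluster_a": {"curry": 5, "ramen": 4, "salad": 1},
    "cluster_b": {"curry": 5, "ramen": 5, "gyoza": 5, "tofu": 2},
    "cluster_c": {"salad": 5, "tofu": 5, "curry": 1},
}
print(recommend("cluster_a", ratings))
```

Here `cluster_a` resembles `cluster_b` far more than `cluster_c`, so the recipe that `cluster_b` rated highly is ranked first.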
Implementation 3: Optimizing the content pool
Content wears out as time passes and as it is viewed repeatedly
↓
Swap in unviewed recipes from the same cluster to refresh the content pool
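A minimal sketch of the refresh step, assuming the pool is a list of recipe ids and the cluster's candidates are already sorted by predicted rating. The function name, data shapes, and sample ids are assumptions, not taken from the talk.

```python
def refresh_pool(pool, watched, cluster_candidates):
    """Replace watched recipes in `pool` with unwatched ones from the same cluster.

    pool: list of recipe ids currently shown to a user
    watched: set of recipe ids the user has already viewed
    cluster_candidates: recipe ids for the user's cluster, best-rated first
    """
    # Candidates that are neither watched nor already in the pool, in rating order.
    fresh = (r for r in cluster_candidates if r not in watched and r not in pool)
    refreshed = []
    for recipe in pool:
        if recipe in watched:
            replacement = next(fresh, None)
            if replacement is not None:
                refreshed.append(replacement)
            # If no unwatched candidate is left, the worn-out slot is dropped.
        else:
            refreshed.append(recipe)
    return refreshed

pool = ["curry", "ramen", "gyoza"]
watched = {"curry", "gyoza"}
candidates = ["curry", "udon", "ramen", "tempura", "gyoza"]
print(refresh_pool(pool, watched, candidates))
```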
Data flow up to the recipe suggestion
SageMaker
1. Easy to integrate, easy to rearrange
• Because training is split out into its own container, jobs can be added to the job flow or parallelized as the situation requires
[Architecture diagram. Sections: log collection platform, data ETL, Machine Learning Service. Environments: development (container VM, minikube), staging, production. Labels: [[etl]] cronjobs on kops running extract, transform, train, predict, load; Amazon Athena; AWS Glue; Amazon SageMaker with a train job container and a predict endpoint container (each with instance type and instance count); DynamoDB and RDB recommendation stores; feature input; CRR between ap-northeast-1 and us-east-1; application endpoint.]
SageMaker
1. A flexible batch system
• The load of a training job can be delegated to separate instances
• Jobs can run asynchronously
2. Turn models into endpoints freely
• Inference results are returned from a persistent API
• Auto scaling is available
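How we use Amazon SageMaker
• Analysis (notebook instances)
• Training and inference (algorithms and containers)
The following slides mainly cover the points where I stumbled.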
Analysis (1): Notebook instances
‣ A Jupyter Notebook instance can be launched easily.
‣ The instance size can be changed after creation.
Analysis (2): Lifecycle configuration
After the notebook instance starts, the lifecycle configuration takes care of installing the required libraries and similar setup.
Lifecycle configurations ex)
    #!/bin/bash
    set -e
    sudo yum install -y gcc72 gcc72-c++
    echo ". /home/ec2-user/anaconda3/etc/profile.d/conda.sh" >> ~/.bashrc
    source ~/.bashrc
    conda activate python3
    pip install --upgrade pip
    pip install sshtunnel --no-warn-conflicts
    pip install pymysql --no-warn-conflicts
    pip install gensim --no-warn-conflicts
    pip install msgpack --no-warn-conflicts
    pip install janome --no-warn-conflicts
    pip install jupyter-emacskeys --no-warn-conflicts
    pip install fasttext --no-warn-conflicts
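Analysis (3)
• Where I stumbled with notebooks
If a notebook fails to start, it can no longer be started from the console.
‣ The lifecycle configuration fails
‣ A large file was uploaded and the instance's disk is full
‣ In this state nothing can be operated from the console
✓ Start it from awscli
    # aws sagemaker start-notebook-instance --notebook-instance-name my_note
pip install in the lifecycle configuration is unstable.
‣ This happens when the startup of the sagemaker python package collides with pip.
✓ pip install numpy --no-warn-conflicts # add this option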
Training and inference (1)
• Built-in algorithms
k-means, PCA, LDA, Factorization Machines, Linear Learner, Neural Topic Model, Random Cut Forest, Seq2Seq Modeling, XGBoost, Object Detection, Image Classification, DeepAR Forecasting, BlazingText, k-nearest-neighbor (k-NN)
‣ Factorization Machines => recommendation
‣ XGBoost => multi-class classification
‣ Image Classification => thumbnail image classification
‣ k-means => clustering
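Training and inference (2)
• Where I stumbled with Factorization Machines
Problem: the training dataset is too large to handle with numpy.
Solution: build a sparse matrix with scipy.sparse.lil_matrix, filling the large sparse matrix with 1s as you go.
Problem and solution: the inference volume is large. numpy: 1 row -> csr: 10000 rows (16 hours -> 20 [unit illegible]).
Compress the data into a Compressed Sparse Row matrix; a csr matrix can be specified as input.
※) Currently being changed to a Batch transform job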
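The sparse-matrix approach can be sketched as follows: build the one-hot (user, item) design matrix incrementally in LIL format, then convert it to CSR for compact storage. The helper name, the index layout (user block followed by item block), and the sample pairs are assumptions for illustration; the talk does not show its actual encoding.

```python
import numpy as np
from scipy.sparse import lil_matrix

def one_hot_interactions(pairs, n_users, n_items):
    """Build a (n_pairs, n_users + n_items) one-hot matrix for (user, item) pairs."""
    # LIL format supports cheap element-by-element assignment.
    X = lil_matrix((len(pairs), n_users + n_items), dtype=np.float32)
    for row, (user, item) in enumerate(pairs):
        X[row, user] = 1            # user block
        X[row, n_users + item] = 1  # item block
    # CSR is compact and efficient for row slicing and arithmetic.
    return X.tocsr()

# Hypothetical (user index, item index) interactions.
pairs = [(0, 2), (1, 0), (2, 1), (0, 1)]
X = one_hot_interactions(pairs, n_users=3, n_items=3)
print(X.shape, X.nnz)
print(X.toarray())
```

Each row stores only two non-zero entries regardless of how many users and recipes exist, which is what keeps a dataset that is too large for a dense numpy array manageable.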
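Training and inference (3)
• Where I stumbled with XGBoost
Problem: how do you actually use a hyperparameter tuning job?
Solution: pass a ranges parameter as an argument to the hyperparameter tuning job.
Solution: run the hyperparameter tuning job.
Solution: check the hyperparameter tuning job in the console (objective metric: validation:auc).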
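The code on these slides did not survive in the transcript, so the following is only a sketch of what a ranges definition can look like. It builds a plain dict in the shape that, as far as I know, boto3's `create_hyper_parameter_tuning_job` expects for `HyperParameterTuningJobConfig`. The hyperparameter names, bounds, and job limits are illustrative assumptions; only the `validation:auc` metric comes from the slides. No AWS call is made.

```python
def tuning_job_config(max_jobs=20, max_parallel=2):
    """Return a dict shaped like boto3's HyperParameterTuningJobConfig argument."""
    return {
        "Strategy": "Bayesian",
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": "validation:auc",  # metric shown on the slide
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel,
        },
        "ParameterRanges": {
            # Range bounds are passed as strings in this API.
            # The names and bounds below are illustrative, not from the talk.
            "IntegerParameterRanges": [
                {"Name": "max_depth", "MinValue": "3", "MaxValue": "10"},
            ],
            "ContinuousParameterRanges": [
                {"Name": "eta", "MinValue": "0.01", "MaxValue": "0.5"},
                {"Name": "subsample", "MinValue": "0.5", "MaxValue": "1.0"},
            ],
            "CategoricalParameterRanges": [],
        },
    }

config = tuning_job_config()
for kind, ranges in config["ParameterRanges"].items():
    for r in ranges:
        print(kind, r["Name"], r["MinValue"], r["MaxValue"])
```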
Training and inference (4)
• Where I stumbled with Image Classification
Problem: how do you prepare the training dataset? You specify MXNet rec files.
Solution: create MXNet lst files and rec files.
1. Get MXNet: https://github.com/apache/incubator-mxnet.git
2. Download the thumbnail images to train on to your PC
3. Upload the generated rec files to the designated location in S3
    import os

    MXNET_HOME = '~/incubator-mxnet/'
    RESOURCE_DIR = '~/thumbnails/'
    # Generate the .lst files, split 80/20 into train and test.
    os.system('python {0}/tools/im2rec.py --list --recursive --train-ratio 0.8 --test-ratio 0.2 {1}/im2rec/target {1}'.format(MXNET_HOME, RESOURCE_DIR))
    # Pack the training images into a .rec file.
    os.system('python {0}/tools/im2rec.py --resize 480 --quality 95 --num-thread 64 {1}/im2rec/train {1}'.format(MXNET_HOME, RESOURCE_DIR))
    # Pack the test images into a .rec file.
    os.system('python {0}/tools/im2rec.py --resize 480 --quality 95 --num-thread 64 {1}/im2rec/test {1}'.format(MXNET_HOME, RESOURCE_DIR))
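Training and inference (5)
• Where I stumbled with k-means
Problem and solution: how do you find the optimal number of clusters k? Hyperparameter tuning jobs cannot do this at the moment, so it has to be worked out steadily with the following methods.
‣ Elbow method
‣ Silhouette analysis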
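Of the two methods named above, silhouette analysis is sketched here in plain Python (the elbow method is not covered). The function computes the mean silhouette coefficient for a given labeling, so that candidate values of k can be compared. The points and labelings are made up, and the function assumes at least two clusters.

```python
from math import dist

def silhouette_score(points, labels):
    """Mean silhouette coefficient; values near 1 mean tight, well-separated clusters.

    Assumes at least two distinct labels.
    """
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for m, other in clusters.items() if m != l
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Two visually obvious groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
two_clusters = [0, 0, 0, 1, 1, 1]
three_clusters = [0, 0, 1, 2, 2, 2]  # splits the first group unnecessarily
print(silhouette_score(points, two_clusters))
print(silhouette_score(points, three_clusters))
```

The labeling with the higher score is the better candidate; in practice one would run k-means for each k and score the resulting labels.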
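ETL and training batch system
• Why we chose Kubernetes (kops) as the platform
‣ With Step Functions / AWS Batch, jobs and job flows cannot be managed together.
‣ The scheduler is just cronjobs, so it is simple to manage and easy to change from the command line.
‣ Online learning required Batch and the API to work together.
‣ In the future it can be managed with EKS (Tokyo region).
‣ Step Functions and AWS Batch can still be used for parts of the system.
‣ Because SageMaker splits training out into containers, the batch system can be designed flexibly.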
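Impressions after using SageMaker for 5 months
• Analysis (notebook instances)
‣ Lifecycle configurations are convenient; setting up an environment is quick and easy
‣ If processing starts to feel heavy, the instance type can be changed afterwards
• Training and inference (algorithms and containers)
‣ Good coverage: built-in algorithms plus deep learning frameworks such as Tensorflow / Chainer
‣ Because training runs in its own container, there is no need to worry about the resources of running jobs
‣ Notebooks can be shared by several people
‣ Models are easy to deploy as endpoints, and they can auto scale
‣ Hyperparameter tuning jobs can automatically find the best hyperparameters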
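Future plans
• Recipe suggestions that take into account how easily ingredients are left over
1. Identify ingredients that tend to be left over among the recipes viewed in the past
2. Suggest recipes that use up those ingredients efficiently
• Personalized recipe suggestions
1. Suggest recipes that better match the user's tastes and lifestyle, such as "spicy / sweet" or "rich / light"
2. Suggest recipes that can be made by adding a little something to leftover ingredients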
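dely is hiring machine learning engineers!
• The recipes made by the kurashiru chefs really are delicious. "They only look delicious, don't they?" No, not at all. If you would like to taste them and do some machine learning while you are at it, we look forward to hearing from you!
• You can experience everything related to machine learning. At the moment I handle everything by myself: data analysis, service delivery, choosing learning algorithms, and building and operating the platform. You can gain, in concentrated form, experience that a somewhat larger organization would spread across several people!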