Using SageMaker at kurashiru
RytaroTsuji
October 15, 2018
Technology
aws loft ML night 2018/10/9
Transcript
Amazon SageMaker use cases
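Company and service overview
• dely, Inc.
  ‣ Founded April 2014
  ‣ 70 full-time employees, 130 staff
• kurashiru (クラシル)
  ‣ February 2016: service launched
  ‣ May 2016: app released
  ‣ April 2017: nationwide TV commercials began airing
  ‣ December 2017: passed 10 million cumulative downloads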
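Self-introduction
• Rytaro Tsuji (@kametaro on GitHub/Twitter)
• dely, Inc.
  ‣ Engineer in the development department, in charge of machine learning
• Hobbies
  ‣ (Currently studying elliptic curves and automorphic forms)
• Background
  ‣ Until last year I worked mainly as an app and server-side engineer. As a machine learning engineer I am still very much a beginner.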
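The problem we had with recipe recommendations
[Figure: a single ranking, 1st through 6th]
• The same set of recipes was shown to every user
• We could not recommend recipes that matched each person's tastes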
The ideal recipe recommendation
[Figure: three separate rankings, each 1st through 3rd]
• Personalized recommendations based on each person's tastes
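Deciding to adopt Amazon SageMaker
• Machine learning is essential to deliver the ideal recipe recommendation
• We had only one machine learning engineer, yet wanted to release as quickly as possible
• SageMaker is a fully managed machine learning service
• It covers model building, training, and deployment end to end
• We got the feature into the Production environment 1.5 months after starting development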
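Implementation 1: Clustering
User features
• Number of favorites / number of searches
• Number of views / viewing time
• Logged in or not
• Number of app launches on weekdays / holidays
• Number of app launches in the morning, at noon, and in the evening, etc.
Recipe features
• Category; Japanese, Western, or Chinese cuisine
• Seasonal ingredients
• Cooking time, number of ingredients
• Calories, salt content
• Spicy or sweet, etc.
Extract feature values for users and recipes, then cluster them.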
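Implementation 2: Collaborative filtering
Collaborative filtering
1. The tastes of people similar to me should be similar to my own!
2. A recipe liked by people similar to me is one I should like too, even if I have not seen it yet!
For each user cluster, obtain ratings by inference for the recipes that cluster is likely to enjoy, and store them in a content pool.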
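The two assumptions above can be illustrated with a small user-based collaborative filter. This is only a sketch of the idea, not the deck's implementation (the later slides use the built-in Factorization Machines algorithm for recommendation); the cluster names, recipe names, ratings, and the `cosine` and `recommend` helpers are all hypothetical.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse rating dicts {recipe: score}."""
    common = set(a) & set(b)
    dot = sum(a[r] * b[r] for r in common)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recommend(target, ratings, top_n=3):
    """Rank recipes the target has not rated by similarity-weighted ratings."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], other_ratings)
        if sim <= 0:
            continue  # ignore clusters with no overlap in taste
        for recipe, score in other_ratings.items():
            if recipe in ratings[target]:
                continue  # only propose recipes not seen yet
            scores[recipe] = scores.get(recipe, 0.0) + sim * score
            weights[recipe] = weights.get(recipe, 0.0) + sim
    predicted = {r: scores[r] / weights[r] for r in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

# Hypothetical ratings per user cluster.
ratings = {
    "cluster_a": {"curry": 5, "ramen": 4},
    "cluster_b": {"curry": 5, "ramen": 5, "gyoza": 5, "salad": 1},
    "cluster_c": {"salad": 5, "tofu": 4},
}
pool = recommend("cluster_a", ratings)
```

Here `cluster_b` shares tastes with `cluster_a`, so its unseen recipes are proposed in rating order, while `cluster_c` has no overlap and contributes nothing.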
Implementation 3: Optimizing the content pool
Content wears out as time passes and as it is viewed repeatedly
↓
Swap in unviewed recipes from the same cluster to refresh the content pool
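A minimal sketch of the refresh step, assuming the pool is an ordered list of recipe names and that "worn out" simply means "already viewed". The deck also mentions wear from elapsed time, which this sketch leaves out; `refresh_pool` and the sample data are hypothetical.

```python
def refresh_pool(pool, watched, cluster_candidates, size):
    """Drop viewed recipes and top the pool up with unviewed ones from the same cluster."""
    kept = [r for r in pool if r not in watched]
    for recipe in cluster_candidates:
        if len(kept) >= size:
            break
        if recipe not in watched and recipe not in kept:
            kept.append(recipe)
    return kept

pool = ["curry", "ramen", "gyoza"]
watched = {"curry", "gyoza"}
# Candidate recipes for the same cluster, best rated first (hypothetical).
candidates = ["gyoza", "udon", "ramen", "tempura", "soba"]
new_pool = refresh_pool(pool, watched, candidates, size=3)
```

Keeping the candidate list sorted by predicted rating means the refreshed pool always holds the best recipes the user has not yet seen.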
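Data flow up to the recipe recommendation
[Figure: data flow diagram]
SageMaker
1. Easy to plug in, easy to rearrange
• Because training is split out into its own container, jobs can be added to the job flow or parallelized as the need arises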
[Architecture diagram: log collection platform -> data ETL -> Machine Learning Service. Labels: development container on a VM (minicube); regions ap-northeast-1 and us-east-1; Amazon Athena; AWS Glue; kops cronjobs running extract, transform, train, predict, load; Amazon SageMaker train job container and predict endpoint container, each with an instance type and instance count; DynamoDB and RDB recommendation stores; feature input; staging and production; CRR; application endpoint]
SageMaker
1. A flexible batch system
• The load of a training job can be delegated to a separate instance
• Jobs can run asynchronously
2. Turn anything into an endpoint
• Inference results are returned from a persistent API
• Auto scaling is available
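How we use Amazon SageMaker
• Analysis (notebook instances)
• Training and inference (algorithms and containers)
The following slides cover mainly where I stumbled with these.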
Analysis ① Notebook instances
‣ A Jupyter Notebook instance can be launched easily.
‣ The instance size can be changed after creation.
Analysis ② Lifecycle configuration
After the notebook instance starts, the configuration takes care of tasks such as installing the required libraries.
Example lifecycle configuration:
    #!/bin/bash
    set -e
    sudo yum install -y gcc72 gcc72-c++
    echo ". /home/ec2-user/anaconda3/etc/profile.d/conda.sh" >> ~/.bashrc
    source ~/.bashrc
    conda activate python3
    pip install --upgrade pip
    pip install sshtunnel --no-warn-conflicts
    pip install pymysql --no-warn-conflicts
    pip install gensim --no-warn-conflicts
    pip install msgpack --no-warn-conflicts
    pip install janome --no-warn-conflicts
    pip install jupyter-emacskeys --no-warn-conflicts
    pip install fasttext --no-warn-conflicts
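Analysis ③
• Where I stumbled with notebooks
If a notebook fails to start, it can no longer be started from the console screen.
‣ The lifecycle configuration fails
‣ A large uploaded file fills the instance's disk
‣ The instance ends up in a state where nothing can be operated [Figure: console screenshot]
✓ Start it from awscli
    # aws sagemaker start-notebook-instance --notebook-instance-name my_note
pip install in the lifecycle configuration is unstable.
‣ This happens when the sagemaker python package and pip collide at startup.
✓ pip install numpy --no-warn-conflicts  # add this option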
Training and inference ①
• Built-in algorithms
k-means, PCA, LDA, Factorization Machines, Linear Learner, Neural Topic Model, Random Cut Forest, Seq2Seq Modeling, XGBoost, Object Detection, Image Classification, DeepAR Forecasting, BlazingText, k-nearest-neighbor (k-NN)
‣ Factorization Machines => recommendation
‣ XGBoost => multi-class classification
‣ Image Classification => thumbnail image classification
‣ k-means => clustering
Training and inference ②
• Where I stumbled with Factorization Machines
Problem: the training dataset is too large to handle with numpy
Fix: build a sparse matrix with scipy.sparse.lil_matrix, filling the large sparse matrix with 1s as you go
Problem and fix: the inference volume is large. numpy: 1 row -> csr: 10,000 rows (16 hours -> 20 minutes)
Compress to a Compressed Sparse Row matrix; a csr matrix can be specified as input.
※ Currently being reworked into a Batch transform job.
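The two fixes above can be sketched as follows, assuming NumPy and SciPy are installed. The one-hot layout (user columns followed by recipe columns) is a common choice for Factorization Machines and is an assumption here, as are the tiny interaction log and all variable names; the deck does not show its actual feature layout.

```python
import numpy as np
from scipy.sparse import lil_matrix

# Hypothetical interaction log: (user_index, recipe_index) pairs.
interactions = [(0, 1), (0, 3), (1, 0), (2, 2), (2, 3)]
n_users, n_recipes = 3, 4

# lil_matrix is cheap to fill element by element, so build the matrix in that format.
# One row per interaction; columns are [users | recipes].
X = lil_matrix((len(interactions), n_users + n_recipes), dtype=np.float32)
for row, (user, recipe) in enumerate(interactions):
    X[row, user] = 1.0
    X[row, n_users + recipe] = 1.0

# Convert to Compressed Sparse Row for compact storage and fast row slicing.
X_csr = X.tocsr()

# Rows can then be sent for inference in batches instead of one at a time.
batch = X_csr[0:2]
```

Only the non-zero entries are stored (two per row here), which is what keeps a matrix with many users and recipes within memory.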
Training and inference ③
• Where I stumbled with XGBoost
Problem: how do you actually use a hyperparameter tuning job?
Fix: pass a ranges parameter as an argument to the hyperparameter tuning job
[Figure: code screenshot]
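The slide's code was an image and did not survive in the transcript. As a rough sketch, assuming the boto3 `create_hyper_parameter_tuning_job` call is used, the ranges go into the `ParameterRanges` section of `HyperParameterTuningJobConfig`. The chosen hyperparameters, their bounds, and the resource limits below are hypothetical; `validation:auc` is the metric named on a later slide. `range_names` is a small helper added for illustration and is not part of any SDK.

```python
# Tuning job configuration in the shape expected by boto3's
# create_hyper_parameter_tuning_job(HyperParameterTuningJobConfig=...).
# Range bounds are passed as strings. All values here are illustrative.
tuning_job_config = {
    "Strategy": "Bayesian",
    "HyperParameterTuningJobObjective": {
        "Type": "Maximize",
        "MetricName": "validation:auc",
    },
    "ResourceLimits": {
        "MaxNumberOfTrainingJobs": 20,
        "MaxParallelTrainingJobs": 2,
    },
    "ParameterRanges": {
        "IntegerParameterRanges": [
            {"Name": "max_depth", "MinValue": "1", "MaxValue": "10"},
        ],
        "ContinuousParameterRanges": [
            {"Name": "eta", "MinValue": "0.01", "MaxValue": "0.5"},
            {"Name": "min_child_weight", "MinValue": "1", "MaxValue": "10"},
        ],
        "CategoricalParameterRanges": [],
    },
}

def range_names(config):
    """Return the tuned hyperparameter names, checking that each range is non-empty."""
    names = []
    ranges = config["ParameterRanges"]
    for key in ("IntegerParameterRanges", "ContinuousParameterRanges"):
        for r in ranges[key]:
            if float(r["MinValue"]) >= float(r["MaxValue"]):
                raise ValueError(f"empty range for {r['Name']}")
            names.append(r["Name"])
    return names

tuned = range_names(tuning_job_config)
```

Checking the ranges locally before submitting catches simple mistakes, such as swapped bounds, without starting any training jobs.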
Fix: run the hyperparameter tuning job
[Figure: code screenshot]
Fix: check the hyperparameter tuning job in the console (objective metric: validation:auc)
[Figure: console screenshot]
Training and inference ④
• Where I stumbled with Image Classification
Problem: how do you prepare the training dataset? You specify MXNet rec files.
Fix: create MXNet lst files and rec files
1. Clone https://github.com/apache/incubator-mxnet.git
2. Download the thumbnail images to train on to your PC
3. Upload the generated rec files to the designated location in S3
    import os

    MXNET_HOME = '~/incubator-mxnet/'
    RESOURCE_DIR = '~/thumbnails/'

    # Generate the train/test .lst files (80% / 20% split), recursing into subdirectories.
    os.system('python {0}/tools/im2rec.py --list --recursive --train-ratio 0.8 --test-ratio 0.2 {1}/im2rec/target {1}'.format(MXNET_HOME, RESOURCE_DIR))
    # Pack the listed images into .rec files, resizing them and re-encoding at quality 95.
    os.system('python {0}/tools/im2rec.py --resize 480 --quality 95 --num-thread 64 {1}/im2rec/train {1}'.format(MXNET_HOME, RESOURCE_DIR))
    os.system('python {0}/tools/im2rec.py --resize 480 --quality 95 --num-thread 64 {1}/im2rec/test {1}'.format(MXNET_HOME, RESOURCE_DIR))
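Training and inference ⑤
• Where I stumbled with k-means
Problem and fix: how do you find the optimal number of clusters k? Hyperparameter tuning jobs cannot do this at present, so I work it out the slow way with the following methods.
‣ Elbow method
‣ Silhouette analysis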
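A minimal sketch of the elbow method using only the standard library: run k-means for several values of k and look for the point where the within-cluster sum of squares (inertia) stops dropping sharply. The toy data (three well-separated groups) and the deterministic farthest-first initialization are assumptions made to keep the sketch reproducible; in practice a library implementation and real feature vectors would be used.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def init_centers(points, k):
    """Farthest-first initialization: start at points[0], then keep adding the farthest point."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans_inertia(points, k, iterations=20):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centers = init_centers(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

# Toy data: three groups of four points each.
points = [(x + dx, y + dy)
          for x, y in [(0, 0), (10, 10), (0, 10)]
          for dx in (0, 1) for dy in (0, 1)]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k, inertia in inertias.items():
    print(k, round(inertia, 2))
```

On this data the inertia falls steeply up to k = 3 and only slightly afterwards, so the "elbow" sits at 3, matching the number of groups.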
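ETL and training batch system
• Why we chose Kubernetes (kops) as the platform
‣ With step functions / AWS Batch, jobs and job flows cannot be managed together.
‣ The only scheduler is cronjobs, so it is simple to manage and easy to change with a command.
‣ Online learning required Batch and the API to work together.
‣ In the future it can be managed with EKS (Tokyo region).
‣ step functions and AWS Batch can still be used for parts of the system.
‣ Because SageMaker splits training out into containers, the batch system can be designed flexibly.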
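Impressions after using SageMaker for 5 months
• Analysis (notebook instances)
‣ Lifecycle configuration is convenient: the environment can be set up with little effort
‣ If processing starts to feel heavy, the instance type can be changed afterwards
• Training and inference (algorithms and containers)
‣ A good selection of built-in algorithms and deep learning frameworks such as Tensorflow and Chainer
‣ Training is split out into its own container, so you do not need to worry about the resources of running jobs
‣ Notebooks can be shared by several people
‣ Models are easy to deploy as endpoints, and auto scaling is available
‣ Hyperparameter tuning jobs can set the best hyperparameters automatically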
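Future plans
• Recipe recommendations that take into account how easily ingredients are left over
1. Identify ingredients that tend to be left over among the recipes a user has viewed
2. Recommend recipes that use up those ingredients efficiently
• Personalized recipe recommendations
1. Recommend recipes that better fit the user's tastes and lifestyle, such as "spicy / sweet" or "rich / light"
2. Recommend recipes that can be made by adding a little to leftover ingredients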
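dely is hiring machine learning engineers!
• Recipes made by kurashiru chefs really are delicious.
They only look delicious, you say? No, not at all. If you would like to taste them and do some machine learning while you are at it, we look forward to hearing from you!
• You can experience everything related to machine learning.
At the moment I do everything alone: data analysis, service delivery, choosing learning algorithms, and building and operating the platform. You can get, in condensed form, the experience that would be spread across several people in a somewhat larger organization!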