Using SageMaker at kurashiru
RytaroTsuji
October 15, 2018
aws loft ML night 2018/10/9
Transcript
Amazon SageMaker use cases
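Company and service overview
• dely, Inc.
• Founded April 2014
• 70 employees, 130 staff in total
• kurashiru
• February 2016: service launched
• May 2016: app released
• April 2017: nationwide TV commercials began airing
• December 2017: passed 10 million cumulative downloads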
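About the speaker
• Ryutaro Tsuji (@kametaro on GitHub/Twitter)
• dely, Inc.
• Engineer in the development department, in charge of machine learning
• Hobbies
• (currently studying elliptic curves and automorphic forms)
• Background
• Until last year I worked mainly as an app and server-side engineer. As a machine learning engineer I am still very much a beginner.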
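The problem we had with recipe recommendations
[Figure: one ranking, 1st to 6th place]
The same set of recipes was shown to every user.
We could not recommend recipes that matched each person's tastes.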
The ideal recipe recommendation
Personalized recommendations based on each person's tastes.
[Figure: three separate rankings, 1st to 3rd place each]
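Deciding to adopt Amazon SageMaker
• Machine learning is essential to realize the ideal recipe recommendation
• There was only one machine learning engineer, and we wanted to release as fast as possible
• SageMaker is a fully managed machine learning service
• It covers model building, training, and deployment end to end
• We succeeded in shipping to the Production environment 1.5 months after starting development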
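Implementation 1: Clustering
User features
• Number of favorites / number of searches
• Number of views / viewing time
• Logged in or not
• Number of launches on weekdays / holidays
• Number of launches in the morning, at noon, and at night
etc.
Recipe features
• Category; Japanese, Western, or Chinese cuisine
• Seasonal ingredients
• Cooking time, ingredients
• Calories, salt content
• Spicy / sweet
etc.
Extract features for users and recipes, then cluster them.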
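Implementation 2: Collaborative filtering
What is collaborative filtering?
1. The tastes of people similar to me should be similar to my own.
2. I should like the recipes that people similar to me liked, even if I have not seen them yet.
For each user cluster, obtain ratings by inference for the recipes that cluster is likely to enjoy, and store them in a content pool.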
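The two intuitions above can be sketched with a small neighbourhood-based scorer. This is only an illustration: the deck itself uses SageMaker's Factorization Machines for recommendation, and the function names, cluster names, and ratings below are all hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts {recipe: rating}."""
    shared = set(u) & set(v)
    dot = sum(u[r] * v[r] for r in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_ratings(target, others):
    """Score recipes the target has not seen, weighting neighbours by similarity."""
    scores, weights = {}, {}
    for other in others:
        sim = cosine(target, other)
        if sim <= 0:
            continue  # ignore neighbours with no overlap in taste
        for recipe, rating in other.items():
            if recipe in target:
                continue  # only recommend recipes not seen yet
            scores[recipe] = scores.get(recipe, 0.0) + sim * rating
            weights[recipe] = weights.get(recipe, 0.0) + sim
    return {r: scores[r] / weights[r] for r in scores}


# Hypothetical cluster-level ratings (1 = liked).
cluster_a = {"curry": 1, "ramen": 1}
cluster_b = {"curry": 1, "ramen": 1, "gyoza": 1}
cluster_c = {"salad": 1, "smoothie": 1}

predicted = predict_ratings(cluster_a, [cluster_b, cluster_c])
# Highest predicted rating first; this list plays the role of the content pool.
content_pool = sorted(predicted, key=predicted.get, reverse=True)
```

Here cluster_b shares tastes with cluster_a, so its unseen recipe is recommended, while cluster_c has no overlap and contributes nothing.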
Implementation 3: Optimizing the content pool
Content wears out as time passes and as it is viewed repeatedly.
↓
Swap in unviewed recipes from the same cluster to refresh the content pool.
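A minimal sketch of the refresh step, assuming the pool is an ordered list and that "worn out" simply means "already viewed" (the slide also mentions wear over time, which is not modelled here). The function name and sample data are hypothetical.

```python
def refresh_pool(pool, viewed, cluster_recipes, pool_size):
    """Drop recipes the user has already viewed and top the pool back up
    with unviewed recipes from the same cluster, preserving order."""
    fresh = [r for r in pool if r not in viewed]
    candidates = [r for r in cluster_recipes
                  if r not in viewed and r not in fresh]
    return (fresh + candidates)[:pool_size]


pool = ["curry", "ramen", "gyoza"]
viewed = {"curry", "gyoza"}
cluster_recipes = ["curry", "ramen", "gyoza", "udon", "tempura", "oden"]

new_pool = refresh_pool(pool, viewed, cluster_recipes, pool_size=3)
```

Keeping the replacement candidates inside the same cluster means the refreshed pool stays consistent with the cluster's predicted tastes.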
Data flow up to the recipe recommendation
SageMaker
1. Easy to plug in and easy to rearrange
• Because training is split out into its own container, job flows can be added or parallelized flexibly as the need arises.
[Architecture diagram. Labels: log collection platform; data; ETL; Machine Learning Service; development container VM (minikube); regions ap-northeast-1 and us-east-1; Amazon Athena; AWS Glue; kops cronjobs running extract, transform, train, predict, load; Amazon SageMaker with a train job container and a predict endpoint container (each with instance type and instance count); DynamoDB and RDB recommendation stores; staging and production environments; CRR; application endpoint.]
SageMaker
1. Flexible batch system
• The load of training jobs can be delegated to separate instances
• Jobs can be run asynchronously
2. Freedom to expose endpoints
• Inference results are returned from a persistent API
• Auto scaling is available
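How we use Amazon SageMaker
• Analysis (notebook instances)
• Training and inference (algorithms and containers)
This talk mainly introduces the points where we stumbled with these.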
Analysis 1: Notebook instances
‣ A Jupyter Notebook instance can be launched easily.
‣ The instance size can be changed after creation.
Analysis 2: Lifecycle configuration
After the notebook instance starts, the lifecycle configuration takes care of setup such as installing the required libraries.
Lifecycle configuration example:

    #!/bin/bash
    set -e
    # Compilers needed to build native extensions
    sudo yum install -y gcc72 gcc72-c++
    # Make conda available in the shell and activate the python3 environment
    echo ". /home/ec2-user/anaconda3/etc/profile.d/conda.sh" >> ~/.bashrc
    source ~/.bashrc
    conda activate python3
    pip install --upgrade pip
    pip install sshtunnel --no-warn-conflicts
    pip install pymysql --no-warn-conflicts
    pip install gensim --no-warn-conflicts
    pip install msgpack --no-warn-conflicts
    pip install janome --no-warn-conflicts
    pip install jupyter-emacskeys --no-warn-conflicts
    pip install fasttext --no-warn-conflicts
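Analysis 3
• Where we stumbled with notebooks
If a notebook fails to start, it can no longer be started from the console screen.
‣ The lifecycle configuration fails
‣ A large file was uploaded and the instance's disk is full
‣ The console ends up in a state where no operation is possible
✓ Start it from awscli:
    aws sagemaker start-notebook-instance --notebook-instance-name my_note
pip install in the lifecycle configuration is unstable.
‣ This happens when the startup timing of the sagemaker Python package and pip collide.
✓ Add the --no-warn-conflicts option:
    pip install numpy --no-warn-conflicts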
Training and inference 1
• Built-in algorithms
k-means, PCA, LDA, Factorization Machines, Linear Learner, Neural Topic Model, Random Cut Forest, Seq2Seq Modeling, XGBoost, Object Detection, Image Classification, DeepAR Forecasting, BlazingText, k-nearest-neighbor (k-NN)
‣ Factorization Machines => recommendation
‣ XGBoost => multi-class classification
‣ Image Classification => thumbnail image classification
‣ k-means => clustering
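Training and inference 2
• Where we stumbled with Factorization Machines
Problem: the training dataset is too large to handle with numpy.
Solution: build a sparse matrix with scipy.sparse.lil_matrix, filling the large sparse matrix with 1s.
Problem and solution: the volume of inference is large. numpy: 1 row -> csr: 10,000 rows (16 hours -> 20 minutes).
Compress the data into a Compressed Sparse Row (csr) matrix; a csr matrix can be specified as input.
Note: this is currently being changed to a Batch transform job.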
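The sparse-matrix approach above can be sketched as follows. The one-hot layout (one column per user followed by one column per recipe) is an assumption based on the usual Factorization Machines input, not something the slides spell out, and the helper names and sample pairs are hypothetical.

```python
import numpy as np
from scipy.sparse import lil_matrix


def build_fm_matrix(pairs, n_users, n_recipes):
    """One-hot encode (user, recipe) pairs: column `user` for the user,
    column `n_users + recipe` for the recipe."""
    # lil_matrix is efficient for incremental assignment of single cells.
    X = lil_matrix((len(pairs), n_users + n_recipes), dtype=np.float32)
    for row, (user, recipe) in enumerate(pairs):
        X[row, user] = 1
        X[row, n_users + recipe] = 1
    # Convert to CSR, which is compact and fast for row slicing.
    return X.tocsr()


def batches(X, batch_size):
    """Yield blocks of rows so many rows can go into one inference request."""
    for start in range(0, X.shape[0], batch_size):
        yield X[start:start + batch_size]


pairs = [(0, 1), (1, 0), (2, 2), (0, 2)]  # hypothetical (user, recipe) ids
X = build_fm_matrix(pairs, n_users=3, n_recipes=3)
chunks = list(batches(X, batch_size=3))
```

Only the non-zero cells are stored, so memory grows with the number of pairs rather than with users times recipes, and slicing the CSR matrix by rows is what makes sending thousands of rows per request practical.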
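Training and inference 3
• Where we stumbled with XGBoost
Problem: how do you use a hyperparameter tuning job?
Solution: pass a ranges parameter as an argument to the hyperparameter tuning job.
Solution: run the hyperparameter tuning job.
Solution: check the hyperparameter tuning job in the console (objective metric: validation:auc).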
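The code from these slides is not in the transcript, so the following is only a sketch of what "passing ranges" can look like. It builds the tuning configuration as a plain dict in the shape accepted by boto3's `create_hyper_parameter_tuning_job`. The metric name comes from the slide; `eta` and `max_depth` are real XGBoost hyperparameters, but their bounds and the job limits here are made-up values.

```python
def tuning_job_config(metric_name, max_jobs, max_parallel):
    """Build a HyperParameterTuningJobConfig dict. Nothing is sent to AWS here.
    Note that range bounds are passed as strings."""
    return {
        "Strategy": "Bayesian",
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": metric_name,
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel,
        },
        "ParameterRanges": {
            # Illustrative bounds only; choose them for the actual dataset.
            "ContinuousParameterRanges": [
                {"Name": "eta", "MinValue": "0.1", "MaxValue": "0.5"},
            ],
            "IntegerParameterRanges": [
                {"Name": "max_depth", "MinValue": "3", "MaxValue": "10"},
            ],
            "CategoricalParameterRanges": [],
        },
    }


config = tuning_job_config("validation:auc", max_jobs=20, max_parallel=2)
```

The resulting dict would be passed as `HyperParameterTuningJobConfig`, alongside a training job definition, when creating the tuning job.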
Training and inference 4
• Where we stumbled with Image Classification
Problem: how do you prepare the training dataset? MXNet rec files have to be specified.
Solution: create MXNet lst and rec files.

    import os

    MXNET_HOME = '~/incubator-mxnet/'
    RESOURCE_DIR = '~/thumbnails/'
    # Generate the lst files with an 80/20 train/test split
    os.system('python {0}/tools/im2rec.py --list --recursive --train-ratio 0.8 --test-ratio 0.2 {1}/im2rec/target {1}'.format(MXNET_HOME, RESOURCE_DIR))
    # Pack the train and test lists into rec files
    os.system('python {0}/tools/im2rec.py --resize 480 --quality 95 --num-thread 64 {1}/im2rec/train {1}'.format(MXNET_HOME, RESOURCE_DIR))
    os.system('python {0}/tools/im2rec.py --resize 480 --quality 95 --num-thread 64 {1}/im2rec/test {1}'.format(MXNET_HOME, RESOURCE_DIR))

1. Clone https://github.com/apache/incubator-mxnet.git
2. Download the thumbnail images to be trained on to a PC
3. Upload the created rec files to the designated location in S3
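Training and inference 5
• Where we stumbled with k-means
Problem and solution: how do you find the optimal number of clusters k? Hyperparameter tuning jobs cannot do this at present, so we work it out by hand with the following methods.
Elbow method
Silhouette analysis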
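As a rough illustration of the elbow method, the sketch below runs a tiny k-means on one-dimensional toy data for several values of k and records the within-cluster sum of squared errors (SSE). The data, the deterministic initialization, and the function name are all assumptions made for brevity; real user and recipe features are multi-dimensional.

```python
def kmeans_sse(points, k, iterations=20):
    """Run Lloyd's algorithm on 1-D points and return the within-cluster
    sum of squared errors, the quantity plotted in the elbow method."""
    data = sorted(points)
    n = len(data)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            clusters[nearest].append(x)
        # Keep the old center when a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sum(min((x - c) ** 2 for c in centers) for x in data)


# Toy data with three obvious groups around 1, 5, and 9.
points = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2, 8.8, 9.0, 9.2]
sse_by_k = {k: kmeans_sse(points, k) for k in range(1, 6)}
```

Plotting `sse_by_k` against k, the curve drops sharply up to k = 3 and flattens afterwards; that bend is the "elbow" used to choose k.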
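ETL and training batch system
• Why we chose Kubernetes (kops) as the platform
With Step Functions / AWS Batch, jobs and job flows cannot be managed together.
The only scheduler is cronjobs, so management stays simple and changes are easy to make from the command line.
For online learning, the batch side and the API needed to work together.
In the future it can be managed with EKS (Tokyo region).
Step Functions and AWS Batch can still be used in part.
With SageMaker, training can be split out into containers, so the batch system can be designed flexibly.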
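Impressions after using SageMaker for 5 months
• Analysis (notebook instances)
Lifecycle configuration is convenient: environments can be set up with little effort.
When processing starts to feel heavy, the instance type can be changed afterwards.
• Training and inference (algorithms and containers)
Built-in algorithms and deep learning frameworks such as Tensorflow / Chainer are well covered.
Training is split out into its own container, so there is no need to worry about the resources of running jobs.
Notebooks can be used by several people.
Models can be deployed as endpoints easily, with auto scaling.
Hyperparameter tuning jobs can set the best hyperparameters automatically.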
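Future outlook
• Recipe recommendations that take into account how easily ingredients are left over
1. Identify the ingredients that tend to be left over among the recipes viewed in the past
2. Recommend recipes that use up those ingredients efficiently
• Personalized recipe recommendations
1. Recommend recipes that better match the user's tastes and lifestyle, such as "spicy / sweet" or "rich / light"
2. Recommend recipes that can be made by adding a little to leftover ingredients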
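dely is hiring machine learning engineers!
• The recipes made by kurashiru chefs really are delicious.
"They only look delicious, don't they?" No, not at all. If you would like to do some machine learning while you are tasting them, we look forward to hearing from you!
• You can experience everything related to machine learning.
At the moment I do everything alone: data analysis, service delivery, selection of learning algorithms, and building and operating the platform. You get a condensed version of what several people would handle in a somewhat larger organization!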