Using SageMaker at kurashiru
RytaroTsuji
October 15, 2018
aws loft ML night 2018/10/9
Transcript
Amazon SageMaker use cases
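Company and service overview
• dely, Inc.
• Founded April 2014
• 70 full-time employees, 130 staff in total
• kurashiru
• February 2016: service launched
• May 2016: app released
• April 2017: nationwide TV commercials began
• December 2017: passed 10 million cumulative downloads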
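About me
• Rytaro Tsuji (@kametaro) github/twitter
• dely, Inc.
• Engineer in the development department, in charge of machine learning
• Hobbies: (currently studying elliptic curves and automorphic forms)
• Background
• Until last year I worked mainly as an app and server-side engineer. As a machine learning engineer I am still very much a beginner.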
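The problem we had with recipe recommendations
[Figure: a single ranking, 1st to 6th]
Every user was shown the same set of recipes.
We could not recommend recipes that matched each person's tastes.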
The ideal recipe recommendation
Recommendations personalized to each person's tastes
[Figure: three separate rankings, each 1st to 3rd]
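We decided to adopt Amazon SageMaker
• Machine learning is essential to deliver the ideal recipe recommendation
• There was only one machine learning engineer, and we wanted to release as quickly as possible
• SageMaker is a fully managed machine learning service
• It covers model building, training, and deployment end to end
• We reached the production environment 1.5 months after starting development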
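Implementation 1: Clustering
User features
• Number of favorites / number of searches
• View count / viewing time
• Logged in or not
• App launches on weekdays / holidays
• App launches in the morning, at noon, and in the evening
etc…
Recipe features
• Category; Japanese, Western, or Chinese cuisine
• Seasonal ingredients
• Cooking time, number of ingredients
• Calories, salt content
• Spicy / sweet
etc…
Extract feature values for users and recipes, then cluster them.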
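The clustering step can be sketched roughly as follows. This is a minimal, dependency-free k-means (Lloyd's algorithm) over hypothetical, already-scaled user feature vectors. The feature columns and values are assumptions for illustration, and the talk itself runs clustering with SageMaker's built-in k-means rather than code like this.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical scaled features per user:
# [favorites, searches, weekday launches, holiday launches]
users = [
    [0.9, 0.8, 0.9, 0.1],
    [0.8, 0.9, 0.8, 0.2],
    [0.1, 0.2, 0.1, 0.9],
    [0.2, 0.1, 0.2, 0.8],
]
centroids, labels = kmeans(users, k=2)
print(labels)
```

The first two users land in one cluster and the last two in the other; which cluster gets which id depends on the random initialization.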
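Implementation 2: Collaborative filtering
Collaborative filtering
1. The tastes of people similar to me should be similar to mine!
2. A recipe liked by people similar to me should appeal to me even if I have not seen it yet!
For each user cluster, obtain ratings by inference for the recipes the cluster is likely to like, and store them in a content pool.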
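The two assumptions above can be illustrated with a small neighborhood-based sketch. Note that this is a swapped-in technique for illustration only: later slides indicate the actual recommendations use SageMaker's Factorization Machines. The cluster names, recipes, and ratings below are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two rating dicts (recipe -> rating)."""
    shared = set(u) & set(v)
    dot = sum(u[r] * v[r] for r in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def predict_ratings(target, ratings):
    """Score recipes the target has not rated, weighted by neighbor similarity."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], other_ratings)
        if sim <= 0:
            continue  # ignore neighbors with nothing in common
        for recipe, rating in other_ratings.items():
            if recipe in ratings[target]:
                continue  # only predict for unseen recipes
            scores[recipe] = scores.get(recipe, 0.0) + sim * rating
            weights[recipe] = weights.get(recipe, 0.0) + sim
    return {r: scores[r] / weights[r] for r in scores}

# Hypothetical ratings per user cluster (cluster -> recipe -> rating).
ratings = {
    "cluster_a": {"curry": 5, "ramen": 4},
    "cluster_b": {"curry": 5, "ramen": 4, "gyoza": 5},
    "cluster_c": {"salad": 5, "gyoza": 1},
}
# The content pool for a cluster: unseen recipes, best predicted rating first.
content_pool = sorted(
    predict_ratings("cluster_a", ratings).items(),
    key=lambda item: item[1],
    reverse=True,
)
print(content_pool)
```

Here cluster_a only inherits a prediction from cluster_b, the one neighbor whose ratings overlap with its own.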
Implementation 3: Optimizing the content pool
Content wears out as time passes and as it is viewed repeatedly.
↓
Swap in unviewed recipes from the same cluster to refresh the content pool.
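A minimal sketch of the refresh step, assuming the pool is an ordered list of recipe names and that "worn out" simply means "already viewed". The function name and data are hypothetical, and wear caused by elapsed time is not modeled here.

```python
def refresh_pool(pool, watched, cluster_recipes):
    """Replace already-viewed recipes with unviewed ones from the same cluster."""
    fresh = [r for r in cluster_recipes if r not in watched and r not in pool]
    refreshed = []
    for recipe in pool:
        if recipe in watched and fresh:
            refreshed.append(fresh.pop(0))  # swap in the next unviewed recipe
        else:
            refreshed.append(recipe)  # keep the slot (also when nothing is left to swap in)
    return refreshed

pool = ["curry", "ramen", "gyoza"]
watched = {"curry", "gyoza"}
cluster_recipes = ["curry", "ramen", "gyoza", "udon", "tempura"]
print(refresh_pool(pool, watched, cluster_recipes))  # ['udon', 'ramen', 'tempura']
```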
Data flow up to the recipe recommendation
SageMaker
1. Easy to plug in, easy to rearrange
• Because training can be split out into its own container, job flows can be extended or parallelized as the situation requires
[Architecture diagram. Labels: log collection platform; data ETL (Amazon Athena, AWS Glue); Machine Learning (kops cronjobs running extract, transform, train, predict, load; Amazon SageMaker train job container and predict endpoint container, each with an instance type and instance count); Service (DynamoDB recommendation, RDB recommendation, application endpoint); environments: development (Container vm(minicube)), staging, production; regions ap-northeast-1 and us-east-1; CRR.]
SageMaker
1. Flexible batch system
• The load of a training job can be offloaded to separate instances
• Jobs can run asynchronously
2. Turn models into endpoints freely
• A persistent API returns inference results
• Auto scaling is available
How we use Amazon SageMaker
• Analysis (notebook instances)
• Training and inference (algorithms and containers)
Mainly covering the points where we stumbled with each.
Analysis ① Notebook instances
‣ A Jupyter Notebook instance can be launched easily.
‣ The instance size can be changed after creation.
Analysis ② Lifecycle configuration
Install the required libraries and finish similar setup right after the notebook instance starts.
Lifecycle configurations ex)
    #!/bin/bash
    set -e
    # Compilers needed to build some of the packages below
    sudo yum install -y gcc72 gcc72-c++
    # Make conda available in the shell, then switch to the python3 environment
    echo ". /home/ec2-user/anaconda3/etc/profile.d/conda.sh" >> ~/.bashrc
    source ~/.bashrc
    conda activate python3
    pip install --upgrade pip
    pip install sshtunnel --no-warn-conflicts
    pip install pymysql --no-warn-conflicts
    pip install gensim --no-warn-conflicts
    pip install msgpack --no-warn-conflicts
    pip install janome --no-warn-conflicts
    pip install jupyter-emacskeys --no-warn-conflicts
    pip install fasttext --no-warn-conflicts
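Analysis ③
• Where we stumbled with notebooks
pip install in the lifecycle configuration is unstable.
‣ This happens when the sagemaker Python package and pip start at the same time and collide.
✓ pip install numpy --no-warn-conflicts # add this option
If a notebook fails to start, it can no longer be started from the console screen.
‣ The lifecycle configuration fails
‣ A large uploaded file fills up the instance's disk
‣ You end up unable to perform any operation [screenshot]
✓ Start it from awscli: aws sagemaker start-notebook-instance --notebook-instance-name my_note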
Training and inference ①
• Built-in algorithms
k-means, PCA, LDA, Factorization Machines, Linear Learner, Neural Topic Model, Random Cut Forest, Seq2Seq Modeling, XGBoost, Object Detection, Image Classification, DeepAR Forecasting, BlazingText, k-nearest-neighbor (k-NN)
‣ Factorization Machines => recommendation
‣ XGBoost => multi-class classification
‣ Image Classification => thumbnail image classification
‣ k-means => clustering
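Training and inference ②
• Where we stumbled with Factorization Machines
Problem: the training dataset is too large to handle with numpy.
Fix: build a sparse matrix with scipy.sparse.lil_matrix, filling the large sparse matrix with 1s.
Problem and fix: the volume of inference is large. numpy: 1 row -> csr: 10000 rows (16 hours -> 20 [unit lost in the transcript]).
Compress into a Compressed Sparse Row matrix; a csr matrix can be specified as input.
Note: currently being changed to a Batch transform job.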
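The sparse-matrix fix can be sketched as below, assuming the usual Factorization Machines encoding where each row one-hot encodes a (user, recipe) pair. The column layout, function name, and sample pairs are assumptions for illustration; only `scipy.sparse.lil_matrix` and the CSR conversion come from the slides.

```python
import numpy as np
from scipy.sparse import lil_matrix

def one_hot_rows(pairs, n_users, n_recipes):
    """One row per (user, recipe) pair with two 1s: the user column and the recipe column."""
    # lil_matrix is cheap to fill incrementally, unlike a dense numpy array of the same shape.
    X = lil_matrix((len(pairs), n_users + n_recipes), dtype=np.float32)
    for row, (user, recipe) in enumerate(pairs):
        X[row, user] = 1.0
        X[row, n_users + recipe] = 1.0
    # Convert to Compressed Sparse Row once filling is done; CSR is compact and fast to slice by row.
    return X.tocsr()

pairs = [(0, 2), (1, 0), (2, 1)]  # hypothetical (user index, recipe index) interactions
X = one_hot_rows(pairs, n_users=3, n_recipes=3)
print(X.shape, X.nnz)
```

Only the non-zero entries are stored, so memory grows with the number of interactions rather than with users times recipes.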
Training and inference ③
• Where we stumbled with XGBoost
Problem: how do you actually use a hyperparameter tuning job?
Fix: pass a ranges parameter as an argument to the hyperparameter tuning job.
Fix: run the hyperparameter tuning job.
Fix: check the hyperparameter tuning job in the console (validation:auc).
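The code on these slides is not in the transcript, so the following is only a sketch of what the ranges parameter might look like. It builds a dictionary in the shape that, to my understanding, boto3's `create_hyper_parameter_tuning_job` accepts as `HyperParameterTuningJobConfig`. The chosen hyperparameters (`max_depth`, `eta`), their bounds, and the job limits are assumptions; only the `validation:auc` metric comes from the slide. The author may well have used the SageMaker Python SDK instead.

```python
def tuning_job_config(max_jobs=20, max_parallel=2):
    """Build a tuning config dict; all values here are illustrative assumptions."""
    return {
        "Strategy": "Bayesian",
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": "validation:auc",  # metric named on the slide
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel,
        },
        # The "ranges" part: the search space for each tuned hyperparameter.
        # Bounds are passed as strings in this API shape.
        "ParameterRanges": {
            "IntegerParameterRanges": [
                {"Name": "max_depth", "MinValue": "3", "MaxValue": "10"},
            ],
            "ContinuousParameterRanges": [
                {"Name": "eta", "MinValue": "0.1", "MaxValue": "0.5"},
            ],
            "CategoricalParameterRanges": [],
        },
    }

config = tuning_job_config()
print(config["ParameterRanges"])
```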
Training and inference ④
• Where we stumbled with Image Classification
Problem: how do you prepare the training dataset? You specify MXNet rec files.
Fix: create MXNet lst files and rec files.
    import os

    MXNET_HOME = '~/incubator-mxnet/'
    RESOURCE_DIR = '~/thumbnails/'
    # Generate the lst files, splitting the images 80/20 into train and test
    os.system('python {0}/tools/im2rec.py --list --recursive --train-ratio 0.8 --test-ratio 0.2 {1}/im2rec/target {1}'.format(MXNET_HOME, RESOURCE_DIR))
    # Pack the listed images into rec files, resized to 480 px at JPEG quality 95
    os.system('python {0}/tools/im2rec.py --resize 480 --quality 95 --num-thread 64 {1}/im2rec/train {1}'.format(MXNET_HOME, RESOURCE_DIR))
    os.system('python {0}/tools/im2rec.py --resize 480 --quality 95 --num-thread 64 {1}/im2rec/test {1}'.format(MXNET_HOME, RESOURCE_DIR))
1. https://github.com/apache/incubator-mxnet.git
2. Download the thumbnail images to train on to your PC
3. Upload the generated rec files to the designated location in S3
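Training and inference ⑤
• Where we stumbled with k-means
Problem and fix: how do you find the optimal number of clusters k? Hyperparameter tuning jobs cannot do this at the moment, so we work it out by hand with the following methods.
Elbow method
Silhouette analysis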
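As an illustration of silhouette analysis, here is a small pure-Python sketch of the mean silhouette coefficient for a given labeling. The sample points and labelings are hypothetical, and the function assumes at least two clusters. In practice you would run k-means for each candidate k and compare the scores.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient; values near 1 mean tight, well-separated clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != l
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = [0, 0, 1, 1]  # groups the nearby points together
bad = [0, 1, 0, 1]   # splits each natural group in two
print(silhouette_score(points, good), silhouette_score(points, bad))
```

The labeling that matches the natural grouping scores close to 1, while the mismatched one scores below 0, which is the signal used to choose between candidate values of k.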
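ETL and training batch system
• Why we chose Kubernetes (kops) as the platform
With Step Functions / AWS Batch, jobs and job flows cannot be managed together.
The scheduler is just cronjobs, so it is simple to manage and easy to change from the command line.
Online learning required Batch and the API to work together.
In the future it can also be managed on EKS (Tokyo region).
Step Functions and AWS Batch can still be used in part.
Because SageMaker lets training be split out into containers, the batch system can be designed flexibly.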
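Impressions after using SageMaker for 5 months
• Analysis (notebook instances)
Lifecycle configuration is convenient / environments are easy to set up
When processing starts to feel heavy, the instance type can be changed afterwards
• Training and inference (algorithms and containers)
Rich set of built-in algorithms and deep learning frameworks such as Tensorflow / Chainer
Training runs in its own container, so you do not have to worry about the resources of running jobs
Notebooks can be shared by several people
Models can be deployed as endpoints easily, with auto scaling
Hyperparameter tuning jobs can set the best hyperparameters automatically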
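Future outlook
• Recipe recommendations that account for how easily ingredients are left over
1. Identify ingredients that tend to be left over among recipes the user viewed in the past
2. Recommend recipes that use up those ingredients efficiently
• Personalized recipe recommendations
1. Recommend recipes that better fit the user's tastes and lifestyle, such as "spicy / sweet" or "rich / light"
2. Recommend recipes that can be made by adding a little to leftover ingredients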
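dely is hiring machine learning engineers!
• Recipes made by the kurashiru chefs really are delicious.
"Surely they only look delicious?" No, not at all. If you would like to taste them and do machine learning while you are at it, we look forward to hearing from you!
• You can experience everything related to machine learning.
For now, one person handles everything: data analysis, service delivery, choosing training algorithms, and building and operating the infrastructure. You get a condensed version of the work that several people would share in a slightly larger organization!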