Machine Learning Initiatives at AWS / AWS ML

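© 2019, Amazon Web Services, Inc. or its Affiliates.
Machine Learning Initiatives at AWS: Machine Learning Infrastructure in the Cloud and Deep Learning R&D
Yoshitaka Haribara (@_hariby), Solutions Architect, Amazon Web Services Japan K.K.
2019-11-07 15:20-16:10, Research Institute for Information Technology, Kyushu University (Ito Campus)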

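Self-introduction
Yoshitaka Haribara
• From Minoh, Osaka
• Ph.D. (Information Science and Technology)
Background
• 2007-2013: Department of Mathematics, School of Science, Osaka University
  • Light music club (drums)
• 2013-2015: Information and Communication Engineering, Graduate School of Information Science and Technology, The University of Tokyo (master's)
• 2015-2018: Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo (Ph.D.)
• 2018-: AWS Japan, Startup Solutions Architect
  • Covers startups and machine learning
  • Favorite service: Amazon SageMaker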

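Research (graduate school)
Doctoral thesis: "Realization and evaluation of a measurement-feedback coherent Ising machine for combinatorial optimization problems"
• Research on using a quantum-optical system to quickly solve combinatorial optimization problems, with graph maximum cut as a representative example
• Approximate solution methods based on physical systems
• Random number generation
• Analog memory
• Matrix multiplication
Haribara, Yoshitaka, et al. "Performance evaluation of coherent Ising machines against classical neural networks." Quantum Science and Technology 2.4 (2017): 044002.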

Research and the cloud (graduate school)
Build and publish research results on the cloud so that users and companies can try them!
Source: https://www.jst.go.jp/pr/announce/20171120/index.html

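Challenges in (and before getting started with) machine learning
• High-performance processors, especially GPUs, are required
  • NVIDIA Tesla V100 GPU
• Buying GPUs is expensive, and estimating the required resources is hard
  • Resources need to be procured flexibly: when needed, and only as much as needed
• Development happens on a local GPU server, but long training jobs are better sent to a remote environment (such as the cloud)
  • A lab server eventually runs out of capacity
  • In the cloud, it is easy to forget to shut down instances
• Setting up the environment is tedious
  • GPU driver, CUDA, cuDNN, frameworks (TensorFlow, PyTorch, MXNet, Chainer, …), Python libraries (NumPy, Pandas, OpenCV, …)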

Our mission at AWS
Put machine learning in the hands of every developer

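AWS's approach to machine learning
• Customer-focused: more than 90% of the ML roadmap is driven by customer feedback
• Multi-framework: support for the major frameworks
• Pace of innovation: more than 200 ML-related launches and major feature additions last year
• Breadth and depth: a broad range of AI/ML services deployed in production
• Security and analytics: rich security and encryption features and a robust analytics platform
• Embedded R&D: a customer-centric approach to achieving the state of the art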

Customers running machine learning workloads on AWS (a small selection)

The AWS machine learning stack, to suit each use case
• AI services
  • Vision, Speech, Language: Amazon Rekognition Image / Video, Amazon Polly, Amazon Transcribe, Amazon Translate, Amazon Comprehend, Amazon Textract
  • Chatbots: Amazon Lex
  • Forecasting: Amazon Forecast
  • Recommendations: Amazon Personalize
  • AutoML
• ML services
  • Amazon SageMaker
• Deep learning frameworks & infrastructure
  • Frameworks, Interfaces
  • Infrastructure: Amazon EC2 C5 Instances, Amazon EC2 P3/G4 Instances, FPGAs, Elastic Inference

The AWS machine learning stack: AI services (pre-trained APIs / AutoML)
• Pre-trained models for computer vision (object detection, face recognition, and inappropriate-content detection in images and video), speech (text-to-speech and transcription), natural language processing (translation and document understanding), and chatbots can be called easily through APIs
• AutoML: upload your own data, and the service performs model selection, training, and parameter tuning, then deploys the model to an endpoint

The AWS machine learning stack: ML services (ML platform)
Requirements across the workflow:
• Data collection
• Cleanup
• Data transformation and labeling
• Training
• Model evaluation
• Deployment to production
• Inference and monitoring

The AWS machine learning stack: ML services (ML platform)
Amazon SageMaker
• A platform service that helps build ML workflows
• Features specialized for machine learning workloads
• Easier to operate than plain virtual machines
• Can be combined with other AWS services to deliver the value obtained from research to users

The AWS machine learning stack: infrastructure
• Deep learning frameworks
  • Broad support for TensorFlow, PyTorch, and Chainer, as well as MXNet/Gluon
• CPU/GPU/FPGA/ASIC (example instance types)
  • (C5/C5d) Intel Xeon Scalable Processors
  • (P3/G4) NVIDIA Tesla V100/T4
  • (F1) Xilinx UltraScale+
• Storage and networking
  • Lustre, Ethernet at up to 100 Gbps

Amazon SageMaker

Amazon SageMaker
Build, train, and deploy machine learning and deep learning models at any scale

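Amazon SageMaker
1. Build
  • Collecting and preparing training data
  • Selecting and optimizing ML algorithms
2. Train
  • Setting up and operating the training environment
  • Training and parameter tuning
3. Deploy
  • Deploying to production
  • Operating and scaling in production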

Amazon SageMaker
• Amazon SageMaker Ground Truth
• AWS Marketplace for Machine Learning
• Algorithms:
  • k-means clustering
  • Factorization Machines (recommendation)
  • DeepAR (time-series forecasting)
  • BlazingText (Word2Vec)
  • XGBoost
  • Image classification and object detection
  • Seq2Seq
  • LDA / Neural Topic Modelling (topic models)
  • Principal component analysis
  • Linear learner (regression / classification)

Amazon SageMaker: build, train, and deploy machine learning and deep learning models at any scale
Related services and features:
• Amazon SageMaker Ground Truth
• AWS Marketplace for Machine Learning
• Amazon EC2 P3 Instances
• Managed Spot Training
• Amazon SageMaker Neo
• Amazon Elastic Inference

How to use Amazon SageMaker

Jupyter Notebook/Lab is readily available as a development environment
• Just choose an instance type and launch it
• Frameworks come pre-installed
• Scripts can be run when a notebook instance is created or started
  • https://github.com/aws-samples/amazon-sagemaker-notebook-instance-lifecycle-config-samples

Amazon SageMaker
• Development: Jupyter Notebook/Lab, Amazon S3
The Jupyter Trademark is registered with the U.S. Patent & Trademark Office.

Amazon SageMaker
• Development: Jupyter Notebook/Lab, Amazon S3
• Training: Amazon EC2 P3 Instances, Amazon ECR
Pre-built container images are provided in advance.

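Amazon SageMaker
• Development: Jupyter Notebook/Lab, Amazon S3
• Training: Amazon EC2 P3 Instances
Benefits for training:
• Training instances are launched through the API and stop automatically when training completes
• High-performance instances are billed per second, which makes it easy to cut costs
• The specified number of instances is launched simultaneously, so distributed training is also easy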

Amazon SageMaker
• Development: Jupyter Notebook/Lab, Amazon S3
• Training: Amazon EC2 P3 Instances, Amazon ECR
• Inference: Endpoint / Batch transform

Unifying environments with containers
• A Docker image bundles CUDA, cuDNN, the deep learning framework, and the training script (train.py)
• Everything needed to run the script is described in code and gathered in one place

SageMaker Deep Dive

SageMaker Ground Truth

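SageMaker Ground Truth
Key points: quick and efficient, easy, highly accurate
• Supports common annotation workflows
• Five built-in labeling tools and integration with workers (crowdsourcing)
• High-accuracy labeling through label consolidation
• Active learning and automated labeling reduce costs by up to 70%
• Can also be invoked from AWS Step Functions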

Five built-in tools plus custom jobs, and three types of workers

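Quality control challenges in crowdsourcing
• What to do when multiple workers return different labels
  • Example: four workers answer bulldog, sharpei, bulldog, bulldog. Which label is correct?
• The results should be consolidated more "intelligently" than by majority vote
• Statistical quality control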

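Quality control challenges in crowdsourcing
• What to do when multiple workers return different labels
• Because each worker answers multiple tasks, their ability can be estimated and used to infer the correct label
  • Worker answers: bulldog, sharpei, bulldog, bulldog
  • Estimated worker accuracies: 0.7, 0.9, 0.5, 0.3
  • Estimated label probabilities: Bulldog 0.1, Sharpei 0.9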

Quality control challenges in crowdsourcing
• What to do when multiple workers return different labels
  • Worker answers x_1, ..., x_4: bulldog, sharpei, bulldog, bulldog
  • Worker accuracies: 0.7, 0.9, 0.5, 0.3
• Likelihood of the answers if the true label is Bulldog (B) or Sharpei (S):

\[ P(x_1, x_2, x_3, x_4 \mid B) = \prod_i P(x_i \mid B) = 0.7 \times 0.1 \times 0.5 \times 0.3 \approx 0.01 \]
\[ P(x_1, x_2, x_3, x_4 \mid S) = \prod_i P(x_i \mid S) = 0.3 \times 0.9 \times 0.5 \times 0.7 \approx 0.1 \]
\[ \frac{P(S \mid x_1, \ldots, x_4)}{P(B \mid x_1, \ldots, x_4)} = \frac{P(x_1, \ldots, x_4 \mid S)}{P(x_1, \ldots, x_4 \mid B)} \approx 10 \]

(The posterior ratio equals the likelihood ratio when both labels have the same prior probability.)
• Result: Bulldog 0.1, Sharpei 0.9
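The calculation above can be reproduced with a short Python sketch. It assumes that each worker answers correctly with their estimated accuracy and otherwise picks the other label, that workers answer independently, and that both labels are equally likely a priori. The function name `label_likelihood` is made up for illustration and is not part of Ground Truth.

```python
from math import prod

# Worker answers and estimated accuracies taken from the slide.
answers = ["bulldog", "sharpei", "bulldog", "bulldog"]
accuracies = [0.7, 0.9, 0.5, 0.3]


def label_likelihood(true_label, answers, accuracies):
    """P(answers | true_label), assuming independent workers and two labels."""
    return prod(
        acc if answer == true_label else 1.0 - acc
        for answer, acc in zip(answers, accuracies)
    )


p_b = label_likelihood("bulldog", answers, accuracies)
p_s = label_likelihood("sharpei", answers, accuracies)
ratio = p_s / p_b
# With equal priors, normalising the likelihoods gives the posterior.
posterior_s = p_s / (p_s + p_b)
print(p_b, p_s, ratio, posterior_s)
```

Without rounding, the likelihoods are about 0.0105 and 0.0945, so the ratio is about 9 and the posterior for Sharpei is about 0.9. A plain majority vote would have chosen Bulldog (three votes out of four).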

Active learning and automated data labeling
• Input dataset
• Active learning: train a model on the data labeled by humans
• Automated annotation
• Data with low confidence is annotated by humans
• Output: a high-accuracy training dataset
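The routing step in this loop can be sketched as follows. This is a minimal illustration, not Ground Truth's actual criterion: the 0.9 threshold, the function name `route_predictions`, and the item ids are all assumptions.

```python
def route_predictions(predictions, threshold=0.9):
    """Split model predictions into auto-labeled items and items for human review.

    `predictions` maps an item id to a (label, confidence) pair.
    The threshold is a hypothetical value chosen for this sketch.
    """
    auto_labeled, needs_human = {}, []
    for item_id, (label, confidence) in predictions.items():
        if confidence >= threshold:
            auto_labeled[item_id] = label      # trust the model's label
        else:
            needs_human.append(item_id)        # send back to human annotators
    return auto_labeled, needs_human


predictions = {
    "img-001": ("bulldog", 0.97),
    "img-002": ("sharpei", 0.55),
    "img-003": ("bulldog", 0.91),
}
auto_labeled, needs_human = route_predictions(predictions)
print(auto_labeled)
print(needs_human)
```

The human labels collected for the low-confidence items would then be added to the training set, and the model retrained for the next round.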

References
• Dawid, Alexander Philip, and Allan M. Skene. "Maximum likelihood estimation of observer error-rates using the EM algorithm." Journal of the Royal Statistical Society: Series C (Applied Statistics) 28.1 (1979): 20-28.
• Branson, Steve, Grant Van Horn, and Pietro Perona. "Lean crowdsourcing: Combining humans and machines in an online system." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
• Van Horn, Grant, et al. "Lean multiclass crowdsourcing." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
• Yang, Jie, et al. "Leveraging crowdsourcing data for deep active learning an application: Learning intents in Alexa." Proceedings of the 2018 World Wide Web Conference. International World Wide Web Conferences Steering Committee, 2018.
• AWS re:Invent 2018: Amazon SageMaker Ground Truth: Quality & Accurate Datasets (AIM369)
• Baba, Yukino. "Research trends in statistical quality control methods for crowdsourcing" (in Japanese). (2014).

BlazingText
• Word2Vec: distributed representations of words (embeddings in a vector space)
  • Continuous Bag of Words (CBOW)
  • Skip-gram
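The difference between the two Word2Vec variants is easiest to see in the training examples they build from a sentence. The sketch below only constructs those examples (no training); the function names and the window size are chosen for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """(center, context) pairs: skip-gram predicts each context word from the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


def cbow_examples(tokens, window=2):
    """(context words, center) examples: CBOW predicts the center word from its context."""
    examples = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, center))
    return examples


tokens = ["machine", "learning", "on", "aws"]
sg = skipgram_pairs(tokens, window=1)
cb = cbow_examples(tokens, window=1)
print(sg)
print(cb)
```

Skip-gram produces one example per (center, context) pair, while CBOW produces one example per position with the whole context as input.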

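BlazingText: distributed implementation of Word2Vec (CPU/GPU)
The fundamental difficulty is that stochastic gradient descent (SGD) is sequential.
• HogWild parallel SGD (parallelization of SGD in general, not limited to Word2Vec)
  • Each thread trains on a different set of words, and conflicts are ignored
  • In theory convergence should degrade, but there is no problem as long as multiple threads do not update the same word. In practice, with a large vocabulary, conflicts are rare and convergence is not impaired
• HogBatch parallel SGD
  • Proposes sharing negative samples across a minibatch ("negative sample sharing")
  • Formulates the problem as a Level 3 BLAS matrix product (SGEMM; Single-precision GEneral Matrix-Matrix operation)
• BlazingText (the method proposed in the paper; the SageMaker built-in algorithm also provides the methods above)
  • The GPU implementation is accelerated with CUDA, without a deep learning framework
  • Each sentence is assigned to a thread block, and within it the embedding dimensions are assigned to individual threads for parallelization

BlazingText modes and methods:
Environment | CBOW | Skip-gram | Batch_skipgram
Single instance (CPU) | HogWild | HogWild | HogBatch
Distributed (CPU) | - | - | HogBatch
Single instance (≥ 1 GPU) | Proposed | Proposed | -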
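To illustrate why negative sample sharing leads to a matrix product, the sketch below scores a small minibatch against one shared set of output vectors. It is a pure-Python illustration under assumed sizes, not the HogBatch or BlazingText implementation; a real implementation would hand the product to a BLAS SGEMM routine.

```python
import random

random.seed(0)
dim, batch_size, num_negatives = 4, 3, 2  # hypothetical sizes


def random_vector(n):
    return [random.uniform(-0.5, 0.5) for _ in range(n)]


# Input vectors of the words in one minibatch (batch_size x dim).
inputs = [random_vector(dim) for _ in range(batch_size)]
# Output vectors of the target word plus the shared negative samples
# ((1 + num_negatives) x dim); every input in the batch reuses these rows.
outputs = [random_vector(dim) for _ in range(1 + num_negatives)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matmul_abt(a, b):
    """Return A @ B^T for row-major lists of lists."""
    return [[dot(row_a, row_b) for row_b in b] for row_a in a]


# One matrix product yields all batch_size x (1 + num_negatives) scores.
scores = matmul_abt(inputs, outputs)

# The same scores computed one (input, output) pair at a time.
pairwise = [
    [dot(inputs[i], outputs[j]) for j in range(1 + num_negatives)]
    for i in range(batch_size)
]
print(scores == pairwise)
```

Because the negative samples are shared, the many vector-vector products of plain SGD collapse into a single matrix-matrix product, which is the kind of operation Level 3 BLAS is optimized for.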

BlazingText: results
• Scalable implementation
• High accuracy achieved cost-effectively
[Figure: throughput of BlazingText (GPU, CPU) versus FastText (baseline)]
[Figure: accuracy versus cost ($), including BlazingText on p3.2xl]

HRNN: method
• Metadata on user behavior is fed in as a sequence and used for recommendation
  • For example, past video viewing history
Ma, Yifei, and Murali Balakrishnan Narayanaswamy. "Hierarchical Temporal-Contextual Recommenders." (2018).

HRNN: results
• On the MovieLens rating prediction task, HRNN achieved lower error than Factorization Machines and an RNN (without session information)
Ma, Yifei, and Murali Balakrishnan Narayanaswamy. "Hierarchical Temporal-Contextual Recommenders." (2018).

DeepAR: method
• Autoregressive RNN
Salinas, David, et al. "DeepAR: Probabilistic forecasting with autoregressive recurrent networks." International Journal of Forecasting (2019).

DeepAR: results
• Forecasts on the ec dataset (blue), shown with the 80% confidence interval
Salinas, David, et al. "DeepAR: Probabilistic forecasting with autoregressive recurrent networks." International Journal of Forecasting (2019).
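An interval like the one in the plot can be obtained from an autoregressive model by sampling many future trajectories and taking quantiles at each step. The sketch below uses a simple AR(1) process as a stand-in for the RNN; the coefficients, horizon, and sample count are assumptions for illustration only.

```python
import random
import statistics

random.seed(0)


def sample_path(last_value, horizon, phi=0.8, sigma=1.0):
    """Draw one future trajectory, feeding each sampled value back as the next input."""
    path, value = [], last_value
    for _ in range(horizon):
        value = phi * value + random.gauss(0.0, sigma)
        path.append(value)
    return path


horizon, num_samples = 5, 1000
paths = [sample_path(last_value=10.0, horizon=horizon) for _ in range(num_samples)]

# For each future step, the 10% and 90% quantiles across the sampled
# paths bound the central 80% interval.
intervals = []
for t in range(horizon):
    deciles = statistics.quantiles([p[t] for p in paths], n=10)
    intervals.append((deciles[0], deciles[-1]))

for t, (low, high) in enumerate(intervals, start=1):
    print(f"step {t}: [{low:.2f}, {high:.2f}]")
```

The interval widens with the forecast horizon because each step adds new noise on top of the uncertainty already accumulated.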

Apache MXNet
• A fast, scalable framework for training and inference of deep neural networks
• For every kind of industry use case: CV, NLP, time-series forecasting, recommendation, RL, and more
• Used in many Amazon/AWS production services since 2016
  • Alexa, Amazon Go, Amazon Retail Warehouse Systems, Amazon Music, Amazon Rekognition, Amazon Comprehend
• Textbook: https://github.com/d2l-ai/d2l-ja
Use cases: Computer Vision (CV), Recommendation Engine, Forecasting, Sentiment Analysis

Apache MXNet: main benefits
• Flexible
• Debuggable
• Scalable
• 8 frontend languages
• Optimized binaries
• Portable
• Keras

BlazingText
• Gupta, Saurabh, and Vineet Khare. "Blazingtext: Scaling and accelerating word2vec using multiple gpus." Proceedings of the Machine Learning on HPC Environments. ACM, 2017.
• Amazon SageMaker BlazingText: Parallelizing Word2Vec on Multiple CPUs or GPUs
HRNN
• Ma, Yifei, and Murali Balakrishnan Narayanaswamy. "Hierarchical Temporal-Contextual Recommenders." (2018).
DeepAR
• Salinas, David, et al. "DeepAR: Probabilistic forecasting with autoregressive recurrent networks." International Journal of Forecasting (2019).

Reducing training costs with Managed Spot Training
• Up to 90% cost savings compared with On-Demand
• Interruptions can occur, so write intermediate state to checkpoints
• Specify the maximum time you are willing to wait
How to call it (supported in SageMaker Python SDK >= v1.37.2):

    import sagemaker

    # `Estimator` stands for the framework estimator class in use;
    # the slide does not show which one is imported.
    estimator = Estimator(
        "train.py",
        role=sagemaker.get_execution_role(),
        train_instance_count=1,
        train_instance_type="ml.p3.2xlarge",
        framework_version="1.4.0",
        train_use_spot_instances=True,
        train_max_wait=2 * 24 * 60 * 60,  # specify a time longer than train_max_run (default: 1 day)
        checkpoint_s3_uri="s3://mybucket/checkpoints",
        checkpoint_local_path="/opt/ml/checkpoints/",
    )
    estimator.fit("s3://mybucket/data/train")  # training is started with fit() as usual
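On the training-script side, handling interruptions means saving progress regularly and resuming from the last checkpoint on restart. The sketch below shows that pattern with a toy loop and a temporary directory; the file name, state layout, and `interrupt_after` argument are assumptions for illustration. In the example above, the directory would be the one given as `checkpoint_local_path`.

```python
import json
import os
import tempfile


def train(checkpoint_dir, total_epochs, interrupt_after=None):
    """Toy training loop that resumes from the latest checkpoint if one exists."""
    path = os.path.join(checkpoint_dir, "checkpoint.json")
    state = {"epoch": 0, "loss": 1.0}
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)  # resume instead of starting over

    epochs_this_run = 0
    while state["epoch"] < total_epochs:
        if interrupt_after is not None and epochs_this_run == interrupt_after:
            return state  # simulate a Spot interruption
        state["epoch"] += 1
        state["loss"] *= 0.5  # stand-in for a real training step
        epochs_this_run += 1
        with open(path, "w") as f:
            json.dump(state, f)  # persist progress after every epoch
    return state


with tempfile.TemporaryDirectory() as checkpoint_dir:
    first = train(checkpoint_dir, total_epochs=5, interrupt_after=2)
    second = train(checkpoint_dir, total_epochs=5)

print(first, second)
```

The second call picks up at epoch 2 rather than epoch 0, so only the remaining three epochs are run after the simulated interruption.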

P3/P3dn: p3.16xlarge / p3dn.24xlarge
• Network topology

Spec | p3.16xl | p3dn.24xl
Processor | Intel Xeon E5-2686 v4 | Intel Skylake 8175 (w/ AVX-512)
vCPUs | 64 | 96
GPU | 8x 16 GB NVIDIA Tesla V100 | 8x 32 GB NVIDIA Tesla V100
RAM | 488 GB | 768 GB
Network | 25 Gbps ENA | 100 Gbps ENA + EFA
GPU Interconnect | NVLink, 300 GB/s

Multi-node training with TensorFlow
Distributed training on ImageNet
• 1.2 million images
• 1,000 categories
• 256 GPUs
• 15 minutes
[Figure: throughput scaling for TF with CUDA 10 and TF with CUDA 9 against the ideal; 120k img/s and 14.6 min with 256 GPUs]
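The figures on this slide can be related to each other with some back-of-the-envelope arithmetic. The sketch assumes the 120k img/s throughput is sustained for the whole 14.6-minute run, which the slide does not state, so the implied number of passes over the data is only a rough estimate.

```python
images_per_epoch = 1_200_000   # ImageNet training images (from the slide)
throughput = 120_000           # images per second with 256 GPUs (from the slide)
training_minutes = 14.6        # total training time (from the slide)
num_gpus = 256

seconds_per_epoch = images_per_epoch / throughput
# Assumption: throughput stays at 120k img/s for the entire run.
images_processed = throughput * training_minutes * 60
implied_epochs = images_processed / images_per_epoch
per_gpu_throughput = throughput / num_gpus

print(f"{seconds_per_epoch:.1f} s per epoch")
print(f"about {implied_epochs:.1f} passes over the dataset")
print(f"about {per_gpu_throughput:.0f} img/s per GPU")
```

Under that assumption, one pass over ImageNet takes about 10 seconds, and the run corresponds to roughly 88 passes over the dataset.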

P3dn + EFA
• Scalable throughput
• Distributed efficiency benchmark with Fairseq (PyTorch seq-to-seq)

References
• Scalable multi-node training with TensorFlow
• Launching TensorFlow distributed training easily with Horovod or Parameter Servers in Amazon SageMaker
• Optimizing deep learning on P3 and P3dn with EFA

Bayesian optimization
• An optimization method for black-box functions
• Considered more efficient than grid search and random search, and widely used for hyperparameter optimization (HPO) in machine learning
Figure 1 from arXiv:1012.2599 [cs.LG]

Optuna
• An open-source HPO framework developed by PFN
• Define-by-Run interface (makes complex parameter spaces easy to describe)
• Offers multiple sampling methods, including the Tree-structured Parzen Estimator (TPE) approach
• A template for using it in SageMaker's distributed environment has been published
  • https://aws.amazon.com/jp/blogs/news/amazon-sagemaker-optuna-hpo/

References
• Brochu, Eric, Vlad M. Cora, and Nando De Freitas. "A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning." arXiv preprint arXiv:1012.2599 (2010).
• Bergstra, James S., et al. "Algorithms for hyper-parameter optimization." Advances in neural information processing systems. 2011.
• Snoek, Jasper, Hugo Larochelle, and Ryan P. Adams. "Practical bayesian optimization of machine learning algorithms." Advances in neural information processing systems. 2012.
• Akiba, Takuya, et al. "Optuna: A next-generation hyperparameter optimization framework." Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019.
• Implementing hyperparameter optimization with Optuna on Amazon SageMaker (AWS blog, in Japanese)

Amazon SageMaker Neo

Amazon SageMaker Neo
Compile trained models to run in a variety of environments
Key features
• The Neo-AI device runtime and compiler are open source (Apache License 2.0)
• The runtime is 1/10 the size of a DL framework
• https://github.com/neo-ai/

Compilation with TVM
• Parse Model: convert TensorFlow, MXNet, PyTorch, and XGBoost models into a common format
• Optimize Graph: recognize patterns in the ML model (NN) and optimize the graph structure to reduce execution cost
• Optimize Tensors: extract patterns from the shape of the input data and allocate memory efficiently
• Generate Code: generate machine code for the target device using a low-level compiler
Optimizations: pruning, operator fusion, nested loop tiling, vectorization / tensorization, data layout transform
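Of the optimizations listed, nested loop tiling is easy to show in isolation. The sketch below is not TVM code: it only demonstrates that reordering the loops of a matrix multiplication into small blocks leaves the result unchanged. In compiled code the benefit comes from each block fitting in cache; pure Python gains no speed from it. The tile size of 2 is an arbitrary choice.

```python
def matmul_naive(a, b):
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def matmul_tiled(a, b, tile=2):
    """Same result as matmul_naive, but iterating over tile x tile blocks."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # Finish the work on one block before moving to the next.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        for p in range(p0, min(p0 + tile, k)):
                            c[i][j] += a[i][p] * b[p][j]
    return c


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
print(matmul_tiled(a, b) == matmul_naive(a, b))
```

The `min(...)` bounds handle matrices whose size is not a multiple of the tile size, as in this 3x3 example.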

TVM
• Chen, Tianqi, et al. "TVM: An automated end-to-end optimizing compiler for deep learning." 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). 2018.
AutoTVM
• Using AutoTVM to Automatically Generate Deep Learning Libraries for Mobile Devices
• https://github.com/apache/incubator-tvm/wiki/Benchmark
• Introduction to AutoTVM
• Learning to Optimize Tensor Programs
Relay
• Roesch, Jared, et al. "Relay: a new IR for machine learning frameworks." Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages. ACM, 2018.

AWS Inferentia (ASIC)
• Scalable performance from 32 TOPS to 512 TOPS at INT8
• Targets the high-scale deployments common in machine learning inference, where costs really matter
• Near-linear scale-out, built on high-volume technologies like DRAM rather than HBM
• INT8 for best performance
  • Also supports mixed-precision FP16 and bfloat16 for compatibility
• Working with the wider Amazon/AWS machine learning service teams, such as
  • Amazon Go
  • Alexa
  • Rekognition
  • SageMaker
• Supports ONNX (https://onnx.ai/)
• Interfaces natively with MXNet, PyTorch, and TensorFlow

AWS Inferentia Machine Learning Processor

Bonus: Ising machines
Simulated Bifurcation Machine (SBM)
• Launch an EC2 P3 instance from AWS Marketplace and run computations through a REST API
• http://dfk66cqpwr4ko.cloudfront.net/user_manual_en.pdf

REQUEST
    $ curl -i -H "Content-Type: application/octet-stream" -X POST "http://123.45.67.89:8000/solver/maxcut?steps=10000&loops=10" --data-binary "@testdata/Gset/G22"

RESULT
    HTTP/1.1 100 Continue
    HTTP/1.1 200 OK
    Content-Type: application/json; charset=utf-8
    Content-Length: 4113
    ETag: W/"1011-Z42q2FWKAaAAm6ElkgQkv5LQpok"
    Date: Mon, 10 Jun 2019 02:28:45 GMT
    Connection: keep-alive

    {"id":"r2273333554","time":1.65,"wait":0,"runs":3200,"steps":10000,"message":"finished","value":13359,"result":[0,0,1,0,1,0,0,0,0,1,1,0,1,0,1,0,1,
    1,1,1,1,0,0,1,0,0,0,1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,0,1,0,0,1,1,0,1,0,0,0,
    <multiple lines omitted>
    1,0,0,1,1,1,0,0,1,0,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,0,0
    ,1,0,0,0,1,0,1,1,0,1,1,0,0,1,0,1,1,1,1]}

1-bit Simulated Annealing
• https://github.com/hariby/SA-complete-graph
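For reference, the Max-Cut problem solved by the request above asks for a split of the graph's nodes into two groups that maximizes the total weight of the edges between them. The sketch below computes the cut value of an assignment and solves a tiny instance by brute force. Reading `"value"` in the response as the cut value and `"result"` as the 0/1 group of each node is an assumption here, not taken from the SBM manual.

```python
from itertools import product


def cut_value(edges, assignment):
    """Total weight of edges whose endpoints are in different groups."""
    return sum(w for u, v, w in edges if assignment[u] != assignment[v])


def brute_force_maxcut(num_nodes, edges):
    """Exhaustive search over all 2^n assignments; only feasible for tiny graphs."""
    best = max(
        product([0, 1], repeat=num_nodes),
        key=lambda a: cut_value(edges, a),
    )
    return cut_value(edges, best), list(best)


# A 4-node cycle with unit weights, written as (node, node, weight).
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
value, assignment = brute_force_maxcut(4, edges)
print(value, assignment)
```

Exhaustive search doubles in cost with every added node, which is why heuristic solvers such as simulated annealing or SBM are used for graphs with thousands of nodes.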

Bonus: quantum computer simulation on AWS
Blueqat+Qgate
https://qiita.com/YuichiroMinato/items/b562d7b8e0975840a85f

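ML@Loft #8: Quantum Computing × Machine Learning (Wed., Nov. 20)
https://mlloft8.splashthat.com/
• Makoto Negoro (Specially Appointed Associate Professor, Institute for Open and Transdisciplinary Research Initiatives, Osaka University)
  • "Implementing quantum machine learning"
• Kenji Kubo (mercari R4D Researcher, Mercari, Inc.)
  • "Quantum machine learning algorithms"
• Keisuke Fujii (Professor, Department of Systems Innovation, Graduate School of Engineering Science, Osaka University)
  • "Current status and challenges of quantum computers"
• Yuichiro Minato (CEO, MDR Inc.)
  • "Quantum computers and startups around the world"
• Hayato Goto (Chief Research Scientist, Corporate Research & Development Center, Toshiba Corporation)
  • "Quantum-inspired classical algorithms"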

Research funding: AWS Machine Learning Research Awards (MLRA)
https://aws.amazon.com/blogs/machine-learning/aws-machine-learning-research-awards-call-for-proposal/
The following types of projects are eligible for MLRA funding:
• Development of open-source tools and research that benefit the ML community at large.
• Impactful research that uses any of the following AWS ML solutions: Amazon SageMaker, Amazon SageMaker Ground Truth, Amazon SageMaker Neo, Apache MXNet on AWS, and AWS AI Services.
The average awarded amount is no more than $70,000 cash and $100,000 AWS Promotional Credits for individual projects.
