Slide 1

Slide 1 text

© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved. In Partnership with
Anatomy of Amazon SageMaker Python SDK
Yoshitaka Haribara, Ph.D.
Startup ML Solutions Architect, AWS
@_hariby
D-2, 21.10.2020

Slide 2

Slide 2 text

Goal of this session
Get to know the SageMaker Python SDK, and use that to reorganize how you think about SageMaker's interfaces.
Audience
• Has used Amazon SageMaker.
• Has a feel for what using the SageMaker Python SDK is like.
• Has not thought very deeply about how it actually works behind the scenes.
• Has used boto3 and similar tools now and then.

Slide 3

Slide 3 text

Caution ⚠
This session covers the implementation of the SageMaker Python SDK itself, which is a fairly niche topic.
The slides contain code, but it is heavily simplified. Please refer to the actual implementation for the exact code.

Slide 4

Slide 4 text

About me
• Yoshitaka Haribara (針原 佳貴)
• Ph.D. (Information Science and Technology)
• Startup Machine Learning Solutions Architect
• Technical support and machine learning adoption support for startups
• Favorite services: Amazon SageMaker, ⚛ Amazon Braket
• Running a Kubeflow on EKS workshop tomorrow (Kubeflow Pipelines → calls SageMaker)

Slide 5

Slide 5 text

"Mastering Amazon SageMaker means mastering the SageMaker Python SDK" (personal opinion)

Slide 6

Slide 6 text


Slide 7

Slide 7 text

SageMaker Python SDK
https://github.com/aws/sagemaker-python-sdk

Slide 8

Slide 8 text

Installation
    $ pip install sagemaker
• Preinstalled on SageMaker notebooks
• Can be installed in any environment you like (e.g., a local laptop)
• As of 2020-10-20, the latest version is v2.15.2

Slide 9

Slide 9 text

Area of interest
Developer (you)
SageMaker Python SDK
Amazon SageMaker API

Slide 10

Slide 10 text

Area of interest
Developer (you)
SageMaker Python SDK
Amazon SageMaker API

Slide 11

Slide 11 text

Area of interest
Developer (you)
SageMaker Python SDK
AWS SDK for Python (boto3)
Amazon SageMaker API

Slide 12

Slide 12 text

Today we take a close look at the SageMaker Python SDK

Slide 13

Slide 13 text

Amazon SageMaker
Development: Jupyter Notebook/Lab, Amazon S3
Upload the data to Amazon S3 in advance. To do this with the SageMaker Python SDK:
    sagemaker_session.upload_data(path='data', key_prefix='data/DEMO')
The Jupyter Trademark is registered with the U.S. Patent & Trademark Office.

Slide 14

Slide 14 text

Amazon SageMaker
Development: Jupyter Notebook/Lab, Amazon S3

Slide 15

Slide 15 text

Amazon SageMaker
Development: Jupyter Notebook/Lab, Amazon S3
Training: Amazon EC2 P3 Instances, Amazon ECR
Prebuilt container images, etc., or BYOC

Slide 16

Slide 16 text

Amazon SageMaker
Development: Jupyter Notebook/Lab, Amazon S3
Training: Amazon EC2 P3 Instances
Benefits for training:
• Training instances are launched through the API and stop automatically when training completes
• High-performance instances are billed per second, which makes it easy to cut costs
• The specified number of instances are launched at the same time, so distributed training is easy too

Slide 17

Slide 17 text

Amazon SageMaker
Development: Jupyter Notebook/Lab, Amazon S3
Training: Amazon EC2 P3 Instances

Slide 18

Slide 18 text

Amazon SageMaker
Development: Jupyter Notebook/Lab, Amazon S3
Inference: Endpoint, Amazon EC2 P3 Instances, Amazon ECR

Slide 19

Slide 19 text

The most basic usage: Fit, Deploy, Predict

    from sagemaker.pytorch import PyTorch

    # Train my estimator
    pytorch_estimator = PyTorch(entry_point='train_and_deploy.py',
                                instance_type='ml.p3.2xlarge',
                                instance_count=1,
                                framework_version='1.5.0',
                                py_version='py3')
    pytorch_estimator.fit('s3://my_bucket/my_training_data/')

Slide 20

Slide 20 text

The most basic usage: Fit, Deploy, Predict

    from sagemaker.pytorch import PyTorch

    # Train my estimator
    pytorch_estimator = PyTorch(entry_point='train_and_deploy.py',
                                instance_type='ml.p3.2xlarge',
                                instance_count=1,
                                framework_version='1.5.0',
                                py_version='py3')
    pytorch_estimator.fit('s3://my_bucket/my_training_data/')

    # Deploy my estimator to a SageMaker Endpoint and get a Predictor
    predictor = pytorch_estimator.deploy(instance_type='ml.m5.xlarge',
                                         initial_instance_count=1)

Slide 21

Slide 21 text

The most basic usage: Fit, Deploy, Predict

    from sagemaker.pytorch import PyTorch

    # Train my estimator
    pytorch_estimator = PyTorch(entry_point='train_and_deploy.py',
                                instance_type='ml.p3.2xlarge',
                                instance_count=1,
                                framework_version='1.5.0',
                                py_version='py3')
    pytorch_estimator.fit('s3://my_bucket/my_training_data/')

    # Deploy my estimator to a SageMaker Endpoint and get a Predictor
    predictor = pytorch_estimator.deploy(instance_type='ml.m5.xlarge',
                                         initial_instance_count=1)

    # `data` is a NumPy array or a Python list.
    # `response` is a NumPy array.
    response = predictor.predict(data)

Slide 22

Slide 22 text

entry_point code
In Script Mode, the entry point runs as an ordinary Python script.
First read the data and model input/output paths from environment variables, then write train.py so that it reads from and writes to those locations:

    import argparse
    import os

    if __name__ == '__main__':
        parser = argparse.ArgumentParser()

        # hyperparameters
        parser.add_argument('--epochs', type=int, default=10)

        # input data and model directories (read from environment variables)
        parser.add_argument('--train', type=str, default=os.environ['SM_CHANNEL_TRAIN'])
        parser.add_argument('--test', type=str, default=os.environ['SM_CHANNEL_TEST'])
        parser.add_argument('--model-dir', type=str, default=os.environ['SM_MODEL_DIR'])

        args, _ = parser.parse_known_args()
        … (rest omitted)

Paths inside the container (the values of the environment variables):
    /opt/ml/input/data/train
    /opt/ml/input/data/test
    /opt/ml/model
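The argument handling above can be sketched as a small testable function. This is only an illustration: the helper name `parse_args`, its `env` parameter, and the fallback to the literal container paths when a variable is missing are assumptions added here so that the sketch also runs outside a SageMaker container. The slide reads `os.environ[...]` directly.

```python
import argparse
import os


def parse_args(argv=None, env=None):
    """Parse hyperparameters and data/model paths the way a Script Mode entry point might.

    `env` is a hypothetical hook for testing; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()

    # Hyperparameters arrive as command-line arguments.
    parser.add_argument("--epochs", type=int, default=10)

    # Data and model directories default to the environment variables.
    # Falling back to the literal container paths is an assumption of this sketch.
    parser.add_argument("--train", type=str,
                        default=env.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"))
    parser.add_argument("--test", type=str,
                        default=env.get("SM_CHANNEL_TEST", "/opt/ml/input/data/test"))
    parser.add_argument("--model-dir", type=str,
                        default=env.get("SM_MODEL_DIR", "/opt/ml/model"))

    # parse_known_args tolerates extra arguments instead of exiting with an error.
    args, _ = parser.parse_known_args(argv)
    return args
```

Calling `parse_args(["--epochs", "3"])` locally returns the container paths as defaults, while inside a training container the environment variables take precedence.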

Slide 23

Slide 23 text


Slide 24

Slide 24 text

Fit method (Train)

    # Train my estimator
    pytorch_estimator = PyTorch(entry_point='train_and_deploy.py',
                                instance_type='ml.p3.2xlarge',
                                instance_count=1,
                                framework_version='1.5.0',
                                py_version='py3')
    pytorch_estimator.fit('s3://my_bucket/my_training_data/')

Calls the SageMaker CreateTrainingJob API to start training.
The settings given when the Estimator was created, together with the specified training data, are passed to CreateTrainingJob.

Slide 25

Slide 25 text

Implementation of fit (Estimator class)

    def fit(self, inputs=None, wait=True, logs="All", job_name=None, experiment_config=None):
        self._prepare_for_training(job_name=job_name)
        self.latest_training_job = _TrainingJob.start_new(self, inputs, experiment_config)
        self.jobs.append(self.latest_training_job)
        if wait:
            self.latest_training_job.wait(logs=logs)

Slide 26

Slide 26 text

Implementation of the _TrainingJob class

    class _TrainingJob(_Job):
        @classmethod
        def start_new(cls, estimator, inputs, experiment_config):
            """Create a new Amazon SageMaker training job from the estimator."""
            train_args = cls._get_train_args(estimator, inputs, experiment_config)
            estimator.sagemaker_session.train(**train_args)
            return cls(estimator.sagemaker_session, estimator._current_job_name)

Note that _get_train_args is used here to convert the parameters.

Slide 27

Slide 27 text

Session class

    class sagemaker.session.Session(boto_session=None, sagemaker_client=None,
                                    sagemaker_runtime_client=None, default_bucket=None)

Manages interactions with the Amazon SageMaker API and any other AWS services that are needed.
This class provides convenient methods for working with the entities and resources that Amazon SageMaker uses, such as training jobs, endpoints, and input datasets in S3.
Calls to AWS services are delegated to a Boto3 session.
By default, the session is initialized using the AWS credential provider chain (which picks the appropriate one from the various available credentials).
When an Amazon SageMaker API call needs to access an S3 bucket and the bucket does not exist, the session creates a default bucket based on a naming convention that includes the AWS account ID.
Naming convention: sagemaker-{region}-{AWS account ID}
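The naming convention can be written out as a one-line helper. This is a minimal sketch: the function name `default_bucket_name` is hypothetical, and the region and account ID are passed in as plain arguments so that no AWS access is needed.

```python
def default_bucket_name(region: str, account_id: str) -> str:
    """Build a bucket name following the sagemaker-{region}-{AWS account ID} convention."""
    return f"sagemaker-{region}-{account_id}"
```

For example, region `ap-northeast-1` and account ID `123456789012` give `sagemaker-ap-northeast-1-123456789012`.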

Slide 28

Slide 28 text

session.train ≠ SageMaker.Client.create_training_job

    train(input_mode, input_config, role, job_name, output_config, resource_config,
          vpc_config, hyperparameters, stop_condition, tags, metric_definitions,
          enable_network_isolation=False, image_uri=None, algorithm_arn=None,
          encrypt_inter_container_traffic=False, use_spot_instances=False,
          checkpoint_s3_uri=None, checkpoint_local_path=None, experiment_config=None,
          debugger_rule_configs=None, debugger_hook_config=None,
          tensorboard_output_config=None, enable_sagemaker_metrics=None)

Note that _get_train_request converts the parameters of session.train into the parameters of SageMaker.Client.create_training_job. Roughly like this:

    def train(self, **input_param):
        train_request = self._get_train_request(**input_param)
        self.sagemaker_client.create_training_job(**train_request)
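To make the conversion concrete, here is a reduced sketch of turning flat `train` arguments into a CreateTrainingJob-shaped request. The function `build_train_request` is a hypothetical stand-in, not the SDK's `_get_train_request`; it covers only a handful of arguments, and the top-level keys follow the CreateTrainingJob request fields.

```python
def build_train_request(job_name, role, image_uri, input_mode, input_config,
                        output_config, resource_config, stop_condition,
                        hyperparameters=None):
    """Hypothetical, reduced conversion from flat arguments to a request dict."""
    request = {
        "TrainingJobName": job_name,
        "RoleArn": role,
        # The image and the input mode are nested under AlgorithmSpecification.
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": input_mode,
        },
        "InputDataConfig": input_config,
        "OutputDataConfig": output_config,
        "ResourceConfig": resource_config,
        "StoppingCondition": stop_condition,
    }
    if hyperparameters:
        # The API takes hyperparameters as a string-to-string map.
        request["HyperParameters"] = {k: str(v) for k, v in hyperparameters.items()}
    return request
```

The point of the sketch is the renaming and nesting: `image_uri` and `input_mode` arrive as flat arguments but end up together under `AlgorithmSpecification`, which is why `session.train` and `create_training_job` do not share a signature.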

Slide 29

Slide 29 text

What we have found so far
• The Estimator instance method fit calls the class method _TrainingJob.start_new, passing estimator, inputs, and experiment_config.
• Inside start_new, sagemaker.session.Session.train is called with the arguments converted by _get_train_args.
• Inside Session.train, _get_train_request converts the parameters, and Boto3's SageMaker.Client.create_training_job (the SageMaker CreateTrainingJob API) is called.
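The three layers summarized above can be traced with a toy model that records the final request instead of calling AWS. Every class here (`RecordingClient`, `SketchSession`, `SketchTrainingJob`, `SketchEstimator`) is a hypothetical stand-in with heavily reduced arguments; only the call order mirrors the bullets.

```python
class RecordingClient:
    """Stand-in for the Boto3 SageMaker client; records requests instead of calling AWS."""

    def __init__(self):
        self.requests = []

    def create_training_job(self, **request):
        self.requests.append(request)


class SketchSession:
    def __init__(self, sagemaker_client):
        self.sagemaker_client = sagemaker_client

    def train(self, job_name, input_config, hyperparameters):
        # Second conversion: flat arguments -> API request shape.
        request = {
            "TrainingJobName": job_name,
            "InputDataConfig": input_config,
            "HyperParameters": hyperparameters,
        }
        self.sagemaker_client.create_training_job(**request)


class SketchTrainingJob:
    def __init__(self, session, job_name):
        self.session = session
        self.job_name = job_name

    @classmethod
    def start_new(cls, estimator, inputs):
        # First conversion: estimator attributes -> Session.train arguments.
        train_args = {
            "job_name": estimator.job_name,
            "input_config": [{
                "ChannelName": "training",
                "DataSource": {"S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": inputs}},
            }],
            "hyperparameters": estimator.hyperparameters,
        }
        estimator.session.train(**train_args)
        return cls(estimator.session, estimator.job_name)


class SketchEstimator:
    def __init__(self, session, job_name, hyperparameters):
        self.session = session
        self.job_name = job_name
        self.hyperparameters = hyperparameters
        self.latest_training_job = None

    def fit(self, inputs):
        self.latest_training_job = SketchTrainingJob.start_new(self, inputs)
```

After `SketchEstimator(SketchSession(client), "job-1", {"epochs": "10"}).fit("s3://my_bucket/my_training_data/")`, the recording client holds exactly one request, which makes the two conversion steps easy to inspect.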

Slide 30

Slide 30 text


Slide 31

Slide 31 text

What happens behind Estimator.deploy
Inference: model1, model2; Production variants; Endpoint config; Endpoint create / update

Slide 32

Slide 32 text

Endpoint creation flow (AWS CLI)

Model

    aws sagemaker create-model \
        --model-name model1 \
        --primary-container '{"Image": "123.dkr.ecr.amazonaws.com/algo", "ModelDataUrl": "s3://bkt/model1.tar.gz"}' \
        --execution-role-arn arn:aws:iam::123:role/me

Slide 33

Slide 33 text

Endpoint creation flow (AWS CLI)

Model

    aws sagemaker create-model \
        --model-name model1 \
        --primary-container '{"Image": "123.dkr.ecr.amazonaws.com/algo", "ModelDataUrl": "s3://bkt/model1.tar.gz"}' \
        --execution-role-arn arn:aws:iam::123:role/me

Endpoint configuration

    aws sagemaker create-endpoint-config \
        --endpoint-config-name model1-config \
        --production-variants '{"InitialInstanceCount": 2, "InstanceType": "ml.m4.xlarge", "InitialVariantWeight": 1, "ModelName": "model1", "VariantName": "AllTraffic"}'

Slide 34

Slide 34 text

Endpoint creation flow (AWS CLI)

Model

    aws sagemaker create-model \
        --model-name model1 \
        --primary-container '{"Image": "123.dkr.ecr.amazonaws.com/algo", "ModelDataUrl": "s3://bkt/model1.tar.gz"}' \
        --execution-role-arn arn:aws:iam::123:role/me

Endpoint configuration

    aws sagemaker create-endpoint-config \
        --endpoint-config-name model1-config \
        --production-variants '{"InitialInstanceCount": 2, "InstanceType": "ml.m4.xlarge", "InitialVariantWeight": 1, "ModelName": "model1", "VariantName": "AllTraffic"}'

Endpoint

    aws sagemaker create-endpoint \
        --endpoint-name my-endpoint \
        --endpoint-config-name model1-config
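The same three steps can be sketched in Python against a client object. The helper `create_endpoint_flow` and the `DryRunClient` class are hypothetical names introduced for illustration; the three method names and keyword arguments are the Boto3 SageMaker client calls that correspond to the CLI commands above. The dry-run client only records calls, so the sketch runs without AWS credentials.

```python
class DryRunClient:
    """Stand-in for boto3.client("sagemaker"); records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def create_model(self, **kwargs):
        self.calls.append(("create_model", kwargs))

    def create_endpoint_config(self, **kwargs):
        self.calls.append(("create_endpoint_config", kwargs))

    def create_endpoint(self, **kwargs):
        self.calls.append(("create_endpoint", kwargs))


def create_endpoint_flow(client, model_name, image, model_data_url, role_arn,
                         endpoint_config_name, endpoint_name,
                         instance_type="ml.m4.xlarge", instance_count=2):
    """Create a model, an endpoint configuration, and an endpoint, in that order."""
    # 1. Model: container image plus model artifact location.
    client.create_model(
        ModelName=model_name,
        PrimaryContainer={"Image": image, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )
    # 2. Endpoint configuration: one production variant serving all traffic.
    client.create_endpoint_config(
        EndpointConfigName=endpoint_config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
            "InitialVariantWeight": 1,
        }],
    )
    # 3. Endpoint: refers to the configuration by name.
    client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name,
    )
```

With real credentials, passing `boto3.client("sagemaker")` in place of `DryRunClient()` would issue the actual API calls, so try it with the dry-run client first.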

Slide 35

Slide 35 text


Slide 36

Slide 36 text

Deploy method

    # Deploy my estimator to a SageMaker Endpoint and get a Predictor
    predictor = pytorch_estimator.deploy(instance_type='ml.m5.xlarge',
                                         initial_instance_count=1)

Deploys the trained model to an Amazon SageMaker endpoint and returns a sagemaker.Predictor object.

Slide 37

Slide 37 text

Implementation of deploy (Estimator class)

    def deploy(…):
        …
        if <already compiled with SageMaker Neo (described later)>:
            model = self._compiled_models[family]
        else:
            model = self.create_model(**kwargs)
        …
        return model.deploy()

Slide 38

Slide 38 text

AWS Neuron SDK
https://github.com/aws/aws-neuron-sdk
Compile: Neuron compiler (NCC), which outputs NEFF
Neuron binary (NEFF)
Deploy: Neuron runtime (NRT)
Profile: Neuron tools

Slide 39

Slide 39 text

create_model

    def create_model(…):
        …
        return Model(…)

create_model returns a Model object, and model.deploy was called on that object.

    class Model(object):
        def deploy(…):
            # a lot happens in here

Slide 40

Slide 40 text

Model class

    class Model(object):
        def deploy(…):
            …
            # self.sagemaker_session.create_model(…)
            self._create_sagemaker_model(…)

            # sagemaker.production_variant(…)
            production_variant = sagemaker.production_variant(…)

            # self.create_endpoint(…)
            self.sagemaker_session.endpoint_from_production_variants(…)

We look at these three in order.

Slide 41

Slide 41 text

1. self._create_sagemaker_model(…)
self.sagemaker_session.create_model(…)
This one is easy to follow. Internally it uses boto3 to call self.sagemaker_client.create_model(**create_model_request) (the SageMaker CreateModel API).

Slide 42

Slide 42 text

2. sagemaker.production_variant(…)

    def production_variant(model_name, instance_type, initial_instance_count,
                           variant_name, initial_weight):
        production_variant_configuration = {
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": initial_instance_count,
            "VariantName": variant_name,
            "InitialVariantWeight": initial_weight,
        }
        return production_variant_configuration

At this point it simply returns a Python dict.

Slide 43

Slide 43 text

3. sagemaker_session.endpoint_from_production_variants(…)

    def endpoint_from_production_variants(self, …):
        if not _deployment_entity_exists(…):
            self.sagemaker_client.create_endpoint_config(**config_options)
        return self.create_endpoint(…)

It calls boto3's create_endpoint_config (the SageMaker CreateEndpointConfig API).
After that it calls self.create_endpoint, which internally calls self.sagemaker_client.create_endpoint (the SageMaker CreateEndpoint API).

Slide 44

Slide 44 text

What we have found so far
• Inside estimator.deploy, self.create_model creates a Model object.
• model.deploy does several things. It calls:
  • sagemaker_session.create_model (the SageMaker CreateModel API)
  • sagemaker.production_variant (returns a dict)
  • sagemaker_client.create_endpoint_config (the SageMaker CreateEndpointConfig API) and sagemaker_client.create_endpoint (the SageMaker CreateEndpoint API)
• It returns a Predictor.

Slide 45

Slide 45 text


Slide 46

Slide 46 text

Predict method

    # `data` is a NumPy array or a Python list.
    # `response` is a NumPy array.
    response = predictor.predict(data)

Slide 47

Slide 47 text

predictor.predict

    def predict(…):
        request_args = self._create_request_args()
        response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
        return self._handle_response(response)

Returns the inference result from the specified endpoint.
sagemaker_session.sagemaker_runtime_client.invoke_endpoint (the SageMaker Runtime InvokeEndpoint API)
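The serialize, invoke, deserialize pattern can be sketched with a fake runtime client. Everything here is an assumption made for illustration: `SketchPredictor` and `FakeRuntimeClient` are hypothetical classes, JSON is chosen as the wire format, and the fake endpoint simply doubles each input value. Only the `invoke_endpoint` keyword arguments (`EndpointName`, `ContentType`, `Accept`, `Body`) and the readable `Body` in the response follow the Boto3 SageMaker Runtime client.

```python
import io
import json


class FakeRuntimeClient:
    """Stand-in for the Boto3 SageMaker Runtime client; doubles every input value."""

    def invoke_endpoint(self, EndpointName, ContentType, Accept, Body):
        values = json.loads(Body)
        result = json.dumps([v * 2 for v in values]).encode("utf-8")
        # The real response also carries its payload as a readable stream under "Body".
        return {"ContentType": Accept, "Body": io.BytesIO(result)}


class SketchPredictor:
    def __init__(self, endpoint_name, runtime_client):
        self.endpoint_name = endpoint_name
        self.runtime_client = runtime_client

    def _create_request_args(self, data):
        # Serialize the input and attach the endpoint name and content types.
        return {
            "EndpointName": self.endpoint_name,
            "ContentType": "application/json",
            "Accept": "application/json",
            "Body": json.dumps(data),
        }

    def _handle_response(self, response):
        # Deserialize the streamed response body.
        return json.loads(response["Body"].read())

    def predict(self, data):
        request_args = self._create_request_args(data)
        response = self.runtime_client.invoke_endpoint(**request_args)
        return self._handle_response(response)
```

The SDK's Predictor makes the serializer and deserializer pluggable (they are arguments of `deploy`), whereas this sketch hard-codes JSON to keep the flow visible.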

Slide 48

Slide 48 text


Slide 49

Slide 49 text

EstimatorBase, Estimator classes

Slide 50

Slide 50 text

EstimatorBase class

    class sagemaker.estimator.EstimatorBase(role, instance_count=None, instance_type=None,
        volume_size=30, volume_kms_key=None, max_run=86400, input_mode='File',
        output_path=None, output_kms_key=None, base_job_name=None, sagemaker_session=None,
        tags=None, subnets=None, security_group_ids=None, model_uri=None,
        model_channel_name='model', metric_definitions=None,
        encrypt_inter_container_traffic=False, use_spot_instances=False, max_wait=None,
        checkpoint_s3_uri=None, checkpoint_local_path=None, rules=None,
        debugger_hook_config=None, tensorboard_output_config=None,
        enable_sagemaker_metrics=None, enable_network_isolation=False, **kwargs)

Handles Amazon SageMaker training and deployment tasks end to end.

Slide 51

Slide 51 text

EstimatorBase class

    abstract training_image_uri()
    abstract hyperparameters()

Subclasses define which Docker image to use for training, which hyperparameters to use, and how to create an appropriate predictor instance.
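The abstract-method contract can be sketched with Python's `abc` module. The class names `SketchEstimatorBase` and `GenericSketchEstimator` and the `describe_job` helper are hypothetical; only the two abstract method names come from the slide.

```python
from abc import ABC, abstractmethod


class SketchEstimatorBase(ABC):
    """Reduced stand-in for an estimator base class with two abstract hooks."""

    @abstractmethod
    def training_image_uri(self):
        """Return the Docker image URI to use for training."""

    @abstractmethod
    def hyperparameters(self):
        """Return the hyperparameters as a dict."""

    def describe_job(self):
        # Shared logic in the base class relies only on the two hooks.
        return {
            "image": self.training_image_uri(),
            "hyperparameters": self.hyperparameters(),
        }


class GenericSketchEstimator(SketchEstimatorBase):
    """Subclass that takes the image URI and hyperparameters directly from the caller."""

    def __init__(self, image_uri, hyperparameters=None):
        self.image_uri = image_uri
        self._hyperparameters = dict(hyperparameters or {})

    def training_image_uri(self):
        return self.image_uri

    def hyperparameters(self):
        return self._hyperparameters
```

Instantiating `SketchEstimatorBase` directly raises `TypeError`, which is how the base class forces each subclass to decide on an image and a set of hyperparameters.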

Slide 52

Slide 52 text

EstimatorBase class

    fit(inputs=None, wait=True, logs='All', job_name=None, experiment_config=None)

    compile_model(target_instance_family, input_shape, output_path, framework=None, framework_version=None, compile_max_run=900, tags=None, target_platform_os=None, target_platform_arch=None, target_platform_accelerator=None, compiler_options=None, **kwargs)

    deploy(initial_instance_count, instance_type, serializer=None, deserializer=None, accelerator_type=None, endpoint_name=None, use_compiled_model=False, wait=True, model_name=None, kms_key=None, data_capture_config=None, tags=None, **kwargs)

    transformer(instance_count, instance_type, strategy=None, assemble_with=None, output_path=None, output_kms_key=None, accept=None, env=None, max_concurrent_transforms=None, max_payload=None, tags=None, role=None, volume_kms_key=None, vpc_config_override='VPC_CONFIG_DEFAULT', enable_network_isolation=None, model_name=None)

Slide 53

Slide 53 text

Estimator class

    class sagemaker.estimator.Estimator(image_uri, role, instance_count=None,
        instance_type=None, volume_size=30, volume_kms_key=None, max_run=86400,
        input_mode='File', output_path=None, output_kms_key=None, base_job_name=None,
        sagemaker_session=None, hyperparameters=None, tags=None, subnets=None,
        security_group_ids=None, model_uri=None, model_channel_name='model',
        metric_definitions=None, encrypt_inter_container_traffic=False,
        use_spot_instances=False, max_wait=None, checkpoint_s3_uri=None,
        checkpoint_local_path=None, enable_network_isolation=False, rules=None,
        debugger_hook_config=None, tensorboard_output_config=None,
        enable_sagemaker_metrics=None, **kwargs)

A generic Estimator class for training with any algorithm.
This class is designed for use with algorithms that do not have their own custom class.

Slide 54

Slide 54 text


Slide 55

Slide 55 text

Framework classes

Slide 56

Slide 56 text

PyTorch classes

Slide 57

Slide 57 text


Slide 58

Slide 58 text

Migrating to v2

    $ sagemaker-upgrade-v2 --in-file input.py --out-file output.py
    $ sagemaker-upgrade-v2 --in-file input.ipynb --out-file output.ipynb

https://sagemaker.readthedocs.io/en/stable/v2.html#automatically-upgrade-your-code

Slide 59

Slide 59 text


Slide 60

Slide 60 text

Thank you!
Yoshitaka Haribara @_hariby