Anatomy of Amazon SageMaker Python SDK

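✨Speaker: Yoshitaka Haribara (Amazon Web Services Japan K.K.)

✨Session overview:
Amazon SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker. It is convenient enough that it is no exaggeration to say that mastering SageMaker means mastering the SageMaker Python SDK. This session explains why, examining how the SDK behaves along the way. It closes with points to watch for when migrating to SageMaker Python SDK v2.

✨Topic: Artificial Intelligence & Machine Learning

✨Level: 300

✨Speaker profile:
AWS Startup / ML Solutions Architect. His favorite AWS services are Amazon SageMaker and Amazon Braket. His favorite musicians include Red Hot Chili Peppers and Michael Jackson.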

Yoshitaka Haribara

October 21, 2020

Transcript

  1. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved. In Partnership with

    Anatomy of Amazon SageMaker Python SDK. Yoshitaka Haribara, Ph.D., Startup ML Solutions Architect, AWS, @_hariby. D-2, 21.10.2020
  2. Goal of this session: get to know the SageMaker Python SDK, and use that to reorganize how you think about SageMaker's interfaces.

    Intended audience:
    • You have used Amazon SageMaker.
    • You have a feel for what working with the SageMaker Python SDK is like.
    • But you have not thought very deeply about how it actually works behind the scenes.
    • You have used boto3 now and then.
  3. Caveats ⚠

    This session covers the implementation of the SageMaker Python SDK itself, which is fairly niche material. The code on the slides is heavily simplified; refer to the actual implementation for the exact details.
  4. About me

    • Yoshitaka Haribara
    • Ph.D. (Information Science and Technology)
    • Startup Machine Learning Solutions Architect
    • Technical support and machine learning adoption support for startups
    • Favorite services: Amazon SageMaker, ⚛ Amazon Braket
    • Running a Kubeflow on EKS workshop tomorrow (Kubeflow Pipelines → calls SageMaker)
  5. “Mastering Amazon SageMaker is, in effect, mastering the SageMaker Python SDK” (my personal opinion)
  7. SageMaker Python SDK https://github.com/aws/sagemaker-python-sdk
  8. Installation

        $ pip install sagemaker

    • Preinstalled on SageMaker Notebook
    • Can be installed in any environment you like (e.g., a local laptop)
    • As of 2020-10-20, the latest version is v2.15.2
  9. Area of interest (diagram labels): Amazon SageMaker API, Developer (you), SageMaker Python SDK
  11. Area of interest (diagram labels): Amazon SageMaker API, Developer (you), SageMaker Python SDK, AWS SDK for Python (boto3)
  12. Today we take a close look at the SageMaker Python SDK.
  13. © 2020, Amazon Web Services, Inc. or its Affiliates. The Jupyter Trademark is registered with the U.S. Patent & Trademark Office.

    Development (diagram labels: Amazon SageMaker, Jupyter Notebook/Lab, Amazon S3)

    Upload the data to Amazon S3 in advance. With the SageMaker Python SDK:

        sagemaker_session.upload_data(path='data', key_prefix='data/DEMO')
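To make the upload_data call above more concrete, here is a minimal pure-Python sketch of how a local directory and a key_prefix could map onto S3 object keys. It does not call AWS at all. The helper name plan_upload and the bucket name are hypothetical, and the key layout is an assumption modeled on the SDK's observable behavior, not a copy of its implementation.

```python
import os
import tempfile


def plan_upload(path, bucket, key_prefix="data"):
    """Return (s3_uri, [(local_file, s3_key), ...]) without touching AWS.

    Hypothetical helper: it only mimics the key layout one would expect from
    sagemaker_session.upload_data(path=..., key_prefix=...).
    """
    pairs = []
    if os.path.isdir(path):
        for dirpath, _dirnames, filenames in os.walk(path):
            # Sub-directories become extra key segments under key_prefix.
            rel = os.path.relpath(dirpath, start=path)
            rel_prefix = "" if rel == "." else rel.replace(os.sep, "/") + "/"
            for name in filenames:
                pairs.append((os.path.join(dirpath, name),
                              f"{key_prefix}/{rel_prefix}{name}"))
        s3_uri = f"s3://{bucket}/{key_prefix}"
    else:
        name = os.path.basename(path)
        pairs.append((path, f"{key_prefix}/{name}"))
        s3_uri = f"s3://{bucket}/{key_prefix}/{name}"
    # Sort by key so the result is deterministic.
    return s3_uri, sorted(pairs, key=lambda pair: pair[1])


def demo():
    """Build a tiny directory tree and return the planned URI and keys."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "train"))
        for rel in (os.path.join("train", "a.csv"), "labels.csv"):
            with open(os.path.join(root, rel), "w") as f:
                f.write("x\n")
        uri, pairs = plan_upload(root, "my-bucket", key_prefix="data/DEMO")
        return uri, [key for _local, key in pairs]
```

The returned URI is the kind of value one would then pass to fit as the training data location.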
  15. Diagram labels: Amazon SageMaker; Development; Jupyter Notebook/Lab; Amazon S3; Training; Amazon EC2 P3 Instances; Amazon ECR (prebuilt container images, etc., or BYOC)
  16. Benefits for training (diagram labels: Amazon SageMaker; Development; Training; Amazon EC2 P3 Instances; Jupyter Notebook/Lab; Amazon S3)

    • Training instances are launched through the API and stop automatically when training completes
    • High-performance instances are billed per second, which makes it easy to cut costs
    • The specified number of instances launch at the same time, so distributed training is easy too
  18. Diagram labels: Amazon SageMaker; Development; Inference; Amazon EC2 P3 Instances; Jupyter Notebook/Lab; Endpoint; Amazon S3; Amazon ECR
  21. The most basic usage: Fit, Deploy, Predict

        from sagemaker.pytorch import PyTorch

        # Train my estimator
        # (the slide omits the required `role` argument, an IAM role ARN, for brevity)
        pytorch_estimator = PyTorch(entry_point='train_and_deploy.py',
                                    instance_type='ml.p3.2xlarge',
                                    instance_count=1,
                                    framework_version='1.5.0',
                                    py_version='py3')
        pytorch_estimator.fit('s3://my_bucket/my_training_data/')

        # Deploy my estimator to a SageMaker Endpoint and get a Predictor
        predictor = pytorch_estimator.deploy(instance_type='ml.m5.xlarge',
                                             initial_instance_count=1)

        # `data` is a NumPy array or a Python list.
        # `response` is a NumPy array.
        response = predictor.predict(data)
  22. The entry_point code

    In Script Mode the entry point runs as an ordinary Python script. First read the data and model input/output paths from environment variables, then write train.py so that it reads from and writes to those locations:

        import argparse
        import os

        if __name__ == '__main__':
            parser = argparse.ArgumentParser()

            # hyperparameters
            parser.add_argument('--epochs', type=int, default=10)

            # input data and model directories (taken from environment variables)
            parser.add_argument('--train', type=str, default=os.environ['SM_CHANNEL_TRAIN'])
            parser.add_argument('--test', type=str, default=os.environ['SM_CHANNEL_TEST'])
            parser.add_argument('--model-dir', type=str, default=os.environ['SM_MODEL_DIR'])

            args, _ = parser.parse_known_args()
            … (rest omitted)

    Paths inside the container (the values of the environment variables):
    • /opt/ml/input/data/train
    • /opt/ml/input/data/test
    • /opt/ml/model
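The argument handling above can be wrapped in a function so that it is easy to try outside a training container. This is a sketch under assumptions: the function name parse_args, the env parameter, and the fallback to the container paths listed on the slide are illustrative additions, not part of the original script.

```python
import argparse
import os


def parse_args(argv=None, env=None):
    """Parse hyperparameters and SageMaker paths for a Script Mode entry point.

    `env` defaults to os.environ; passing a plain dict makes local testing easy.
    The fallback values are the container paths listed on the slide.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    # Hyperparameters arrive as command-line arguments.
    parser.add_argument("--epochs", type=int, default=10)
    # Data and model directories come from environment variables.
    parser.add_argument("--train", type=str,
                        default=env.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"))
    parser.add_argument("--test", type=str,
                        default=env.get("SM_CHANNEL_TEST", "/opt/ml/input/data/test"))
    parser.add_argument("--model-dir", type=str,
                        default=env.get("SM_MODEL_DIR", "/opt/ml/model"))
    # parse_known_args tolerates extra arguments instead of exiting.
    args, _unknown = parser.parse_known_args(argv)
    return args
```

Using env.get with a fallback instead of os.environ[...] is a design choice for this sketch: the script no longer raises KeyError when run on a laptop where the SM_* variables are not set.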
  24. Fit method (Train)

        # Train my estimator
        pytorch_estimator = PyTorch(entry_point='train_and_deploy.py',
                                    instance_type='ml.p3.2xlarge',
                                    instance_count=1,
                                    framework_version='1.5.0',
                                    py_version='py3')
        pytorch_estimator.fit('s3://my_bucket/my_training_data/')

    Calls the SageMaker CreateTrainingJob API to start training. The settings given when the Estimator was created, together with the specified training data, are sent to CreateTrainingJob.
  25. Implementation of fit (Estimator class)

        def fit(self, inputs=None, wait=True, logs="All", job_name=None, experiment_config=None):
            self._prepare_for_training(job_name=job_name)
            self.latest_training_job = _TrainingJob.start_new(self, inputs, experiment_config)
            self.jobs.append(self.latest_training_job)
            if wait:
                self.latest_training_job.wait(logs=logs)
  26. Implementation of the _TrainingJob class

        class _TrainingJob(_Job):
            @classmethod
            def start_new(cls, estimator, inputs, experiment_config):
                """Create a new Amazon SageMaker training job from the estimator."""
                train_args = cls._get_train_args(estimator, inputs, experiment_config)
                estimator.sagemaker_session.train(**train_args)
                return cls(estimator.sagemaker_session, estimator._current_job_name)

    Note that the parameters are converted here by _get_train_args.
  27. Session class

        class sagemaker.session.Session(boto_session=None, sagemaker_client=None,
                                        sagemaker_runtime_client=None, default_bucket=None)

    Manages interactions with the Amazon SageMaker API and any other AWS services needed.
    • Provides convenience methods for working with the entities and resources that Amazon SageMaker uses, such as training jobs, endpoints, and input datasets in S3.
    • AWS service calls are delegated to a Boto3 session.
    • By default, the Boto3 session is initialized using the AWS credential provider chain (which picks the appropriate credentials from the various sources available).
    • When an Amazon SageMaker API call that accesses an S3 bucket is made and the bucket does not exist, the session creates a default bucket based on a naming convention that includes the AWS account ID.
    • Naming convention: sagemaker-{region}-{AWS account ID}
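The naming convention above is simple enough to express directly. The helper below is a hypothetical illustration (default_bucket_name is not an SDK function); in practice the region would presumably come from the Boto3 session and the account ID from the caller's identity.

```python
def default_bucket_name(region, account_id):
    """Build a bucket name following sagemaker-{region}-{AWS account ID}."""
    return f"sagemaker-{region}-{account_id}"
```

For example, region "ap-northeast-1" and account ID "123456789012" give "sagemaker-ap-northeast-1-123456789012".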
  28. session.train ≠ SageMaker.Client.create_training_job

        train(input_mode, input_config, role, job_name, output_config, resource_config,
              vpc_config, hyperparameters, stop_condition, tags, metric_definitions,
              enable_network_isolation=False, image_uri=None, algorithm_arn=None,
              encrypt_inter_container_traffic=False, use_spot_instances=False,
              checkpoint_s3_uri=None, checkpoint_local_path=None, experiment_config=None,
              debugger_rule_configs=None, debugger_hook_config=None,
              tensorboard_output_config=None, enable_sagemaker_metrics=None)

    Note that _get_train_request converts the parameters on the way from session.train to SageMaker.Client.create_training_job. Roughly like this:

        def train(self, **input_param):
            train_request = self._get_train_request(**input_param)
            self.sagemaker_client.create_training_job(**train_request)
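As an illustration of what such a parameter conversion looks like, the sketch below turns flat Python arguments into a nested dictionary shaped like a CreateTrainingJob request. The function build_train_request is a hypothetical, trimmed-down stand-in, not the SDK's _get_train_request; its argument list is an assumption chosen for brevity, and the real request supports many more fields.

```python
def build_train_request(job_name, role, image_uri, input_mode, output_path,
                        instance_type, instance_count, volume_size, max_run,
                        hyperparameters=None, input_config=None):
    """Translate flat Python arguments into a CreateTrainingJob-shaped dict.

    Hypothetical stand-in for the conversion step; heavily trimmed.
    """
    request = {
        "TrainingJobName": job_name,
        "RoleArn": role,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": input_mode,
        },
        "OutputDataConfig": {"S3OutputPath": output_path},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": volume_size,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": max_run},
    }
    # Optional sections are added only when provided.
    if input_config is not None:
        request["InputDataConfig"] = input_config
    if hyperparameters:
        # The API expects hyperparameter values as strings.
        request["HyperParameters"] = {k: str(v) for k, v in hyperparameters.items()}
    return request
```

The point of the sketch is the shape change: Pythonic keyword arguments on one side, the PascalCase nested structure of the service API on the other.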
  29. What we have found so far

    • The Estimator instance method fit calls the class method _TrainingJob.start_new, passing estimator, inputs, and experiment_config.
    • Inside it, sagemaker.session.Session.train is called with the arguments converted by _get_train_args.
    • Inside Session.train, _get_train_request converts the parameters, and Boto3's SageMaker.Client.create_training_job (the SageMaker CreateTrainingJob API) is called.
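The call chain summarized above can be traced end to end with small stand-in classes. Everything here is a simplified sketch: the classes reuse the SDK's names only to mirror the structure, FakeSageMakerClient is a hypothetical replacement for the Boto3 client that records requests instead of calling AWS, and the argument lists are far shorter than the real ones.

```python
class FakeSageMakerClient:
    """Stand-in for the Boto3 SageMaker client; records requests only."""

    def __init__(self):
        self.calls = []

    def create_training_job(self, **request):
        self.calls.append(("create_training_job", request))


class Session:
    """Simplified stand-in for sagemaker.session.Session."""

    def __init__(self, sagemaker_client):
        self.sagemaker_client = sagemaker_client

    def _get_train_request(self, job_name, image_uri, input_config):
        # Second conversion: keyword arguments -> nested API request.
        return {
            "TrainingJobName": job_name,
            "AlgorithmSpecification": {"TrainingImage": image_uri},
            "InputDataConfig": input_config,
        }

    def train(self, **train_args):
        train_request = self._get_train_request(**train_args)
        self.sagemaker_client.create_training_job(**train_request)


class _TrainingJob:
    def __init__(self, sagemaker_session, job_name):
        self.sagemaker_session = sagemaker_session
        self.job_name = job_name

    @classmethod
    def start_new(cls, estimator, inputs):
        # First conversion: estimator attributes -> arguments for Session.train.
        train_args = {
            "job_name": estimator.job_name,
            "image_uri": estimator.image_uri,
            "input_config": [{
                "ChannelName": "training",
                "DataSource": {"S3DataSource": {"S3Uri": inputs,
                                                "S3DataType": "S3Prefix"}},
            }],
        }
        estimator.sagemaker_session.train(**train_args)
        return cls(estimator.sagemaker_session, estimator.job_name)


class Estimator:
    def __init__(self, image_uri, job_name, sagemaker_session):
        self.image_uri = image_uri
        self.job_name = job_name
        self.sagemaker_session = sagemaker_session
        self.latest_training_job = None

    def fit(self, inputs):
        self.latest_training_job = _TrainingJob.start_new(self, inputs)
```

Calling Estimator(...).fit('s3://my_bucket/my_training_data/') with a FakeSageMakerClient leaves exactly one recorded create_training_job request, which makes the two conversion steps easy to inspect.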
  31. What happens behind Estimator.deploy (diagram labels: model1, model2, Production variants, Endpoint config, Inference, Endpoint create / update)
  34. Endpoint creation flow (AWS CLI): Model → Endpoint configuration → Endpoint

        aws sagemaker create-model \
          --model-name model1 \
          --primary-container '{"Image": "123.dkr.ecr.amazonaws.com/algo", "ModelDataUrl": "s3://bkt/model1.tar.gz"}' \
          --execution-role-arn arn:aws:iam::123:role/me

        aws sagemaker create-endpoint-config \
          --endpoint-config-name model1-config \
          --production-variants '{"InitialInstanceCount": 2, "InstanceType": "ml.m4.xlarge", "InitialVariantWeight": 1, "ModelName": "model1", "VariantName": "AllTraffic"}'

        aws sagemaker create-endpoint \
          --endpoint-name my-endpoint \
          --endpoint-config-name model1-config
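The same three steps can be written against a Boto3-style client. This is a sketch under assumptions: create_endpoint_flow and RecordingClient are hypothetical names, and the function accepts any object exposing create_model, create_endpoint_config, and create_endpoint, so it can run offline with the recording stand-in instead of a real boto3.client("sagemaker").

```python
def create_endpoint_flow(client, model_name, image, model_data_url, role_arn,
                         instance_type, initial_instance_count, endpoint_name):
    """Issue the three calls behind an endpoint: model, endpoint config, endpoint."""
    config_name = f"{model_name}-config"
    # 1. Model: container image plus model artifact location.
    client.create_model(
        ModelName=model_name,
        PrimaryContainer={"Image": image, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )
    # 2. Endpoint configuration: which model runs on which instances.
    client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": initial_instance_count,
            "InitialVariantWeight": 1,
        }],
    )
    # 3. Endpoint: the deployed resource that serves requests.
    client.create_endpoint(EndpointName=endpoint_name,
                           EndpointConfigName=config_name)
    return config_name


class RecordingClient:
    """Offline stand-in that records method names and keyword arguments."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, method_name):
        # Only reached for attributes that are not defined, i.e. API methods.
        def record(**kwargs):
            self.calls.append((method_name, kwargs))
        return record
```

Passing the client in as an argument keeps the flow testable without credentials; swapping in a real client is then a one-line change at the call site.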
  36. Deploy method

        # Deploy my estimator to a SageMaker Endpoint and get a Predictor
        predictor = pytorch_estimator.deploy(instance_type='ml.m5.xlarge',
                                             initial_instance_count=1)

    Deploys the trained model to an Amazon SageMaker endpoint and returns a sagemaker.Predictor object.
  37. Implementation of deploy (Estimator class)

        def deploy(…):
            …
            if (already compiled with SageMaker Neo, described later):
                model = self._compiled_models[family]
            else:
                model = self.create_model(**kwargs)
            …
            return model.deploy()
  38. AWS Neuron SDK https://github.com/aws/aws-neuron-sdk

    • Compile: Neuron compiler (NCC), which outputs NEFF
    • Neuron binary (NEFF)
    • Deploy: Neuron runtime (NRT)
    • Profile: Neuron tools
  39. create_model

        def create_model(…):
            …
            return Model(…)

    This returns a Model object, and model.deploy was then called on it:

        class Model(object):
            def deploy(…):
                # a lot happens in here
  40. Model class

        class Model(object):
            def deploy(…):
                …
                # self.sagemaker_session.create_model(…)
                self._create_sagemaker_model(…)
                # sagemaker.production_variant(…)
                production_variant = sagemaker.production_variant(…)
                # self.create_endpoint(…)
                self.sagemaker_session.endpoint_from_production_variants(…)

    We look at these three calls in order.
  41. Step 1: self._create_sagemaker_model(…) → self.sagemaker_session.create_model(…)

    This one is easy to follow. Internally it uses boto3 to call self.sagemaker_client.create_model(**create_model_request) (the SageMaker CreateModel API).
  42. Step 2: sagemaker.production_variant(…)

        def production_variant(model_name, instance_type, initial_instance_count=1,
                               variant_name="AllTraffic", initial_weight=1):
            production_variant_configuration = {
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": initial_instance_count,
                "VariantName": variant_name,
                "InitialVariantWeight": initial_weight,
            }
            return production_variant_configuration

    At this point it simply returns a Python dict.
  43. Step 3: sagemaker_session.endpoint_from_production_variants(…)

        def endpoint_from_production_variants(self, …):
            if not _deployment_entity_exists(…):
                self.sagemaker_client.create_endpoint_config(**config_options)
            return self.create_endpoint(…)

    This calls boto3's create_endpoint_config (the SageMaker CreateEndpointConfig API). After that it calls self.create_endpoint, which in turn calls self.sagemaker_client.create_endpoint (the SageMaker CreateEndpoint API).
  44. What we have found so far

    • Inside estimator.deploy, self.create_model creates a Model object.
    • model.deploy then does several things. It calls:
      • sagemaker_session.create_model (SageMaker CreateModel API)
      • sagemaker.production_variant (returns a dict)
      • sagemaker_client.create_endpoint_config (SageMaker CreateEndpointConfig API) and sagemaker_client.create_endpoint (SageMaker CreateEndpoint API)
    • It returns a Predictor.
  46. Predict method

        # `data` is a NumPy array or a Python list.
        # `response` is a NumPy array.
        response = predictor.predict(data)
  47. predictor.predict

        def predict(self, …):
            request_args = self._create_request_args()
            response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
            return self._handle_response(response)

    Returns the inference result from the specified endpoint. The underlying call is sagemaker_session.sagemaker_runtime_client.invoke_endpoint (the SageMaker Runtime InvokeEndpoint API).
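The serialize, invoke, deserialize pattern above can be sketched with the standard library only. Both classes are hypothetical: FakeRuntimeClient imitates just the part of invoke_endpoint this example needs (it doubles every input value), and SketchPredictor uses JSON lists instead of NumPy arrays so that the example stays dependency-free.

```python
import io
import json


class FakeRuntimeClient:
    """Offline stand-in for the SageMaker Runtime client."""

    def invoke_endpoint(self, EndpointName, Body, ContentType, Accept):
        values = json.loads(Body)
        payload = json.dumps([2 * v for v in values]).encode("utf-8")
        # The real response carries a streaming body; BytesIO mimics .read().
        return {"Body": io.BytesIO(payload), "ContentType": Accept}


class SketchPredictor:
    """Hypothetical, simplified predictor: serialize, invoke, deserialize."""

    def __init__(self, endpoint_name, runtime_client):
        self.endpoint_name = endpoint_name
        self.runtime_client = runtime_client

    def _create_request_args(self, data):
        # Serialize the input and attach the endpoint name and content types.
        return {
            "EndpointName": self.endpoint_name,
            "Body": json.dumps(data),
            "ContentType": "application/json",
            "Accept": "application/json",
        }

    def _handle_response(self, response):
        # Deserialize the response body back into Python objects.
        return json.loads(response["Body"].read().decode("utf-8"))

    def predict(self, data):
        request_args = self._create_request_args(data)
        response = self.runtime_client.invoke_endpoint(**request_args)
        return self._handle_response(response)
```

Separating request construction and response handling from the network call is what lets a predictor swap serializers without touching the invocation itself.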
  49. EstimatorBase, Estimator classes
  50. EstimatorBase class

        class sagemaker.estimator.EstimatorBase(role, instance_count=None, instance_type=None,
            volume_size=30, volume_kms_key=None, max_run=86400, input_mode='File',
            output_path=None, output_kms_key=None, base_job_name=None, sagemaker_session=None,
            tags=None, subnets=None, security_group_ids=None, model_uri=None,
            model_channel_name='model', metric_definitions=None,
            encrypt_inter_container_traffic=False, use_spot_instances=False, max_wait=None,
            checkpoint_s3_uri=None, checkpoint_local_path=None, rules=None,
            debugger_hook_config=None, tensorboard_output_config=None,
            enable_sagemaker_metrics=None, enable_network_isolation=False, **kwargs)

    Handles Amazon SageMaker training and deployment tasks end to end.
  51. EstimatorBase class

        abstract training_image_uri()
        abstract hyperparameters()

    Subclasses define which Docker image to use for training, which hyperparameters to use, and how to create an appropriate predictor instance.
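The abstract-method design above follows a common Python pattern that can be shown in miniature with the abc module. The classes SketchEstimatorBase and SketchEstimator and the method describe_job are hypothetical names for illustration; they are not the SDK's classes and omit nearly all of their behavior.

```python
from abc import ABC, abstractmethod


class SketchEstimatorBase(ABC):
    """Hypothetical miniature of an estimator base class."""

    @abstractmethod
    def training_image_uri(self):
        """Return the Docker image URI to use for training."""

    @abstractmethod
    def hyperparameters(self):
        """Return the hyperparameters as a dict."""

    def describe_job(self):
        # Shared logic lives in the base class and relies on the subclass hooks.
        return {
            "TrainingImage": self.training_image_uri(),
            "HyperParameters": {k: str(v) for k, v in self.hyperparameters().items()},
        }


class SketchEstimator(SketchEstimatorBase):
    """Generic subclass: the caller supplies the image and hyperparameters."""

    def __init__(self, image_uri, hyperparameters=None):
        self.image_uri = image_uri
        self._hyperparameters = dict(hyperparameters or {})

    def training_image_uri(self):
        return self.image_uri

    def hyperparameters(self):
        return self._hyperparameters
```

A framework-specific subclass would differ only in how it answers the two hooks, for instance by deriving the image URI from a framework version instead of taking it from the caller.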
  52. EstimatorBase class

        fit(inputs=None, wait=True, logs='All', job_name=None, experiment_config=None)

        compile_model(target_instance_family, input_shape, output_path, framework=None,
                      framework_version=None, compile_max_run=900, tags=None,
                      target_platform_os=None, target_platform_arch=None,
                      target_platform_accelerator=None, compiler_options=None, **kwargs)

        deploy(initial_instance_count, instance_type, serializer=None, deserializer=None,
               accelerator_type=None, endpoint_name=None, use_compiled_model=False, wait=True,
               model_name=None, kms_key=None, data_capture_config=None, tags=None, **kwargs)

        transformer(instance_count, instance_type, strategy=None, assemble_with=None,
                    output_path=None, output_kms_key=None, accept=None, env=None,
                    max_concurrent_transforms=None, max_payload=None, tags=None, role=None,
                    volume_kms_key=None, vpc_config_override='VPC_CONFIG_DEFAULT',
                    enable_network_isolation=None, model_name=None)
  53. Estimator class

        class sagemaker.estimator.Estimator(image_uri, role, instance_count=None,
            instance_type=None, volume_size=30, volume_kms_key=None, max_run=86400,
            input_mode='File', output_path=None, output_kms_key=None, base_job_name=None,
            sagemaker_session=None, hyperparameters=None, tags=None, subnets=None,
            security_group_ids=None, model_uri=None, model_channel_name='model',
            metric_definitions=None, encrypt_inter_container_traffic=False,
            use_spot_instances=False, max_wait=None, checkpoint_s3_uri=None,
            checkpoint_local_path=None, enable_network_isolation=False, rules=None,
            debugger_hook_config=None, tensorboard_output_config=None,
            enable_sagemaker_metrics=None, **kwargs)

    A generic Estimator class for training with any algorithm. It is designed for use with algorithms that do not have their own custom class.
  55. Framework classes
  56. PyTorch classes
  58. Migrating to v2

        $ sagemaker-upgrade-v2 --in-file input.py --out-file output.py
        $ sagemaker-upgrade-v2 --in-file input.ipynb --out-file output.ipynb

    https://sagemaker.readthedocs.io/en/stable/v2.html#automatically-upgrade-your-code
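To give a feel for the kind of change an upgrade involves, the sketch below renames a few v1-style estimator keyword arguments to their v2-style names. This is only an illustration: upgrade_kwargs is a hypothetical helper, the mapping is a small subset assumed from the v2 migration notes linked above, and it is not how the sagemaker-upgrade-v2 tool is implemented.

```python
# Small subset of v1 -> v2 keyword renames (assumed from the v2 migration notes).
V1_TO_V2_KWARGS = {
    "train_instance_count": "instance_count",
    "train_instance_type": "instance_type",
    "train_max_run": "max_run",
    "train_volume_size": "volume_size",
    "image_name": "image_uri",
}


def upgrade_kwargs(v1_kwargs):
    """Return a copy of v1-style estimator kwargs using v2-style names.

    Keys that are not in the mapping are passed through unchanged.
    """
    return {V1_TO_V2_KWARGS.get(key, key): value for key, value in v1_kwargs.items()}
```

The v2-style names match the ones used in the estimator examples earlier in this deck (instance_type, instance_count).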
  60. Thank you! Yoshitaka Haribara @_hariby