GPT-OSS 120b
• Mixture of Experts (MoE) with 128 experts
• 117B parameters, 5.1B active per token
• 4-bit quantization using the MXFP4 format
• Fits on a single 80 GB GPU
• For production, general-purpose, high-reasoning scenarios
• Apache 2.0 license
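As a rough sanity check (an approximation only; it ignores activation memory, the KV cache, and MXFP4 block-scaling overhead), 117B parameters at 4 bits per weight come to roughly 58.5 GB, which is why the weights fit on a single 80 GB GPU:

```python
# Back-of-the-envelope weight memory for gpt-oss-120b under MXFP4 (4 bits/weight).
# Rough estimate only: ignores activations, KV cache, and per-block scale overhead.
total_params = 117e9          # total parameters
bits_per_weight = 4           # MXFP4 quantization
weight_bytes = total_params * bits_per_weight / 8
print(f"approx. weight memory: {weight_bytes / 1e9:.1f} GB")   # ~58.5 GB < 80 GB
```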
GPT-OSS 20b
• Mixture of Experts (MoE) with 32 experts
• 21B parameters, 3.6B active per token
• 4-bit quantization using the MXFP4 format
• Fits on a single 16 GB GPU
• For lower-latency, on-device, and consumer-hardware use
• Apache 2.0 license
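Since the 20B model targets on-device and consumer hardware, a minimal local-inference sketch with Hugging Face transformers might look like the following (assuming the "openai/gpt-oss-20b" Hugging Face model ID and a transformers release that supports this checkpoint; not taken from this deck):

```python
# Minimal local-inference sketch (assumes the "openai/gpt-oss-20b" model ID and a
# recent transformers version that supports this checkpoint).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Explain mixture-of-experts in one sentence.",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```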
GPT-OSS architecture
https://www.linkedin.com/posts/xiaolishen_llm-airesearch-transformers-activity-7358864067916152833-IS2I/
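To illustrate the routing idea behind the MoE layers referenced above (this is not the actual GPT-OSS implementation; all sizes below are arbitrary), a minimal top-k router in PyTorch could be sketched as:

```python
# Illustrative top-k MoE routing sketch (NOT the actual GPT-OSS code; sizes are arbitrary).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                             # x: (tokens, d_model)
        scores = self.router(x)                       # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # choose top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):                   # only the selected experts run
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * self.experts[e](x[mask])
        return out

y = TinyMoE()(torch.randn(5, 64))   # 5 tokens, each routed to 2 of 8 experts
```

Only the routed experts are evaluated per token, which is how a 117B-parameter model activates just 5.1B parameters per token.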
Amazon SageMaker AI Hugging Face Deep Learning Containers
Use the Hugging Face open-source ecosystem on Amazon SageMaker for both training and inference through the Deep Learning Containers, across:
• SageMaker JumpStart
• SageMaker Training Jobs
• SageMaker HyperPod
• Amazon EC2 GPU instances
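A minimal sketch of launching a training job on a Hugging Face Deep Learning Container via the SageMaker Python SDK follows; the script name, source directory, instance type, framework versions, and S3 path are placeholders, not values from this deck:

```python
# Sketch of a Hugging Face DLC training job via the SageMaker Python SDK.
# entry_point, source_dir, instance type, framework versions, and S3 URI are placeholders.
import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()          # assumes running inside SageMaker

estimator = HuggingFace(
    entry_point="train.py",                    # your training script
    source_dir="./scripts",
    instance_type="ml.g5.2xlarge",
    instance_count=1,
    role=role,
    transformers_version="4.28",               # pick versions with a matching DLC
    pytorch_version="2.0",
    py_version="py310",
    hyperparameters={"epochs": 1},
)

estimator.fit({"train": "s3://my-bucket/train/"})   # S3 input channel
```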
SageMaker Training Jobs: high-level overview diagram
• Job input (e.g., data in an Amazon S3 bucket)
• Job output (e.g., model artifacts in an Amazon S3 bucket)
• Ephemeral compute cluster
• Invoke API [e.g. CreateTrainingJob()]
• Job code and runtime packaged as a Docker container
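The diagram corresponds to a single CreateTrainingJob() call; a minimal boto3 sketch of that call is shown below (job name, role ARN, container image URI, instance type, and S3 URIs are all placeholders):

```python
# Sketch of the CreateTrainingJob() call behind the diagram above.
# Job name, role ARN, image URI, instance type, and S3 URIs are placeholders.
import boto3

sm = boto3.client("sagemaker")

sm.create_training_job(
    TrainingJobName="example-training-job",
    RoleArn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    AlgorithmSpecification={
        "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-training-image:latest",
        "TrainingInputMode": "File",           # job code/runtime ships as a Docker container
    },
    InputDataConfig=[{                         # job input: data in an S3 bucket
        "ChannelName": "train",
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://my-bucket/train/",
            "S3DataDistributionType": "FullyReplicated",
        }},
    }],
    OutputDataConfig={"S3OutputPath": "s3://my-bucket/output/"},   # job output: model artifacts in S3
    ResourceConfig={                           # ephemeral compute cluster, torn down after the job
        "InstanceType": "ml.g5.2xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 50,
    },
    StoppingCondition={"MaxRuntimeInSeconds": 3600},
)
```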