🐳 Running DeepSeek on AWS / DeepSeek on AWS (ja)

This event uses DeepSeek as a case study to offer practical guidance on deploying generative AI foundation models and large language models (LLMs).

We start with a technical overview of DeepSeek-R1-Zero, DeepSeek-R1, and its distilled models (DeepSeek-R1-Distill-Qwen, DeepSeek-R1-Distill-Llama). We then introduce the accelerators available for LLM deployment on AWS (NVIDIA GPU, AWS Trainium/Inferentia) and the AWS service options (Amazon Bedrock, Amazon SageMaker AI, Amazon EC2).

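The second half is a hands-on workshop using the distilled DeepSeek-R1 models, which also covers cost and performance optimization. Throughout the session we share insights that help shape an LLM deployment strategy.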

https://aws.amazon.com/startups/events/deepseek-workshop

Yoshitaka Haribara

February 19, 2025

Transcript

  1. 🐳 Running DeepSeek on AWS

    Yoshitaka Haribara, Ph.D., Sr. GenAI/Quantum Startup Solutions Architect, AWS. AWS Startup Loft Tokyo. © 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  2. Agenda

    • Background: Reasoning and Chain-of-Thought (CoT)
    • Overview of DeepSeek-R1 and the distilled models
    • AI accelerators: NVIDIA H200 GPU, AWS Trainium and Inferentia
    • Choosing an AWS service: Amazon Bedrock Marketplace, Amazon SageMaker AI, Amazon EC2
    • Hands-on: Amazon Bedrock Marketplace, Amazon SageMaker (ml.inf2.xlarge)
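  3. LLM training: pre-training and post-training

    As exemplified by Generative Pre-trained Transformer (GPT), an LLM is a language model that reaches strong generalization through self-supervised learning on large amounts of training data.
    Pre-training is an important paradigm: under the scaling law, performance improves monotonically as model size, data volume, and compute increase.
    Post-training such as Supervised Fine-tuning (SFT), together with in-context learning techniques typified by Chain-of-Thought (CoT), has gradually come into the spotlight, and higher performance is now achieved by spending more time on inference.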
  4. Reasoning

    Challenge: LLMs were originally weak at arithmetic and logical reasoning (cf. informal deductive reasoning).
    To overcome this, the following methods have been studied and proposed (each is covered on the following slides):
    • Chain-of-Thought (CoT)
    • Zero-shot CoT
    • CoT with Self-consistency (Multi-path CoT)
    • Tree of Thought (ToT)
  5. Chain-of-Thought (CoT) [arXiv:2201.11903]

    Having the LLM generate intermediate reasoning steps improves its logical reasoning ability.
    Model input:
    Q: Roger has 5 🎟. He buys 2 more cans of 🎟. Each can holds 3 🎟. How many 🎟 does he have now?
    A: Roger started with 5 🎟. He bought 2 cans of 3 each (6 🎟). 5 + 6 = 11, so the answer is 11.
    Q: The cafeteria had 23 🍏. It used 20 of them to prepare lunch and bought 6 more. How many 🍏 does the cafeteria have now?
    Model output:
    A: The cafeteria started with 23 🍏. It used 20 to prepare lunch, leaving 23 - 20 = 3. It bought 6 more, so 3 + 6 = 9. The answer is 9.
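The few-shot CoT prompt on this slide can be assembled mechanically: worked question/answer pairs first, then the new question with the answer left open. The following is a minimal sketch of that layout; `Exemplar` and `build_cot_prompt` are hypothetical helper names, not part of any library or of the original deck.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    """One worked example (hypothetical helper type for this sketch)."""
    question: str
    worked_answer: str  # includes the intermediate reasoning steps


def build_cot_prompt(exemplars: list[Exemplar], question: str) -> str:
    """Concatenate worked Q/A pairs, then leave the final answer open."""
    blocks = [f"Q: {e.question}\nA: {e.worked_answer}" for e in exemplars]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


roger = Exemplar(
    question="Roger has 5 🎟. He buys 2 more cans of 🎟. Each can holds 3 🎟. "
             "How many 🎟 does he have now?",
    worked_answer="Roger started with 5 🎟. He bought 2 cans of 3 each (6 🎟). "
                  "5 + 6 = 11, so the answer is 11.",
)
prompt = build_cot_prompt(
    [roger],
    "The cafeteria had 23 🍏. It used 20 to prepare lunch and bought 6 more. "
    "How many 🍏 does it have now?",
)
print(prompt)
```

The exemplar's visible reasoning is what nudges the model to write out its own intermediate steps before the final answer.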
  6. Zero-shot CoT [arXiv:2205.11916]

    Simply add "Let's think step by step".
    Model input:
    Q: John wants to buy 16 pieces of fruit. Half of the fruit are apples, and half of those apples are 🍏. How many 🍏 are there?
    A: Let's think step by step.
    Model output:
    There are 16 pieces of fruit in total.
    Half of the fruit are apples, so there are 8 apples.
    Half of the apples are 🍏, so there are 4 🍏.
  7. Multi-path CoT with Self-consistency [arXiv:2203.11171]

    Run several reasoning paths and derive the final result from their answers.
    Janet's 🪿 lay 16 🥚 per day. She eats 3 for breakfast every morning and uses 4 every day to bake muffins for her friends. She sells the remaining 🥚 for $2 each. How much does she earn per day?
    • She has 16 - 3 - 4 = 9 🥚 left, so she earns $2 × 9 = $18 per day.
    • She sells the remaining eggs for $2 × (16 - 4 - 3) = $26.
    • She eats 3 for breakfast, leaving 16 - 3 = 13. She then bakes muffins, leaving 13 - 4 = 9 eggs, so 9 eggs × $2 = $18.
    ⇒ The answer is $18.
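The aggregation step can be sketched as a majority vote over the final answers of the sampled paths. This is a simplified illustration, not the paper's code: the answer extraction (take the last number in each path) is an assumption made for this example, and both function names are hypothetical. The three paths are the ones on the slide, including the one that arrives at $26.

```python
import re
from collections import Counter
from typing import Optional


def extract_final_answer(path: str) -> Optional[str]:
    """Return the last number mentioned in a reasoning path, if any.

    Assumption for this sketch: the final answer is the last number in the text.
    """
    numbers = re.findall(r"\d+(?:\.\d+)?", path)
    return numbers[-1] if numbers else None


def self_consistent_answer(paths: list[str]) -> str:
    """Majority vote over the final answers of independently sampled paths."""
    answers = [a for a in map(extract_final_answer, paths) if a is not None]
    if not answers:
        raise ValueError("no path produced a numeric answer")
    return Counter(answers).most_common(1)[0][0]


paths = [
    "She has 16 - 3 - 4 = 9 eggs left, so she earns $2 x 9 = $18 per day.",
    "She sells the remaining eggs for $2 x (16 - 4 - 3) = $26.",
    "She eats 3 for breakfast, leaving 16 - 3 = 13. She then bakes muffins, "
    "leaving 13 - 4 = 9 eggs, so 9 eggs x $2 = $18.",
]
print(self_consistent_answer(paths))
```

Two of the three paths agree on 18, so the outlier is outvoted.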
  8. Tree of Thought (ToT) [arXiv:2305.08291, arXiv:2305.10601]

    Searches for a solution through a tree-structured thought process.
    (Figure from the ToT paper [arXiv:2305.10601].)
  9. DeepSeek-R1-Zero [arXiv:2501.12948]

    Challenge: using reinforcement learning for LLM post-training is computationally expensive.
    Starting from the pre-trained DeepSeek-V3-Base, DeepSeek-R1-Zero adopts Group Relative Policy Optimization (GRPO) [arXiv:2402.03300] as its reinforcement learning framework.
    (Figure adapted from DeepSeekMath [arXiv:2402.03300], annotated "Rule-based".)
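GRPO reduces that cost by dropping the separate value (critic) model: several outputs are sampled for the same prompt, and each output's reward is normalized against the group's mean and standard deviation. A minimal sketch of that normalization follows. The use of the population standard deviation and the small `eps` guard are assumptions of this sketch, and `group_relative_advantages` is a hypothetical name.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against its own sampling group.

    advantage_i = (r_i - mean(r)) / (std(r) + eps)
    eps (an assumption of this sketch) avoids division by zero when every
    sample in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt, scored 1.0 if correct and 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Correct samples receive a positive advantage and incorrect ones a negative advantage, using only statistics of the group itself.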
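  10. DeepSeek-R1 [arXiv:2501.12948]

    Challenge: poor readability and language mixing.
    Approach: a small amount of cold-start data before reinforcement learning, plus multi-stage training:
    • Collect thousands of cold-start examples to fine-tune DeepSeek-V3-Base
    • Apply reasoning-oriented reinforcement learning, as in DeepSeek-R1-Zero
    • When reinforcement learning is close to convergence, collect SFT data by rejection sampling, and combine it with DeepSeek-V3 supervised data from other domains (writing, factual QA, self-cognition)
    • Retrain DeepSeek-V3-Base (SFT)
    • After that, the checkpoint obtained by reinforcement learning across all scenarios is DeepSeek-R1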
  11. DeepSeek-R1-Distill (distilled models) [arXiv:2501.12948]

    Models distilled from DeepSeek-R1, in two families, Qwen and Llama:
    • DeepSeek-R1-Distill-Qwen (1.5B, 7B, 14B, 32B)
    • DeepSeek-R1-Distill-Llama (8B, 70B)
    https://github.com/deepseek-ai/DeepSeek-R1
  12. Weights for DeepSeek and several distilled models are public

    Base (V3) and R1 models (671B parameters):
    • DeepSeek-V3: base Mixture of Experts (MoE) model
    • DeepSeek-R1-Zero: reinforcement learning only
    • DeepSeek-R1: reinforcement learning after cold-start data
    Distilled models:
    • DeepSeek-R1-Distill-Qwen (1.5B, 7B, 14B, 32B)
    • DeepSeek-R1-Distill-Llama (8B and 70B)
    DeepSeek shows advanced logical reasoning ability across multiple tasks.
  13. Key capabilities

    • Strong logical reasoning for complex problem solving (applied to math and coding)
    • High scores on benchmarks such as AIME 2024, MATH-500, and SWE-bench
    • Mixture of Experts (MoE) architecture with 671B parameters
    • 37B activated parameters
    • Inference with DeepSeek-R1 requires at least 800 GB of HBM in FP8
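The 800 GB figure can be sanity-checked with back-of-the-envelope arithmetic: at FP8 each parameter takes one byte, so the weights alone occupy about 671 GB. The sketch below assumes the remainder of the 800 GB is headroom for KV cache and activations (an assumption, not a statement from the deck), and takes the per-instance GPU memory totals from the P5 instance table later in this deck.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for the model weights alone, in decimal GB."""
    return num_params * bytes_per_param / 1e9


TOTAL_PARAMS = 671e9       # DeepSeek-R1 parameter count from the slide
REQUIRED_HBM_GB = 800      # minimum HBM for FP8 inference, from the slide

fp8_weights_gb = weight_memory_gb(TOTAL_PARAMS, 1)  # FP8: 1 byte per parameter
headroom_gb = REQUIRED_HBM_GB - fp8_weights_gb      # assumed: KV cache, activations

# Aggregate GPU memory per instance, as listed in the P5 instance table.
instance_hbm_gb = {"P5 (8x H100)": 640, "P5e (8x H200)": 1128, "P5en (8x H200)": 1128}
fits = {name: hbm >= REQUIRED_HBM_GB for name, hbm in instance_hbm_gb.items()}

print(f"FP8 weights: {fp8_weights_gb:.0f} GB, headroom: {headroom_gb:.0f} GB")
print(fits)
```

Under these numbers a single P5 instance falls short, while P5e and P5en clear the bar, which is consistent with the deck requesting ml.p5e.48xlarge quota for R1.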
  14. Model Options

    • Distilled models that keep the core capabilities while reducing latency and cost
    • Multiple sizes to choose from according to requirements
    • DeepSeek-R1-Distill-Qwen (1.5B, 7B, 14B, 32B)
    • DeepSeek-R1-Distill-Llama (8B, 70B)
  15. EC2 accelerated instances for AI/ML

    • GPUs (H100, H200, B200, GB200, A100, L40S, L4, A10G): G5 (A10G), G6 (L4), G6e (L40S), P4 (A100), P5 (H100), P5e (H200), P5en (H200); announced: B200, GB200
    • AI/ML accelerators and ASICs: Inf1, Inf2, Trn1, Trn2 (AWS Trainium, Inferentia); DL1 (Gaudi accelerator); DL2q (Cloud AI100)
    • Also shown in the figure legend: Standard Radeon GPU, Xilinx accelerator, Xilinx FPGA
  16. Accelerated compute architecture

    (Diagram: a host layer with CPUs, NSC, EBS, EFA, and SSDs; a PCIe switching layer; and an accelerator layer of ML chips linked by an ML chip interconnect.)
  17. P5 instances

    Optimized for AI training and inference.
    • 900 GB/s NVSwitch for GPU peer-to-peer connections
    • Scale-out with non-blocking interconnect, Elastic Fabric Adapter (EFA)

    | Instance | GPU | GPU memory | CPU | vCPU | Instance memory | Networking | Local storage |
    | P5 | 8 NVIDIA H100 | 640 GB | AMD Milan | 192 | 2 TB | 3200 Gbps EFAv2 | 30 TB SSD |
    | P5e | 8 NVIDIA H200 | 1128 GB | AMD Milan | 192 | 2 TB | 3200 Gbps EFAv2 | 30 TB SSD |
    | P5en | 8 NVIDIA H200 | 1128 GB | Intel SPR | 192 | 2 TB | 3200 Gbps EFAv3 | 30 TB SSD |
  18. G5/G6 instances

    Compute and graphics optimized GPUs.
    • Flexibility with multiple instance sizes
    • Great for single GPU or single node workloads

    | Instance | GPU | GPU memory | CPU | vCPU | Instance memory | Networking | Local storage |
    | G5 | Up to 8 NVIDIA A10G | Up to 192 GB | AMD Rome | Up to 192 | Up to 768 GB | Up to 100 Gbps | Up to 7.6 TB SSD |
    | G6 | Up to 8 NVIDIA L4 | Up to 192 GB GDDR6 | AMD Milan | Up to 192 | Up to 768 GB | Up to 100 Gbps | Up to 7.6 TB SSD |
    | G6e | Up to 8 NVIDIA L40S | Up to 384 GB GDDR6 | AMD Milan | Up to 192 | Up to 1.536 TB | Up to 400 Gbps | Up to 7.6 TB SSD |
  19. Bedrock Marketplace implementation

    • DeepSeek-R1 and the distilled models are available in Bedrock Marketplace
    • Simple deployment
    • Bedrock security and monitoring features are available
  20. Bedrock Marketplace delivers 100+ models from 30+ providers

    Providers shown: EVOLUTIONARY SCALE, WIDN, CAMB.AI, GRETEL, ARCEE AI, PREFERRED NETWORKS, WRITER, UPSTAGE, NCSOFT, STOCKMARK, KARAKURI, JOHN SNOW LABS, LIQUID, DATABRICKS, CYBERAGENT, HUGGING FACE, STABILITY AI, LG AI RESEARCH, MISTRAL AI, SNOWFLAKE, NVIDIA, DEEPSEEK
  21. Preparation: request a quota increase before deploying (R1: ml.p5e.48xlarge)
  22. Step 1: Find the DeepSeek-R1 model in the Model catalog
  23. Step 2: Deploy
  24. Step 3: Try it out in the Playground or through the InvokeModel API
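For the InvokeModel route, a call through boto3 might look like the sketch below. The request body schema (`inputs` plus `parameters`), the default parameter values, and the region are assumptions of this sketch and should be checked against the deployed model's documentation; `build_request_body` and `invoke` are hypothetical helper names. The endpoint ARN is the one shown under Marketplace deployments after Step 2.

```python
import json


def build_request_body(prompt: str, max_new_tokens: int = 512,
                       temperature: float = 0.6) -> str:
    """Serialize the request. The schema here is an assumption of this sketch."""
    return json.dumps({
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens, "temperature": temperature},
    })


def invoke(endpoint_arn: str, prompt: str, region: str = "us-west-2") -> dict:
    """Call the Marketplace deployment via the Bedrock runtime InvokeModel API."""
    import boto3  # imported lazily so build_request_body works without the AWS SDK

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=endpoint_arn,  # ARN of the Marketplace deployment
        body=build_request_body(prompt),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())


if __name__ == "__main__":
    # Inspect the payload locally; calling invoke() requires AWS credentials.
    print(build_request_body("How many r's are in 'strawberry'?", max_new_tokens=64))
```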
  25. Hands-on 1: Amazon Bedrock Marketplace

    • Open Amazon Bedrock
    • Open the Model catalog
    • Type DeepSeek to filter
    • Select a model such as DeepSeek-R1-Distill-Llama-8B and deploy it
    • Select the model listed under Marketplace deployments and open the Playground
    • https://gist.github.com/hariby/c6b4d1f7ceee8e976b8b752c388d7ae5
  26. Tips: use the appropriate chat template (model tokenizer)

    In the Bedrock Playground, the prompt needs the appropriate chat template tags:
        <begin▁of▁sentence><User>A man has 53 socks in his drawer: 21 identical blue, 15 identical black and 17 identical red. The lights are out, and he is completely in the dark. How many socks must he take out to make 100 percent certain he has at least one pair of black socks?<Assistant>
    With the InvokeModel API, the prompt needs to be built with the appropriate tokenizer:
        from transformers import AutoTokenizer  # Hugging Face Transformers
        tokenizer = AutoTokenizer.from_pretrained(hf_model_id)
        messages = [{"role": "user", "content": test_prompt}]
        prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=not continuation)
    (Figure labels: Bad quality output / Good quality output.)
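For pasting into the Playground, where no tokenizer is at hand, the single-turn wrapping can be done by hand. In the sketch below the tag strings are copied from this slide as transcribed, which is an assumption: the model's own tokenizer configuration is the authoritative source for the exact special tokens. `wrap_single_turn` is a hypothetical helper.

```python
# Tag strings as transcribed from the slide (assumption: verify against the
# model's tokenizer configuration before relying on them).
BOS_TAG = "<begin▁of▁sentence>"
USER_TAG = "<User>"
ASSISTANT_TAG = "<Assistant>"


def wrap_single_turn(user_message: str) -> str:
    """Wrap one user message so the model continues as the assistant."""
    return f"{BOS_TAG}{USER_TAG}{user_message.strip()}{ASSISTANT_TAG}"


wrapped = wrap_single_turn(
    "A man has 53 socks in his drawer: 21 identical blue, 15 identical black "
    "and 17 identical red. How many socks must he take out to make 100 percent "
    "certain he has at least one pair of black socks?"
)
print(wrapped)
```

When code is available, building the prompt with the tokenizer's chat template is the safer choice, since it tracks the model's actual template.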
  27. Responsible AI when using DeepSeek-R1

    Amazon Bedrock Guardrails (through the ApplyGuardrail API) can provide an extra layer of security and responsible AI measures.
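Because ApplyGuardrail is independent of model invocation, it can screen a prompt before it reaches a self-deployed DeepSeek endpoint, or screen the model's output afterwards. A minimal sketch with boto3 follows; the guardrail ID, version, and region are placeholders for a guardrail created beforehand, and the helper names are hypothetical.

```python
def build_guardrail_request(guardrail_id: str, guardrail_version: str,
                            text: str, source: str = "INPUT") -> dict:
    """Keyword arguments for the bedrock-runtime apply_guardrail call."""
    if source not in ("INPUT", "OUTPUT"):
        raise ValueError("source must be 'INPUT' or 'OUTPUT'")
    return {
        "guardrailIdentifier": guardrail_id,
        "guardrailVersion": guardrail_version,
        "source": source,  # INPUT screens prompts, OUTPUT screens model responses
        "content": [{"text": {"text": text}}],
    }


def is_blocked(response: dict) -> bool:
    """True when the guardrail intervened on the checked text."""
    return response.get("action") == "GUARDRAIL_INTERVENED"


def check_text(guardrail_id: str, guardrail_version: str, text: str,
               source: str = "INPUT", region: str = "us-west-2") -> dict:
    """Run the guardrail against text without invoking any model."""
    import boto3  # imported lazily so the helpers above work without the AWS SDK

    client = boto3.client("bedrock-runtime", region_name=region)
    return client.apply_guardrail(
        **build_guardrail_request(guardrail_id, guardrail_version, text, source)
    )


if __name__ == "__main__":
    # "gr-example" and "1" are placeholders, not real identifiers.
    print(build_guardrail_request("gr-example", "1", "user prompt to screen"))
```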
  28. Enterprise Protection

    • Enterprise-grade security features built-in
    • Complete data privacy when using AWS services
    • No data sharing with model providers
    • End-to-end encryption for all operations
    • Access controls and governance features
    • Compliance with AWS security standards
  29. Critical Concerns

    • Models hosted by AWS without any communication with DeepSeek servers or APIs
    • No customer data used to improve base models
    • Enterprise data protection capabilities
    • Privacy control through AWS services
  30. Custom Model Import implementation

    • Bedrock Custom Model Import enables DeepSeek deployment
    • Support for the Llama 8B and 70B distilled DeepSeek-R1 variants
    • Complete code samples and step-by-step deployment guides provided for quick implementation
    • Standard Bedrock security and monitoring features
    • Pricing is on-demand, billed in 5-minute windows starting from the first successful invocation
    • Expect cold-start time and scale-up/scale-down time
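Starting an import is a single control-plane call once the model weights are in Amazon S3. The sketch below assumes the weights were already uploaded in a supported format; the job name, model name, role ARN, bucket path, and region are placeholders, and the helper names are hypothetical.

```python
def build_import_job_request(job_name: str, model_name: str,
                             role_arn: str, s3_uri: str) -> dict:
    """Keyword arguments for the bedrock create_model_import_job call."""
    if not s3_uri.startswith("s3://"):
        raise ValueError("s3_uri must start with s3://")
    return {
        "jobName": job_name,
        "importedModelName": model_name,
        "roleArn": role_arn,  # IAM role that Bedrock assumes to read the bucket
        "modelDataSource": {"s3DataSource": {"s3Uri": s3_uri}},
    }


def start_import(request: dict, region: str = "us-west-2") -> str:
    """Start the import job and return its ARN."""
    import boto3  # imported lazily so the builder works without the AWS SDK

    client = boto3.client("bedrock", region_name=region)
    return client.create_model_import_job(**request)["jobArn"]


if __name__ == "__main__":
    # All identifiers below are placeholders for illustration.
    print(build_import_job_request(
        job_name="deepseek-r1-distill-llama-8b-import",
        model_name="deepseek-r1-distill-llama-8b",
        role_arn="arn:aws:iam::123456789012:role/ExampleBedrockImportRole",
        s3_uri="s3://example-bucket/DeepSeek-R1-Distill-Llama-8B/",
    ))
```

Once the job completes, the imported model is called through the Bedrock runtime with its model ARN; the first request after idle time may be slow because of the cold start noted above.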
  31. Customizing & Deploying models with Amazon SageMaker AI

    Build, train, and deploy ML models, including FMs, for any use case with fully managed infrastructure, tools, and workflows. For data scientists and ML engineers.
    • Select: choice of hundreds of models from SageMaker JumpStart; Bring Your Own Model; Bring Your Own Container
    • Evaluate: automated and human evaluation of models
    • Customize: customize models to your use cases by pre-training, fine-tuning, model distillation, etc.
    • Deploy: optimize and deploy for inference
  32. Trn1/Trn2 instances

    Powered by AWS Trainium/Trainium2 custom ML chips.
    • Optimized for large-scale distributed training workloads
    • Trn2 UltraServers with extended NeuronLink for trillion-parameter AI
    • Neuron Kernel Interface (NKI) for custom operators

    | Instance | Accelerators | Accelerator memory | vCPU | Instance memory | Networking |
    | trn1.32xlarge | 16 | 512 GB | 128 | 512 GB | 800 Gbps EFAv2 |
    | trn1n.32xlarge | 16 | 512 GB | 128 | 512 GB | 1600 Gbps EFAv2 |
    | trn2.48xlarge | 16 | 1.5 TB | 192 | 2 TB | 3.2 Tbps EFAv3 |
  33. AWS Trainium/Inferentia architecture

    • Tensor engines are based on a power-optimized systolic array
    • AWS Neuron SDK supports typical architectures such as Llama
  34. Hands-on 2: Amazon SageMaker AI

    • Open the SageMaker AI page
    • Create a Notebook instance (a CPU instance is fine)
    • On Hugging Face, pick the model you want to use and copy the Python script from Deploy > Amazon SageMaker
      https://huggingface.co/collections/deepseek-ai/deepseek-r1
    • Example: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
    • From the Notebook instance, deploy the DeepSeek-R1-Distill model to an endpoint with the SageMaker Python SDK
  35. Hands-on 2: Amazon SageMaker AI (continued)

    • Paste the Python script into a Jupyter Notebook on the SageMaker Notebook instance
    • huggingface_model.deploy deploys the model; predictor.predict runs inference
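The script copied from Hugging Face typically has the shape sketched below. This is not the generated script itself: the environment variable names and values (core count, batch size, token limits) are assumptions of this sketch and need to match a configuration the Neuron container supports for the chosen model, so the copied script remains the reference. `build_neuron_env` and `deploy_and_predict` are hypothetical names; the instance type is the one named in the agenda.

```python
MODEL_ID = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"


def build_neuron_env(model_id: str, num_cores: int = 2, batch_size: int = 1,
                     max_input_tokens: int = 3686,
                     max_total_tokens: int = 4096) -> dict:
    """Container environment; names and values are assumptions of this sketch."""
    return {
        "HF_MODEL_ID": model_id,
        "HF_NUM_CORES": str(num_cores),
        "HF_AUTO_CAST_TYPE": "bf16",
        "MAX_BATCH_SIZE": str(batch_size),
        "MAX_INPUT_TOKENS": str(max_input_tokens),
        "MAX_TOTAL_TOKENS": str(max_total_tokens),
    }


def deploy_and_predict(prompt: str, instance_type: str = "ml.inf2.xlarge") -> object:
    """Deploy the model to a SageMaker endpoint and run one inference request."""
    # Imported lazily so build_neuron_env can be inspected without the SDK.
    import sagemaker
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    huggingface_model = HuggingFaceModel(
        image_uri=get_huggingface_llm_image_uri("huggingface-neuronx"),
        env=build_neuron_env(MODEL_ID),
        role=sagemaker.get_execution_role(),  # works on a SageMaker notebook
    )
    predictor = huggingface_model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        container_startup_health_check_timeout=1800,  # allow time for model load
    )
    return predictor.predict({"inputs": prompt, "parameters": {"max_new_tokens": 256}})


if __name__ == "__main__":
    print(build_neuron_env(MODEL_ID))
```

The endpoint bills while it is running, so deleting it after the hands-on (for example with `predictor.delete_endpoint()`) keeps costs down.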
  36. DeepSeek-R1 on AWS: summary

    1. Deploy DeepSeek-R1/Distill models with Amazon Bedrock Marketplace (Amazon SageMaker JumpStart)
    2. Deploy DeepSeek-R1-Distill models to Amazon SageMaker AI Inf2 instances
    3. Deploy DeepSeek-R1-Distill models with Amazon Bedrock Custom Model Import
    There is also a DeepSeek on AWS blog post:
    https://aws.amazon.com/jp/blogs/news/deepseek-r1-models-now-available-on-aws/
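  37. Practical Guide to Building Generative AI Apps for AWS (book)

    • Target readers: engineers considering full-scale use of generative AI
    • Explains the foundational concepts for building generative AI apps (prompt engineering, RAG, agents)
    • More practical hands-on exercises that apply those concepts (RAG, agents)
    • Also covers key points for production rollout (Responsible AI, Working Backwards, etc.)
    Scheduled for release around spring 2025; available for pre-order on Amazon.
    https://www.amazon.co.jp/dp/4296205234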
  38. Thank you!

    Yoshitaka Haribara, Ph.D. X: @_hariby
  39. Further reading

    • DeepSeek
      • Blog post by Anthropic CEO Dario Amodei
        https://darioamodei.com/on-deepseek-and-export-controls
    • Startup Customer Case Studies on AWS
      • Sakana AI
        https://aws.amazon.com/startups/learn/letting-nature-lead-how-sakana-ai-is-transforming-model-building?lang=en-US
      • ELYZA (Llama2 Speculative Decoding on AWS Inferentia2 chip)
        https://aws.amazon.com/jp/blogs/startup/tech-interview-elyza-2024/
      • LLM Development on Trn1
        https://aws.amazon.com/jp/blogs/machine-learning/unlocking-japanese-llms-with-aws-trainium-innovators-showcase-from-the-aws-llm-development-support-program/