Slide 1

Slide 1 text

Hello! Parallel Computing Service!
JAWS-HPC #20
2024/9/12
Hiroshi Kobayashi
HPC Solutions Architect
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Amazon Confidential and Trademark.

Slide 2

Slide 2 text

AWS HPC portfolio
• Automation and orchestration: AWS PCS, AWS ParallelCluster, AWS Batch, Amazon SageMaker
• Access and visualization: RES on AWS, NICE DCV, Amazon AppStream 2.0, Amazon WorkSpaces Family
• AWS HPC-optimized instances
  • HPC-optimized: Hpc7a, Hpc7g, Hpc6id
  • Accelerators: Trn1, P5, G5, DL1, F1, Inf2, VT1
  • Compute, memory, and networking: M7i, C7gn, C7g, C7i, R7i, X2iezn, C5n, M5zn

Slide 3

Slide 3 text

AWS PCS
• Orchestration: Access cloud resources at scale
• Job management: Use common job schedulers (using Slurm)
• Easy migration: Migrate HPC workloads without any code or script changes

Slide 4

Slide 4 text

Key capabilities of AWS PCS
• Unified compute and remote visualization management
• Dynamic resource provisioning and scaling
• Ability to bring your own applications
• Managed updates and in-depth telemetry

Slide 5

Slide 5 text

Which workloads are best suited for AWS PCS?
• HTC and loosely coupled workloads
• Building scientific models
• Tightly coupled workloads
• Accelerated computing

Slide 6

Slide 6 text

Overview
• Cluster: Assembly of compute nodes, file systems, and job queues, along with login nodes and workstations, hosting a scheduler.
• Compute node group: Collection of Amazon EC2 instances with a distinct configuration of instance types, networking, storage, software, and security.
• Queue: Virtual location where jobs are stored until the scheduler executes them on instances in compute node group(s).
• Login node group: Collection of Amazon EC2 instances where users can submit jobs or manage and visualize data.
• External resources: Customer-provided networked resources that support a cluster, like shared storage, directory, accounting database…
[Diagram: a cluster containing queues with jobs, compute node groups, and a login node group, alongside storage, accounting database*, LDAP directory, metrics, logs, Cost Explorer, and Budgets.]
* Not in GA

Slide 7

Slide 7 text

Service architecture
[Diagram]
• On-premises: end users (team 1, team 2) and directory services, connected over VPN or Direct Connect
• AWS account, customer VPC, private subnet:
  • Login Node Group 1 (C5, min = 1, max = 1), or BYO login nodes
  • Compute Node Group 1 (C5, min = 0, max = 20)
  • Node group configurations 1 and 2: Amazon machine image (AMI), AWS IAM role, Amazon EC2 launch template
  • AWS services/resources: S3 storage, license servers, databases, etc.
• PCS-managed service VPC: PCS cluster with Slurm controller, Queue 1, Slurm accounting DB*, PCS controller, replicas, etc., connected to the customer VPC through an ENI
• Flow: users SSH to a login node and submit jobs; jobs are queued; compute nodes are allocated
* Not in GA

Slide 8

Slide 8 text

Cost: Controller (head node)
PCS Pricing (Tokyo Region): https://aws.amazon.com/pcs/pricing/
Running a Small controller for one month in the Tokyo Region costs about 80,000 yen. Medium costs about 450,000 yen, and Large about 900,000 yen.

Slide 9

Slide 9 text

Cost: Node management fee (compute and login/viz)
PCS Pricing (Tokyo Region): https://aws.amazon.com/pcs/pricing/
Instance families such as Hpc, C, M, and R are classified as Standard; P and Trn are classified as Advanced.
The node management fee is charged on top of the EC2 instance cost.
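The pricing model on the two cost slides, a controller fee plus a node management fee on top of EC2, can be sketched as simple arithmetic. This is only an illustration: every hourly rate below is a hypothetical placeholder, not an actual PCS or EC2 price, and the sketch assumes both fees are billed per hour. Check the pricing page for real numbers.

```python
from dataclasses import dataclass
from typing import Iterable

HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


@dataclass
class NodeUsage:
    """Monthly usage of one node type (all values are hypothetical)."""
    node_hours: float        # total instance-hours consumed in the month
    ec2_hourly: float        # EC2 price per instance-hour
    mgmt_fee_hourly: float   # PCS node management fee per instance-hour


def monthly_cost(controller_hourly: float, usage: Iterable[NodeUsage]) -> float:
    """Controller fee for the whole month plus (EC2 + management fee) per node-hour."""
    controller = controller_hourly * HOURS_PER_MONTH
    nodes = sum(u.node_hours * (u.ec2_hourly + u.mgmt_fee_hourly) for u in usage)
    return controller + nodes


if __name__ == "__main__":
    # Placeholder rates only; they do not reflect the Tokyo Region price list.
    example = [NodeUsage(node_hours=1000, ec2_hourly=2.0, mgmt_fee_hourly=0.1)]
    print(monthly_cost(controller_hourly=1.0, usage=example))
```

The controller term is paid for every hour the cluster exists, while the node term scales with usage, which is why a node group with min = 0 only adds cost while jobs are running.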

Slide 10

Slide 10 text

OS & AMI
Supported OS: Amazon Linux 2 (AL2), RHEL 9, Rocky Linux 9, Ubuntu 22.04
https://docs.aws.amazon.com/pcs/latest/userguide/working-with_ami_installers.html#working-with_ami_installers_os
Sample AMI with Amazon Linux 2
https://docs.aws.amazon.com/pcs/latest/userguide/working-with_ami_samples.html
Custom AMI
1. Pick a supported OS
2. Install PCS agent and Slurm packages
3. Install additional apps/libs/drivers
4. Create AMI (and use that AMI on PCS)
Doc: https://docs.aws.amazon.com/pcs/latest/userguide/working-with_ami_custom.html
YouTube: https://youtu.be/3ysMkZrDlGI?si=WTEnx0fB5jdbECPT

Slide 11

Slide 11 text

Launch template
Using Amazon EC2 launch templates with AWS PCS
https://docs.aws.amazon.com/pcs/latest/userguide/working-with_launch-templates.html
User data
https://docs.aws.amazon.com/pcs/latest/userguide/working-with_ec2-user-data.html
1. Install software packages
2. Run scripts from an S3 bucket
3. Set global environment variables
4. Mount network storage (EFS, FSx)
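The four user-data tasks above can be sketched as one shell script wrapped in a MIME multi-part document and base64-encoded for a launch template. This sketch assumes the multi-part format is what the linked user-data page asks for, so verify it there. The package, bucket, environment variable, and EFS file system ID are hypothetical placeholders, and the script assumes an Amazon Linux 2 node.

```python
import base64
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical bootstrap script; every name and ID here is a placeholder.
SCRIPT = """#!/bin/bash
set -euo pipefail
# 1. Install software packages
yum install -y amazon-efs-utils
# 2. Run a script from an S3 bucket (hypothetical bucket and key)
aws s3 cp s3://example-bucket/setup.sh /tmp/setup.sh
bash /tmp/setup.sh
# 3. Set a global environment variable (hypothetical name and value)
echo 'export MY_APP_HOME=/shared/apps' > /etc/profile.d/my_app.sh
# 4. Mount network storage (hypothetical EFS file system ID)
mkdir -p /shared
mount -t efs fs-0123456789abcdef0:/ /shared
"""


def build_user_data(script: str) -> str:
    """Wrap a shell script in a MIME multi-part (multipart/mixed) document."""
    message = MIMEMultipart("mixed")
    message.attach(MIMEText(script, "x-shellscript", "us-ascii"))
    return message.as_string()


def encode_for_launch_template(user_data: str) -> str:
    """Base64-encode user data, as the EC2 launch template API expects."""
    return base64.b64encode(user_data.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    print(build_user_data(SCRIPT))
```

The encoded string would go into the `UserData` field of the launch template data. Keeping the script as one part leaves room to attach further parts later, for example a cloud-config section.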

Slide 12

Slide 12 text

PCS Demo!

Slide 13

Slide 13 text

Demo environment
1. Create VPC and subnets
2. Create cluster security group
3. Create a PCS cluster
4. Create shared storage
5. Create instance profile
6. Create launch templates
7. Create login node group
8. Create compute node group
9. Create queue
10. Log in to your cluster
11. Run jobs
https://docs.aws.amazon.com/pcs/latest/userguide/getting-started.html
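Steps 3 and 7 to 9 can also be driven through the API. The sketch below only builds request parameters, assuming the boto3 `pcs` client and its `create_cluster`, `create_compute_node_group`, and `create_queue` operations; the parameter names are written from memory of the API reference and should be checked against it. All names, IDs, ARNs, the Slurm version, and the instance size are hypothetical placeholders. The min/max counts follow the service architecture slide.

```python
def cluster_request(name, subnet_id, security_group_id):
    """Parameters for step 3 (create a PCS cluster)."""
    return {
        "clusterName": name,
        "scheduler": {"type": "SLURM", "version": "23.11"},  # version is an assumption
        "size": "SMALL",
        "networking": {
            "subnetIds": [subnet_id],
            "securityGroupIds": [security_group_id],
        },
    }


def node_group_request(cluster, name, ami_id, subnet_id, launch_template_id,
                       instance_profile_arn, instance_type, min_count, max_count):
    """Parameters for steps 7 and 8 (login and compute node groups)."""
    return {
        "clusterIdentifier": cluster,
        "computeNodeGroupName": name,
        "amiId": ami_id,
        "subnetIds": [subnet_id],
        "customLaunchTemplate": {"id": launch_template_id, "version": "1"},
        "iamInstanceProfileArn": instance_profile_arn,
        "scalingConfiguration": {
            "minInstanceCount": min_count,
            "maxInstanceCount": max_count,
        },
        "instanceConfigs": [{"instanceType": instance_type}],
    }


def queue_request(cluster, name, node_group_ids):
    """Parameters for step 9 (create a queue backed by compute node groups)."""
    return {
        "clusterIdentifier": cluster,
        "queueName": name,
        "computeNodeGroupConfigurations": [
            {"computeNodeGroupId": group_id} for group_id in node_group_ids
        ],
    }


if __name__ == "__main__":
    # Hypothetical IDs; with credentials configured, these dictionaries could be
    # passed on as, for example:
    #   import boto3
    #   boto3.client("pcs").create_cluster(**cluster_request(...))
    print(cluster_request("demo", "subnet-0123", "sg-0123"))
    print(node_group_request("demo", "compute-1", "ami-0123", "subnet-0123",
                             "lt-0123", "arn:aws:iam::111122223333:instance-profile/demo",
                             "c5.large", 0, 20))
    print(queue_request("demo", "queue-1", ["cng-0123"]))
```

Keeping the parameters in plain functions makes them easy to review before anything is created; a login node group would use the same request shape with min = 1 and max = 1.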

Slide 14

Slide 14 text

Takeaways
• PCS manages the cluster controller, which minimizes the cluster operations workload.
• PCS offers a unified set of APIs to help build and operate clusters supporting a range of HPC and scientific and engineering modeling workloads.
• PCS charges a fee for the controller and a node management fee for compute and login nodes, on top of the EC2 cost.
• You need to set up security groups, launch templates, IAM roles, networking, and so on so that they work together.

Slide 15

Slide 15 text

PCS resources
• Blog
  • AWS HPC Blog https://aws.amazon.com/blogs/hpc/
• YouTube
  • PCS series https://youtube.com/playlist?list=PL6tstO5J3TRGPTfz6C4XY3gT6Fg70nAPN&si=9_QlXr9z96wwraJJ
• Doc
  • User guide https://docs.aws.amazon.com/pcs/latest/userguide/what-is-service.html
  • API reference https://docs.aws.amazon.com/pcs/latest/APIReference/Welcome.html
• GitHub
  • HPC recipe https://github.com/aws-samples/aws-hpc-recipes/tree/main/recipes/pcs

Slide 16

Slide 16 text

Thank you!
Hiroshi Kobayashi
[email protected]