Slide 1

Slide 1 text

Classified as Microsoft Confidential
Zürich Meetup
Azure High Performance Computing
Dr. Darko Mocelj
Simplify and optimize HPC deployments with Microsoft Azure

Slide 2

Slide 2 text

© Copyright Microsoft Corporation. All rights reserved.
HPC in a nutshell

Slide 3

Slide 3 text

Why HPC matters

High Performance Computing (HPC) is a platform for groundbreaking scientific discoveries and game-changing innovation.

• Drives innovation across nearly every industry to solve scientific and real-life problems such as fluid dynamics, finite element analysis, weather modelling, gene research, financial risk and more
• Furnishes fast computation power and large-scale parallel processing for accurately processing large volumes of data
• Provides deep insights into business data for driving smarter simulations and empowering intelligent decision making
• Saves time and money by delivering faster results

Slide 4

Slide 4 text

Materials Science, Clinical Trial Simulation, Rocket Design, Self-driving Cars, Seismic and Reservoir modelling, Fundamental Science, Architecture & Engineering, Entertainment, Genomics, Drug Design, Environment Impact, Computational Chemistry, Quantum computing, Cancer Research, Circuit Design, Risk Management, Diagnostics, Crash Testing, Machine Learning, Product Design & Safety, Security/Encryption, Logistics, Data Science

Slide 5

Slide 5 text

High Performance Computing Solutions

Automotive, Oil & Gas, Ship engineering, Banking, Insurance, Energy, Defense & Aerospace, Pharmaceutical, Healthcare, Life Science, Weather forecasting, Chemical engineering, Engineering & construction

Graphics & rendering, Fluid dynamics, Structural simulations, Crash Simulations, Deep Learning, Genomics, Molecular Modelling, Risk Simulations, Electronic Design

Slide 6

Slide 6 text

Solutions for various workloads

• Loosely coupled: Large-scale, compute-intensive workloads, which can be run in parallel, taking advantage of the scale and flexibility of the cloud (FSI, Genomics, Physics, …)
• Tightly coupled: Solving the underlying mathematical model of a dynamic physical system in a highly iterative and closely coupled fashion (Seismic/Reservoir, Engineering, Weather, …)
• Hybrid and cloud bursting: Optimizing application workflows to benefit from both on- and off-premises resources (FSI, Engineering, Auto, …)
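The difference between loosely and tightly coupled workloads can be illustrated with a minimal Python sketch. The function names (`price_scenario`, `jacobi_step`) and the toy models are assumptions made for illustration; they are not taken from the slides.

```python
import random


def price_scenario(seed: int) -> float:
    """Loosely coupled: each scenario depends only on its own seed (toy model)."""
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, 1.0) for _ in range(1000)) / 1000


def run_loosely_coupled(seeds):
    # Every call is independent, so the calls could run on separate VMs
    # in any order without exchanging data.
    return [price_scenario(s) for s in seeds]


def jacobi_step(u):
    """Tightly coupled: each interior cell needs its neighbours' previous values."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new


def run_tightly_coupled(u, steps):
    # Each iteration depends on the complete result of the previous one. If the
    # grid were split across nodes, boundary values would have to be exchanged
    # at every step, which is where a fast interconnect matters.
    for _ in range(steps):
        u = jacobi_step(u)
    return u
```

In the first case the work can be spread over any number of machines with no communication; in the second, every step requires data from neighbouring partitions.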

Slide 7

Slide 7 text

Solve any HPC, AI workload at any scale

From small scale MPI (handful of cores) to extreme scale MPI (100k+ cores):

• A/B series VMs: Burstable virtual machines (VMs)
• General-purpose VMs (D/E/F)
  - D: Standard workloads
  - E: High memory
  - F: Compute-bound
• High memory VMs (L/M)
  - L: High SSD & IOPS
  - M: Extreme memory
• Specialized VMs (H/N), InfiniBand network interconnected (up to 400 Gb/s)
  - H: High memory
  - HB: Memory bandwidth
  - HC: Compute-bound
  - NC: GP-GPU compute
  - ND: Deep learning
  - NV: Graphics applications
  - NP: Programmable FPGA
• Cray in Azure
  - Managed custom bare-metal server
  - Large to extreme-scale HPC infrastructure
  - Azure network integration

Slide 8

Slide 8 text

Achieve more with Microsoft Azure HPC

• Purpose-built HPC: A full range of CPU and GPU capabilities that help applications scale to 80K+ cores
• Fast, secure networking: Fast InfiniBand interconnects as well as edge-to-cloud connectivity
• High performing storage: A range of storage capabilities to support simple-to-complex storage needs
• Workload orchestration: End-to-end workflow agility using known, familiar tools and processes
• Intelligence services: AI, machine learning, and deep learning at supercomputer scale

Slide 9

Slide 9 text

HPC FinOps – FSI Focus

Slide 10

Slide 10 text

Azure HPC Cost Optimization Strategies: Mix & Match

[Chart: data processing demand; fixed cost (on-premises) vs. variable cost (cloud)]

• Reduce infrastructure costs and use capacity on demand
• Respond in an agile way to new business demands
• Improve operations
• Manage bursts due to external events quickly and seamlessly

Slide 11

Slide 11 text

Azure HPC Cost Optimization Strategies: Mix & Match

• Leverage autoscaling to adjust consumed compute capacity
  [Chart: number of requests vs. average request execution time (seconds, minutes, hours)]
• Understand utilization patterns (refer to Azure Advisor) and cover the base utilization with Reserved Instances
• Spot compute instances can be evicted at any time: engineer for flexibility and portability
  - Flexibility in selecting different VM families (HBv3, Dasv5, Easv3, F-, D-, E-Series)
  - Portability* to different Azure regions with Spot instance availability
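The mix-and-match idea can be sketched as a small cost model: cover a base level of demand with reserved capacity and buy the remainder per hour. All prices and the demand profile below are made-up placeholders, not Azure prices, and `mixed_cost` and `best_baseline` are hypothetical helpers. The model deliberately ignores eviction risk.

```python
def mixed_cost(demand, reserved_vms, reserved_price, burst_price):
    """Total cost of a demand profile (VMs needed per hour).

    reserved_vms are paid for every hour whether used or not;
    demand above that level is bought per hour at burst_price.
    """
    hours = len(demand)
    reserved = reserved_vms * reserved_price * hours
    burst = sum(max(d - reserved_vms, 0) for d in demand) * burst_price
    return reserved + burst


def best_baseline(demand, reserved_price, burst_price):
    """Try every baseline from 0 to peak demand and return the cheapest."""
    candidates = range(0, max(demand) + 1)
    return min(candidates,
               key=lambda r: mixed_cost(demand, r, reserved_price, burst_price))


# Hypothetical profile: steady base load with a short burst.
demand = [10, 10, 10, 10, 40, 40, 10, 10]
baseline = best_baseline(demand, reserved_price=0.6, burst_price=1.0)
print(baseline, mixed_cost(demand, baseline, 0.6, 1.0))
```

With these placeholder numbers the cheapest baseline matches the steady base load, and only the burst is bought on demand.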

Slide 12

Slide 12 text

Azure Cost Optimization Strategies (FSI Focus)

Workload Examples
• Predictable demand, e.g., opening of new trading books
• Increased accuracy in risk models (replace approximations, increase Monte Carlo paths, etc.)
• Increased computational demand imposed by regulations (FRTB, XVA)

Spot:
• Grid workloads with mature cloud customers leveraging regional and compute flexibility
• Dev / Test workloads
• One-off regression tests
• Try-out of new architectures (GPU / CPU)

• Full flexibility
• React quickly to volatile markets and adjust computational capacity accordingly
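Why more Monte Carlo paths translate into more compute can be shown with a toy estimate: the standard error shrinks roughly with the square root of the number of paths, so halving the error costs about four times the paths. The payoff function below is an arbitrary placeholder, not a real risk model, and `mc_estimate` is a hypothetical helper.

```python
import math
import random


def mc_estimate(n_paths: int, seed: int = 0):
    """Return (mean, standard error) of a toy payoff over n_paths samples."""
    rng = random.Random(seed)
    # Placeholder payoff: max(Z, 0) for a standard normal draw Z.
    samples = [max(rng.gauss(0.0, 1.0), 0.0) for _ in range(n_paths)]
    mean = sum(samples) / n_paths
    var = sum((x - mean) ** 2 for x in samples) / (n_paths - 1)
    return mean, math.sqrt(var / n_paths)


for n in (2_000, 8_000, 32_000):
    mean, stderr = mc_estimate(n)
    print(f"{n:>6} paths: mean={mean:.4f} stderr={stderr:.4f}")
```

Each fourfold increase in paths should roughly halve the reported standard error.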

Slide 13

Slide 13 text

HPC – Know your environmental impact

Slide 14

Slide 14 text

Carbon-aware computing – impact on HPC operations
https://news.microsoft.com/de-ch/2023/01/10/carbon-aware-computing-whitepaper/

• Measure the carbon intensity of past workloads
• Predict future workloads
• Minimize the carbon footprint of your workloads

Measuring the Carbon Intensity of AI in Cloud Instances (arxiv.org)
https://github.com/Green-Software-Foundation/carbon-aware-sdk
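One way to act on carbon-intensity data is to shift a deferrable batch job to the window with the lowest forecast intensity. The sketch below assumes an hourly forecast in gCO2e/kWh is already available; the numbers and helper names are hypothetical, and the code does not call the Carbon Aware SDK.

```python
def best_start_hour(forecast, duration_hours):
    """Return the start index whose window has the lowest total intensity."""
    if not 0 < duration_hours <= len(forecast):
        raise ValueError("duration must fit inside the forecast")
    windows = range(len(forecast) - duration_hours + 1)
    return min(windows, key=lambda s: sum(forecast[s:s + duration_hours]))


def emissions_g(forecast, start, duration_hours, power_kw):
    """Estimated emissions: energy per hour (kWh) times intensity (gCO2e/kWh)."""
    return sum(power_kw * ci for ci in forecast[start:start + duration_hours])


# Hypothetical hourly forecast (gCO2e/kWh) and a 3-hour job drawing 50 kW.
forecast = [420, 390, 310, 180, 150, 210, 380, 450]
start = best_start_hour(forecast, duration_hours=3)
saved = emissions_g(forecast, 0, 3, 50) - emissions_g(forecast, start, 3, 50)
print(start, saved)
```

The same comparison applied to past runs gives a simple measure of what a workload emitted versus what it could have emitted at a better time.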

Slide 15

Slide 15 text

Azure HPC Services

Slide 16

Slide 16 text

Services for HPC Workload Management – Azure Batch focus of the Demo

Azure Batch: Cloud-native job scheduling
• HPC-as-a-Service model
• All HPC resources are cloud-based

Azure CycleCloud: Traditional cluster scheduler orchestration
• Support for third party schedulers
• Traditional HPC scaling methodology, but using Azure

Deployment models (diagram):
• Hybrid / cloud bursting model: HPC app, head node and on-prem compute; Azure CycleCloud; VM resource pool
• Cloud native model: HPC application on client workstation; Azure CycleCloud, head node; VM resource pool
• Cloud native model: HPC application on client workstation; Azure Batch; VM resource pool

Slide 17

Slide 17 text

Azure Batch Capabilities

• Developer access
  - SaaS platform for cloud-enabling workflows
  - .NET, Java, Node.js, Python, REST + common languages and frameworks
• Job scheduling
  - Easiest way to run batch jobs at scale in Azure
  - Detect and retry failed tasks
  - Task dependencies
  - Job prep and cleanup tasks
• Monitoring
  - VM monitoring and auto-recover
  - Metrics and logs available via Portal and API
• Autoscale
  - Native VM orchestration
  - Automatically scale environments up and down as jobs require
• Choice of VMs
  - Windows or Linux
  - Standard or custom images
  - Windows pool can use AHUB
  - Can use low-priority & Spot VMs
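Task dependencies and automatic retries can be illustrated with a small, self-contained scheduler. This is a conceptual sketch only: it uses Python's standard `graphlib` module (Python 3.9+), not the Azure Batch SDK, and `run_with_dependencies` and the task names are hypothetical.

```python
from graphlib import TopologicalSorter


def run_with_dependencies(tasks, depends_on, max_retries=2):
    """Run callables in dependency order, retrying each failed task.

    tasks:      {task_id: callable}
    depends_on: {task_id: set of task_ids that must finish first}
    Returns the order in which tasks completed.
    """
    graph = {task_id: depends_on.get(task_id, set()) for task_id in tasks}
    completed = []
    for task_id in TopologicalSorter(graph).static_order():
        for attempt in range(max_retries + 1):
            try:
                tasks[task_id]()
                completed.append(task_id)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # retries exhausted: surface the failure
    return completed


attempts = {"simulate": 0}


def flaky_simulate():
    # Fails on the first call to mimic a transient node failure.
    attempts["simulate"] += 1
    if attempts["simulate"] == 1:
        raise RuntimeError("transient failure")


order = run_with_dependencies(
    {"prepare": lambda: None, "simulate": flaky_simulate, "report": lambda: None},
    {"simulate": {"prepare"}, "report": {"simulate"}},
)
print(order)
```

A managed service takes this bookkeeping off the application: the caller declares dependencies and retry limits, and the scheduler enforces them.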

Slide 18

Slide 18 text

Azure Batch (example of a cloud native HPC service)

Batch pools
• Configure and create VMs to cater for any scale: tens to thousands
• Automatically scale the number of VMs to maximize utilization
• Easy low-priority and VM sizing, suited to your application

Batch jobs and tasks
• Task is a unit of execution; task = application command line (EXE, BAT, CMD, PS1, etc.)
• Jobs are created and tasks are submitted to a pool. Next, tasks are queued and assigned to VMs
• Any application, any execution time; run applications unchanged
• Automatic detection and retry of frozen or failing tasks
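The pool, job and task model described above (tasks are queued and assigned to VMs as they become free) can be mimicked with a short simulation. This is a simplified sketch, not how Azure Batch is implemented; `makespan` and the task durations are hypothetical.

```python
import heapq


def makespan(task_durations, pool_size):
    """Assign queued tasks to the next free VM; return total wall-clock time."""
    if pool_size < 1:
        raise ValueError("pool needs at least one VM")
    free_at = [0.0] * pool_size          # time at which each VM becomes idle
    heapq.heapify(free_at)
    for duration in task_durations:      # tasks leave the queue in order
        start = heapq.heappop(free_at)   # earliest available VM
        heapq.heappush(free_at, start + duration)
    return max(free_at)


durations = [3, 2, 2, 1]
for vms in (1, 2, 4):
    print(vms, makespan(durations, vms))
```

Running the same queue against different pool sizes shows the trade-off that autoscaling manages: more VMs shorten the run until the longest single task becomes the limit.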

Slide 19

Slide 19 text

Example of an Azure HPC Deployment (following Microsoft Cloud Adoption Framework for Azure)

• Microsoft Cloud Adoption Framework for Azure - Cloud Adoption Framework | Microsoft Learn
• What is an Azure landing zone? - Cloud Adoption Framework | Microsoft Learn

[Diagram labels: HPC; Connectivity / Shared Services; HPC Workload; Container Registry Subnet; Container Registry]

Slide 20

Slide 20 text

You have questions? Let’s stay in touch.