Slide 1

Slide 1 text

HIGH PERFORMANCE COMPUTING
RAJESH SINGH
THE INSTITUTE OF MATHEMATICAL SCIENCES, CHENNAI

Slide 2

Slide 2 text

THE PURPOSE OF COMPUTING IS INSIGHT, NOT NUMBERS.
RICHARD HAMMING

Slide 3

Slide 3 text

PLAN ▸MOTIVATION FOR HPC ▸APPLICATIONS

Slide 4

Slide 4 text

PLAN ▸MOTIVATION FOR HPC ▸APPLICATIONS

Slide 5

Slide 5 text

WHO NEEDS HPC
‣ Biology: drug discovery, modelling diseases, protein folding, etc.
‣ Computer-aided design: automobile design and testing
‣ Chemical engineering: design of molecules and processes
‣ Economics: risk analysis, automated trading
‣ Geophysics: oil and gas exploration
‣ Weather forecasting and aerodynamics
‣ Basic and applied research in universities
A high performance computer is a supercomputer that relies on the combined power of, sometimes, hundreds of thousands of individual processing units running in parallel.

Slide 6

Slide 6 text

HPC IN ACTION
Trucks account for a large share of a nation’s petroleum consumption. Half of the engine's energy input goes into countering aerodynamic drag! How can trucks be made more fuel efficient?
https://www.youtube.com/watch?v=TnzUapgfgeU

Slide 7

Slide 7 text

HPC IN ACTION
A 3D grid is used for simulations, run on HPC systems, that estimate the aerodynamic drag under various scenarios.
https://www.youtube.com/watch?v=TnzUapgfgeU

Slide 8

Slide 8 text

HPC IN ACTION Wind-tunnel experiments https://www.youtube.com/watch?v=TnzUapgfgeU

Slide 9

Slide 9 text

HPC IN ACTION https://www.youtube.com/watch?v=TnzUapgfgeU

Slide 10

Slide 10 text

NEED FOR HPC
Start the project on a laptop. Scale up to a better device, such as a fast multi-core machine. Single machines show performance saturation: they do not scale up beyond a point. For the largest models, we need to devise a better strategy. Moore's law alone does not help, as packing ever more transistors onto a chip leads to excessive heating, and you would have to keep your CPU in liquid nitrogen!

Slide 11

Slide 11 text

HPC CLUSTERS Use HPC clusters, so that the computation can be done in parallel

Slide 12

Slide 12 text

CORES, THREADS AND NODES
‣ Core: an individual processing unit
‣ Node: a collection of cores that share memory
‣ Many programs can run at the same time on the many cores of a node
‣ Within one program, each independent stream of execution is called a thread; threads can run on different cores while sharing the program's memory
‣ This kind of multi-threaded programming can be done with OpenMP
‣ OpenMP is supported by almost all compilers but runs within a single node, while MPI can run across several nodes
‣ In practice, MPI is used between nodes and OpenMP within a node.
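The idea of several threads working inside one node's shared memory can be sketched in a few lines. The example below is only an illustration in Python, not OpenMP itself: the function name `threaded_sum` and its parameters are hypothetical, and in standard CPython the global interpreter lock prevents a real speed-up for this kind of pure-Python work. What it does show is the structure of a shared-memory parallel loop followed by a reduction, which is the pattern an OpenMP parallel loop would express.

```python
from concurrent.futures import ThreadPoolExecutor


def threaded_sum(values, n_threads=4):
    """Sum `values` using n_threads threads that share one list in memory."""
    partial = [0] * n_threads  # shared memory: every thread sees this list
    chunk = (len(values) + n_threads - 1) // n_threads  # ceiling division

    def work(tid):
        # Each thread reads its own slice of the shared input and writes
        # only to its own slot of `partial`, so no locking is needed.
        start = tid * chunk
        partial[tid] = sum(values[start:start + chunk])

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(work, range(n_threads)))  # wait for all threads

    return sum(partial)  # final reduction on the main thread


if __name__ == "__main__":
    print(threaded_sum(list(range(1_000_000))))
```

No data is copied between the threads here: they all read the same input list, which is only possible because they live on the same node.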

Slide 13

Slide 13 text

PROGRAMMING INTERFACES
‣ OpenMP is used for shared-memory architectures, e.g. multiple threads within a node
‣ Message Passing Interface (MPI) is used for distributed-memory architectures
‣ Here, the computation happens on each node and data is transferred between nodes as messages
‣ CUDA uses CUDA-enabled graphics processing units (GPUs) for general-purpose processing (GPGPU)
‣ GPUs operate at lower clock frequencies than CPUs but have many more cores, which makes them faster for highly parallel workloads
‣ CUDA does not support the full C standard, while both OpenMP and MPI do
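The distributed-memory style can be sketched as follows. This is a toy model, not MPI: the "ranks" are ordinary Python threads, the mailboxes are `queue.Queue` objects, and the names `distributed_sum`, `inbox` and `results` are made up for illustration. The point is the discipline: each rank works only on data it has received as a message and returns its result as a message, much as a real MPI program would do with calls such as `MPI_Scatter` and `MPI_Reduce`.

```python
import queue
import threading


def distributed_sum(values, n_ranks=4):
    """Sum `values` by message passing between hypothetical 'ranks'."""
    inbox = [queue.Queue() for _ in range(n_ranks)]  # one mailbox per rank
    results = queue.Queue()  # mailbox in which the root collects partial sums

    def rank(rid):
        local = inbox[rid].get()  # receive this rank's private chunk
        results.put(sum(local))   # send the partial sum back to the root

    workers = [threading.Thread(target=rank, args=(r,)) for r in range(n_ranks)]
    for w in workers:
        w.start()

    # "Scatter": the root sends each rank its own copy of one chunk.
    chunk = (len(values) + n_ranks - 1) // n_ranks  # ceiling division
    for r in range(n_ranks):
        inbox[r].put(list(values[r * chunk:(r + 1) * chunk]))

    # "Reduce": the root receives one partial sum from every rank.
    total = sum(results.get() for _ in range(n_ranks))
    for w in workers:
        w.join()
    return total


if __name__ == "__main__":
    print(distributed_sum(list(range(100)), n_ranks=3))
```

On a real cluster the copies made in the scatter step travel over the network, which is why communication cost matters in distributed-memory programs.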

Slide 14

Slide 14 text

FLOPS
‣ Floating point operations per second (FLOPS)
‣ A floating point operation is a mathematical operation such as addition or multiplication on real (non-integer) numbers
‣ A calculator that can do 10 FLOPS is considered functional
‣ FLOPS = number of cores × clock frequency × operations per cycle
‣ A four-core Intel processor at 2.5 GHz gives around 10 GFLOPS (10 × 10^9 FLOPS), taking one operation per cycle per core
‣ The fastest supercomputers operate at the scale of petaFLOPS (10^15 FLOPS)
‣ FLOPS can be quoted at different levels of precision: single or double
‣ Double precision (64 bits) takes twice the memory of single precision (32 bits) and typically takes longer to compute.
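The peak-performance formula on this slide is easy to check numerically. The sketch below uses a hypothetical helper `peak_flops` and assumes one floating point operation per cycle per core, which is the value that reproduces the slide's figure of 10 GFLOPS; actual processors may differ. It also uses the standard `struct` module to confirm the storage size of single and double precision numbers.

```python
import struct


def peak_flops(cores, frequency_hz, ops_per_cycle):
    """Theoretical peak: cores x clock frequency x operations per cycle."""
    return cores * frequency_hz * ops_per_cycle


# Standard sizes in bytes of single ("f") and double ("d") precision floats.
SINGLE_BYTES = struct.calcsize("<f")
DOUBLE_BYTES = struct.calcsize("<d")

if __name__ == "__main__":
    # Assumption: one operation per cycle per core, as implied by the slide.
    peak = peak_flops(cores=4, frequency_hz=2.5e9, ops_per_cycle=1)
    print(f"{peak / 1e9:.0f} GFLOPS")  # prints "10 GFLOPS"
    print(SINGLE_BYTES * 8, DOUBLE_BYTES * 8)  # prints "32 64"
```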

Slide 15

Slide 15 text

PLAN ▸MOTIVATION FOR HPC ▸APPLICATIONS

Slide 16

Slide 16 text

APPLICATIONS
Weather forecasting
‣ Divide the atmosphere into cells of 1 km x 1 km x 1 km, giving around 10^9 cells
‣ If we assume 100 floating point operations per cell, each time step needs around 10^11 operations
‣ A forecast for 10 days in steps of 10 min takes about 10^3 time steps, and so around 10^14 operations in total
Astronomy and protein folding
‣ A large number of operations is again needed for problems in astronomy, where every body attracts every other body
‣ So a system of N bodies needs of order N^2 operations per time step. A galaxy has 10^11 stars!
‣ A similar problem arises in accounting for electrostatic interactions in protein folding
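The estimates on this slide can be reproduced with a few lines of arithmetic. The function names below are hypothetical and the default arguments are simply the slide's own assumptions (10^9 cells, 100 operations per cell, a 10-day forecast in 10-minute steps). The N-body count uses the number of distinct pairs, N(N-1)/2, which grows as N^2.

```python
def weather_operations(cells=10**9, ops_per_cell=100, days=10, step_minutes=10):
    """Total floating point operations for the forecast on the slide."""
    steps = days * 24 * 60 // step_minutes  # 1440 steps for the defaults
    return cells * ops_per_cell * steps


def pairwise_interactions(n_bodies):
    """Number of distinct pairs among n_bodies; grows as n_bodies**2."""
    return n_bodies * (n_bodies - 1) // 2


if __name__ == "__main__":
    total = weather_operations()
    print(f"{total:.2e} operations")               # prints "1.44e+14 operations"
    # Assumption: a machine sustaining 10 GFLOPS, as in the FLOPS slide.
    print(f"{total / 1e10 / 3600:.1f} hours")      # prints "4.0 hours"
    print(f"{pairwise_interactions(10**11):.1e}")  # prints "5.0e+21"
```

Even at the 10 GFLOPS quoted for a four-core machine, the forecast alone would take hours, and one time step of a galaxy-scale N-body problem is far beyond that.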

Slide 17

Slide 17 text

APPLICATIONS
‣ Automobile engineers can test their prototypes in computer simulations rather than actually building and crashing them.
‣ This can reduce costs in automobile design and is of paramount importance in the design of aeroplanes, etc.
‣ The website of an e-commerce company like Flipkart may get thousands of hits per second
‣ An HPC system can be used there to ensure that the web server responds to customers without delay
‣ HPC can also be used for scientific visualisation, which leads to better insights.

Slide 18

Slide 18 text

TOP 10 SUPERCOMPUTERS top500.org

Slide 19

Slide 19 text

CPU top500.org

Slide 20

Slide 20 text

SHARE OF SUPERCOMPUTERS top500.org

Slide 21

Slide 21 text

Thank you for your attention!