High performance computing

Rajesh Singh
February 05, 2016

An introduction to high performance computing

Transcript

  1. WHO NEEDS HPC

    ‣ Biology: drug discovery, disease modelling, protein folding, etc.
    ‣ Computer-aided design: automobile design and testing
    ‣ Chemical engineering: molecule design and processes
    ‣ Economics: risk analysis, automated trading
    ‣ Geophysics: oil and gas exploration
    ‣ Weather forecasting and aerodynamics
    ‣ Basic and applied research in universities
    A high performance computer is a supercomputer that relies on the combined power of many individual processing units, sometimes hundreds of thousands, running in parallel.
  2. HPC IN ACTION

    Trucks consume a great share of a nation’s petroleum. Half of the engine's energy goes into countering aerodynamic drag! How can trucks be made more fuel efficient?
    https://www.youtube.com/watch?v=TnzUapgfgeU
  3. HPC IN ACTION

    A 3D grid is used for simulations, run on HPC systems, that estimate aerodynamic drag under various scenarios.
    https://www.youtube.com/watch?v=TnzUapgfgeU
  4. NEED OF HPC

    Start the project on a laptop. Scale up to a better device, such as a fast multi-core machine. Single machines saturate in performance: they don’t scale up beyond a point. For the largest models, we need to devise a better strategy. Moore's law does not help: adding more transistors to a chip leads to excessive heating, to the point where you would have to keep your CPU in liquid nitrogen!
  5. CORES, THREADS AND NODES

    ‣ Core: a single processing unit
    ‣ Node: a collection of cores
    ‣ Many programs can run at the same time on the many cores of a particular node
    ‣ Each of these runs in its own thread of execution
    ‣ This kind of thread-based programming can be done with OpenMP
    ‣ OpenMP is supported by almost all compilers but runs within a single node, while MPI runs across several nodes
    ‣ In practice, MPI is used between nodes and OpenMP within a node.
  6. PROGRAMMING INTERFACES

    ‣ OpenMP is used for architectures where memory is shared, e.g. multiple threads within a node
    ‣ The Message Passing Interface (MPI) is used for distributed-memory architectures
    ‣ Here, the computation happens on each node and data is transferred between nodes as messages
    ‣ CUDA uses CUDA-enabled graphics processing units (GPUs) for general-purpose processing (GPGPU)
    ‣ GPUs operate at lower clock frequencies than CPUs but have many more cores, which makes them faster for highly parallel workloads
    ‣ CUDA does not support the full C standard, while both OpenMP and MPI work with standard C
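The difference between the shared-memory and message-passing models can be sketched in a few lines. This is only an analogy written with Python's standard library, not OpenMP or MPI: the function names, the toy workload, and the use of a queue as the "network" are all assumptions made for illustration. CPython threads also give no real speed-up for CPU-bound work, so the point here is the communication pattern, not performance.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

data = list(range(1, 1001))  # toy workload: sum the integers 1..1000


def shared_memory_sum(values, workers=4):
    """Shared-memory style: every thread indexes the same list in place."""
    size = (len(values) + workers - 1) // workers
    bounds = [(lo, min(lo + size, len(values)))
              for lo in range(0, len(values), size)]

    def partial(bound):
        lo, hi = bound
        return sum(values[k] for k in range(lo, hi))  # reads shared data

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial, bounds))  # reduction of partial sums


def message_passing_sum(values, workers=4):
    """Message-passing style: each worker owns a private copy of its chunk
    and communicates only by sending one message with its partial result."""
    size = (len(values) + workers - 1) // workers
    local_chunks = [values[lo:lo + size]  # slicing makes a private copy
                    for lo in range(0, len(values), size)]
    inbox = queue.Queue()  # stands in for the network between nodes

    def worker(rank, chunk):
        inbox.put((rank, sum(chunk)))  # the only communication step

    threads = [threading.Thread(target=worker, args=(rank, chunk))
               for rank, chunk in enumerate(local_chunks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    total = 0
    while not inbox.empty():
        _rank, part = inbox.get()
        total += part
    return total


print(shared_memory_sum(data), message_passing_sum(data))
```

Both functions return the same sum. In the first, no data is copied; in the second, each worker sees only its own chunk, which mirrors how data must be distributed and results gathered across nodes.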
  7. FLOPS

    ‣ FLOPS: floating point operations per second
    ‣ A floating point operation is a mathematical operation such as addition or multiplication on floating point numbers
    ‣ A calculator that can do 10 FLOPS is considered functional
    ‣ FLOPS = number of cores * frequency * operations per cycle
    ‣ A four-core Intel processor at 2.5 GHz gives around 10 GFLOPS (10*10^9 FLOPS), assuming one operation per cycle per core
    ‣ The fastest supercomputers reach petaFLOPS (10^15 FLOPS)
    ‣ FLOPS can be quoted at different levels of precision: single or double
    ‣ Double precision (64 bits) takes more memory than single precision (32 bits), and calculations take longer.
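The peak-FLOPS formula on this slide can be checked with a short calculation. The sketch below is illustrative: the function name is hypothetical, and one operation per cycle per core is an assumption chosen because it reproduces the slide's 10 GFLOPS figure (real processors can often do more per cycle). The second part uses Python's `struct` module to show the storage cost of single versus double precision.

```python
import struct


def peak_flops(cores, frequency_hz, ops_per_cycle):
    """Theoretical peak: cores * frequency * operations per cycle."""
    return cores * frequency_hz * ops_per_cycle


# Four cores at 2.5 GHz, ASSUMING one operation per cycle per core.
quad_core = peak_flops(cores=4, frequency_hz=2.5e9, ops_per_cycle=1)
print(f"{quad_core / 1e9:.0f} GFLOPS")  # prints "10 GFLOPS"

# Storage per number: IEEE single precision vs double precision.
single_bytes = struct.calcsize("=f")  # 4 bytes (32 bits)
double_bytes = struct.calcsize("=d")  # 8 bytes (64 bits)

# Memory for a hypothetical array of one million values.
n_values = 10**6
print(n_values * single_bytes, "bytes in single precision")
print(n_values * double_bytes, "bytes in double precision")
```

The same array therefore needs twice the memory in double precision, which is one reason the choice of precision matters for large simulations.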
  8. APPLICATIONS

    Weather forecasting
    ‣ Divide the atmosphere into cells of 1 km x 1 km x 1 km, giving around 10^9 cells
    ‣ If we assume 100 floating point operations per cell, each time step needs around 10^11 operations
    ‣ Forecasting at intervals of 10 min for 10 days (about 1,440 steps) would then require around 10^14 operations
    ‣ Problems in astronomy, where bodies attract each other, also need a large number of operations
    ‣ Every body interacts with every other, so a system of N bodies needs of order N^2 operations per step. A galaxy has around 10^11 stars!
    ‣ A similar problem arises in accounting for electrostatic interactions in protein folding
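The estimates on this slide are simple arithmetic, sketched below. The cell count, operations per cell, time step and forecast length are the slide's own figures; the function names are hypothetical, and the runtime figure assumes a machine that sustains its full peak rate, which real codes rarely achieve.

```python
def forecast_operations(cells=10**9, ops_per_cell=100,
                        step_minutes=10, days=10):
    """Total floating point operations for the whole forecast."""
    steps = days * 24 * 60 // step_minutes  # 1440 steps for the defaults
    return cells * ops_per_cell * steps


def runtime_seconds(operations, flops):
    """Idealised runtime, ASSUMING the machine sustains `flops` throughout."""
    return operations / flops


def pair_count(n_bodies):
    """Number of distinct interacting pairs among n bodies: n(n-1)/2."""
    return n_bodies * (n_bodies - 1) // 2


total = forecast_operations()
print(f"total operations: {total:.2e}")  # 1.44e+14
print(f"on 10 GFLOPS:  {runtime_seconds(total, 10**10):.0f} s")
print(f"on 1 PFLOPS:   {runtime_seconds(total, 10**15):.3f} s")

# Pairwise interactions grow quadratically with the number of bodies.
for n in (10, 1000, 10**11):
    print(n, "bodies ->", pair_count(n), "pairs")
```

Under these assumptions the forecast takes about four hours on a 10 GFLOPS machine and a fraction of a second at petaFLOPS rates, while a single step over every pair of stars in a galaxy involves roughly 5 x 10^21 interactions.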
  9. APPLICATIONS

    ‣ Automobile engineers can test their prototypes in computer simulations rather than actually building and crashing them
    ‣ This can reduce cost in automobile design and is of paramount importance in the design of aeroplanes, etc.
    ‣ An e-commerce company like Flipkart may get thousands of hits per second
    ‣ HPC can be used there to ensure that the web server responds to customers without delay
    ‣ HPC can also be used for scientific visualisation, which leads to better insights.