Slide 1

Slide 1 text

1 Programming Design Molecular Dynamics Method: Theory and Implementation Department of Applied Physics and Physico-Informatics, Keio University Hiroshi Watanabe

Slide 2

Slide 2 text

2 What is the most important thing for programming? プログラムを組む時に最も大事なことは何か? 妥協 compromise

Slide 3

Slide 3 text

3 Facebookの新規株式公開(IPO)申請書類に添付された手紙より http://www.sec.gov/Archives/edgar/data/1326801/000119312512034517/d287954ds1.htm Done is better than perfect. 完成を目指すより、まず終わらせよ Code wins arguments. コードは議論に勝る Mark Elliot Zuckerberg (Founder and CEO of Facebook)

Slide 4

Slide 4 text

4 Incomplete programs are worthless. → Some compromise is required. There are no silver bullet for compromise. Where should we compromise and where should not? → I will introduce my experience from development of my MD code.

Slide 5

Slide 5 text

5 Molecular Dynamics code for Avogadro Challenge Project (KAUST GRP Investors Award) Algorithm Classical Moleculer Dynamics method with short-range interaction Support only Lennard-Jones potential with cutoff Target Phase transition between gas and liquid Universality of gas-liquid phase transition (Equilibrium state) Waiting time of a bubble nucleation (Quasi static State) Multi-bubble nuclei (Non-equilibrium phenomena)

Slide 6

Slide 6 text

6 Background There are many nice MD applications... (LAMMPS, NAMD, SPaSM, MODYLAS, etc) We can write papers with using above codes. If I were a physisist, I used the above codes. But I am a programmer! I want to write a paper with using MY CODE. I want to challenge the state of the art by MY CODE.

Slide 7

Slide 7 text

7 Overview Language: C++ Parallelization: flat-MPI for Ver 1.0 MPI/OpenMP Hybrid for ver 2.0 Size: ~5000 lines URL: https://github.com/kaityo256/mdacp Design Policy • Pursuing both ease of development and efficiency → If the two interfere, take efficiency • Do not consider use by others → Publishing source codes for developers, rather than users • Less dependency on external libraries

Slide 8

Slide 8 text

8 Input File Project Manager MDManager (Process) Variables MDUnit (Thread) Variables MDUnit (Thread) MDManager (Process) Variables MDUnit (Thread) Variables MDUnit (Thread) A Project Execute an appropriate project among pre-registered main function Actual main function Communication

Slide 9

Slide 9 text

9 Module Coupling

Slide 10

Slide 10 text

10 Class • A module which combimes data and methods to some function • Subclass can be created by inheritance Inheritance • Subclass can be created by inheritance • Usually used with virtual functions Virtual Function • Override the behavior of superclass • The method to be called is determined at runtime

Slide 11

Slide 11 text

11 Form (Container) Component Button Component TextForm Component CheckBox Button::Repaint TextForm::Repaint CheckBox::Repaint Component::Repaint The Form manages components and it does not know the details. The form calls Component::Repaint when repaint is necessary The derived methods, such as Button::Repaint, are called. No need to recompile Form class even if the types of components increases

Slide 12

Slide 12 text

12 Coupling • Strength of relationship between modules • Strongly dependent on each other (Tight Coupling) • Independent of each other (Loose Coupling) Tightly coupled modules: Local modifications spread Loosely coupled modules: Modifications affect locally Loose coupling is better ... but I often adopted designs which do not minimize the coupling.

Slide 13

Slide 13 text

13 $ ./a.out 32 1.5 32 # SystemSize 1.5 # Temperature SystemSize=32 Tempreature=1.5 32 1.5 Coupling Tight Loose Command line arguments Ordered numbers in a file Key=Value format in a file XML or other formats Adopted

Slide 14

Slide 14 text

14 Coupling vs. Readability Some other projects adopt XML, but... ※ Other formats, such as TOML, look good... XML seems to be overspec for me. • XML format requires external library. • XML is hard to be modified by hand. Implementation • Parameter class (parameter.cc/h) • String hash (std::map) • "Key=Value" format

Slide 15

Slide 15 text

15 Flow 1. Give a name of a input file to executable 2. Receive the filename in main, and pass it to manager class 3. Manager class creates an instance of Parameter class from it 4. Read values via the instance of Parameter class Retrieving Values Get a value via string hash double Parameter::GetDouble(string key) double Parameter::GetDoubleDef(string key, double default) Special Key "Mode" Input file must include the special key "Mode" → It will be used for managements of projects

Slide 16

Slide 16 text

16 A product code is used for many purposes. (Main difference from benchmark codes) Coupling Tight Loose Rewrite main function and recompile Switch statement Singleton Pattern Create a library Support some script languages Adopted

Slide 17

Slide 17 text

17 Tight Coupling Rewrite main function int main(void){ //ProjectA(); //ProjectB(); ProjectC(); } switch(mode){ case PROJECT_A: ProjectA(); break; case PROJECT_B: ProjectB(); break; } Switch Statement

Slide 18

Slide 18 text

18 Loose Coupling Domain Specific Language (ex: LAMMPS) # 2d polygon nparticle bodies units lj dimension 2 atom_style body nparticle 2 6 read_data data.body velocity all create 1.44 87287 loop geom pair_style body 5.0 pair_coeff * * 1.0 1.0 neighbor 0.3 bin fix 1 all nve/body fix 2 all enforce2d #compute 1 all body/local type 1 2 3 #dump 1 all local 100 dump.body index c_1[1] c_1[2] c_1[3] c_1[4] thermo 500 run 10000 github: lammps/examples/body/in.body User friendly Very hard for a programmer

Slide 19

Slide 19 text

19 Singleton Pattern 1. User implements subclass of Project class 2. Register itself to Project Manager at its constructor ProjectA “ProjectA” ProjectB “ProjectB” ProjectC “ProjectC” ProjectManager hashkey Pointer to itself Regester 3. Find the project by the key, and call Project::Run method Mode=ProjectC Run ProjectA “ProjectA” ProjectB “ProjectB” ProjectManager ProjectC “ProjectC”

Slide 20

Slide 20 text

20 Design Policy Write a main function for each project Recompiling is unnecessary when a project is added. Implementation Implement ProjectManager class as singleton All projects are subclasses of the Project class. Project::Run is the actual main function (user implements it). Pros/Cons Easy for implementation, relatively loose coupling Inconvenient for other users ※ Maybe I am an only user...

Slide 21

Slide 21 text

21 At least two module require "argc" and "argv". • MPI requires them for initialization • Parameter class requires it to obtain an input filename. How should I pass them? Tight Loose Main func passes the arguments to manager class. Main func creates two instances of classes which require the arguments and pass two instances to manager class. Design Changed plan B plan A

Slide 22

Slide 22 text

22 Plan A Create instances of MPI management class (MPIInfo) Parameter management class (Parameter) Pass "argc" and "argv" to their constructors Pass MPIInfo and Parameter to Calculation Management class argc,argv main MPIInfo Parameter argc,argv argc,argv MDManager I adopted Plan A for Ver 1.0.

Slide 23

Slide 23 text

23 Plan B • Main func passes the arguments to manager class. • Manager class creates MPIInfo and Parameter by using them. argc,argv main MPI_Init Parameter argc,argv argc,argv MDManager int main(int argc, char *argv[]){ MDManager mdm(argc, argv); } I think Plan A is more beautiful, but I adopted Plan B for Ver 2.0.

Slide 24

Slide 24 text

24 Why did you put everything in the calculation management class? Don't you think that the design is cleaner if you divide it? Q A Because I thought there was not much profit to divide... • Instances of MPIInfo and Parameter share their lifetime with MDManager. • Should MDManager know its rank? or MPIInfo should? • Simple main function is preferable.

Slide 25

Slide 25 text

25 Tight Loose • The faster the force calculation, the better. • Different architectures require different optimizations. • I want to deal with various types of interactions. Implement Force calculation directly Implement virtual function for each architecture Implement virtual function for each interaction Adopted

Slide 26

Slide 26 text

26 Virtual functions for each interaction • Appropriate routine is automatically chosen. • The most beautiful design. • Desperately slow Virtual functions for each architecture ForceCalc_SPARC64 is inherited from ForceCalc, etc... →Highly inefficient codes Since some compiler cannot apply IPO for calling virtual functions. ※ IPO (Interprocedural optimization)

Slide 27

Slide 27 text

27 void ForceCalculator::CalculateForce(Variables *vars, ...) { #ifdef FX10 // CalculateForceReactlessSIMD(vars,mesh,sinfo); CalculateForceReactlessSIMD_errsafe(vars, mesh, sinfo); #elif AVX2 CalculateForceAVX2(vars, mesh, sinfo); #else CalculateForceNext(vars, mesh, sinfo); //CalculateForceBruteforce(vars,sinfo); //CalculateForceSorted(vars,mesh,sinfo); //CalculateForcePair(vars,mesh,sinfo); //CalculateForceUnroll(vars,mesh,sinfo); #endif } After all, I adopted #ifdef switching...

Slide 28

Slide 28 text

28 Design for Parallelization

Slide 29

Slide 29 text

29 MDUnit: Responsible for calculation of divided region Simulation Box MDUnit MDUnit MDUnit MDUnit How should I implement communication?

Slide 30

Slide 30 text

30 Plan A MDUnit MDUnit MDUnit MDUnit MDUnits communicate each other. MDUnit knows who lives in the neighborhood. Updates divided reagion (Local Information) Communicate to each other (Grobal Information) I adopted Plan A for Ver 1.0.

Slide 31

Slide 31 text

31 Plan B Manager class to manage MDUnits MDUnit MDUnit MDUnit MDUnit MDManager MDUnit do not know who lives in the neighborhood. All communication are performed via MDManager. I adopted Plan B for Ver 2.0.

Slide 32

Slide 32 text

32 MDUnit MDUnit MDUnit MDUnit MDUnit MDUnit MDUnit MDUnit MDManager Then, which was better? Plan A Plan B I don't know....

Slide 33

Slide 33 text

33 0 1 4 5 2 3 6 7 8 9 11 12 10 11 13 14 4 proccesses / node, 4 nodes calculation Who should be responsible for this map?

Slide 34

Slide 34 text

34 Plan A MDUnit MPIInfo Calculation Domain Prepare MPIInfo class (Responsible for the "map") MDUnit has MPIInfo. Loose coupling Plan B MDUnit MDUnit MDUnit MDUnit MDManager MDManager knows map. Tight coupling I adopted Plan A for Ver 1.0, but changed to Plan B for Ver 2.0.

Slide 35

Slide 35 text

35 Why did you put everything to management class? Don't you think that the design is cleaner if you divide it? Q A Because I thought there was not much profit to divide... MDManager calls MPI_Init in its constructor and MPI_Finalize in its destructor. So MPI-related functions are not separated ...

Slide 36

Slide 36 text

36 There is no perfect answer to design. • It is difficult to achieve performance and maintainability simultaneously. • Rewrite codes if you feel uncomfortable. Parallel programs are cumbersome • I tried to hide the parallel structure. • But understanding parallel structure is mandatory for something complicated...

Slide 37

Slide 37 text

37 Don't think. Write codes.