Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Molecular Dynamics Method: Theory and Implementation / MD2019 Programming Design

A10e41b0a61d59f2258d7f6172c33479?s=47 kaityo256
PRO
December 20, 2019

Molecular Dynamics Method: Theory and Implementation / MD2019 Programming Design

Molecular Dynamics Method: Theory and Implementation
「分子動力学法の理論と実装」
金沢大学集中講義 2019年 12月18日~20日

A10e41b0a61d59f2258d7f6172c33479?s=128

kaityo256
PRO

December 20, 2019
Tweet

Transcript

  1. 1 Programming Design Molecular Dynamics Method: Theory and Implementation Department

    of Applied Physics and Physico-Informatics, Keio University Hiroshi Watanabe
  2. 2 What is the most important thing for programming? プログラムを組む時に最も大事なことは何か?

    妥協 compromise
  3. 3 Facebookの新規株式公開(IPO)申請書類に添付された手紙より http://www.sec.gov/Archives/edgar/data/1326801/000119312512034517/d287954ds1.htm Done is better than perfect. 完成を目指すより、まず終わらせよ Code

    wins arguments. コードは議論に勝る Mark Elliot Zuckerberg (Founder and CEO of Facebook)
  4. 4 Incomplete programs are worthless. → Some compromise is required.

    There are no silver bullet for compromise. Where should we compromise and where should not? → I will introduce my experience from development of my MD code.
  5. 5 Molecular Dynamics code for Avogadro Challenge Project (KAUST GRP

    Investors Award) Algorithm Classical Moleculer Dynamics method with short-range interaction Support only Lennard-Jones potential with cutoff Target Phase transition between gas and liquid Universality of gas-liquid phase transition (Equilibrium state) Waiting time of a bubble nucleation (Quasi static State) Multi-bubble nuclei (Non-equilibrium phenomena)
  6. 6 Background There are many nice MD applications... (LAMMPS, NAMD,

    SPaSM, MODYLAS, etc) We can write papers with using above codes. If I were a physisist, I used the above codes. But I am a programmer! I want to write a paper with using MY CODE. I want to challenge the state of the art by MY CODE.
  7. 7 Overview Language: C++ Parallelization: flat-MPI for Ver 1.0 MPI/OpenMP

    Hybrid for ver 2.0 Size: ~5000 lines URL: https://github.com/kaityo256/mdacp Design Policy • Pursuing both ease of development and efficiency → If the two interfere, take efficiency • Do not consider use by others → Publishing source codes for developers, rather than users • Less dependency on external libraries
  8. 8 Input File Project Manager MDManager (Process) Variables MDUnit (Thread)

    Variables MDUnit (Thread) MDManager (Process) Variables MDUnit (Thread) Variables MDUnit (Thread) A Project Execute an appropriate project among pre-registered main function Actual main function Communication
  9. 9 Module Coupling

  10. 10 Class • A module which combimes data and methods

    to some function • Subclass can be created by inheritance Inheritance • Subclass can be created by inheritance • Usually used with virtual functions Virtual Function • Override the behavior of superclass • The method to be called is determined at runtime
  11. 11 Form (Container) Component Button Component TextForm Component CheckBox Button::Repaint

    TextForm::Repaint CheckBox::Repaint Component::Repaint The Form manages components and it does not know the details. The form calls Component::Repaint when repaint is necessary The derived methods, such as Button::Repaint, are called. No need to recompile Form class even if the types of components increases
  12. 12 Coupling • Strength of relationship between modules • Strongly

    dependent on each other (Tight Coupling) • Independent of each other (Loose Coupling) Tightly coupled modules: Local modifications spread Loosely coupled modules: Modifications affect locally Loose coupling is better ... but I often adopted designs which do not minimize the coupling.
  13. 13 $ ./a.out 32 1.5 32 # SystemSize 1.5 #

    Temperature SystemSize=32 Tempreature=1.5 <PARAM name=SystemSize>32</PARAM> <PARAM name=Temperature>1.5</PARAM> Coupling Tight Loose Command line arguments Ordered numbers in a file Key=Value format in a file XML or other formats Adopted
  14. 14 Coupling vs. Readability Some other projects adopt XML, but...

    ※ Other formats, such as TOML, look good... XML seems to be overspec for me. • XML format requires external library. • XML is hard to be modified by hand. Implementation • Parameter class (parameter.cc/h) • String hash (std::map) • "Key=Value" format
  15. 15 Flow 1. Give a name of a input file

    to executable 2. Receive the filename in main, and pass it to manager class 3. Manager class creates an instance of Parameter class from it 4. Read values via the instance of Parameter class Retrieving Values Get a value via string hash double Parameter::GetDouble(string key) double Parameter::GetDoubleDef(string key, double default) Special Key "Mode" Input file must include the special key "Mode" → It will be used for managements of projects
  16. 16 A product code is used for many purposes. (Main

    difference from benchmark codes) Coupling Tight Loose Rewrite main function and recompile Switch statement Singleton Pattern Create a library Support some script languages Adopted
  17. 17 Tight Coupling Rewrite main function int main(void){ //ProjectA(); //ProjectB();

    ProjectC(); } switch(mode){ case PROJECT_A: ProjectA(); break; case PROJECT_B: ProjectB(); break; } Switch Statement
  18. 18 Loose Coupling Domain Specific Language (ex: LAMMPS) # 2d

    polygon nparticle bodies units lj dimension 2 atom_style body nparticle 2 6 read_data data.body velocity all create 1.44 87287 loop geom pair_style body 5.0 pair_coeff * * 1.0 1.0 neighbor 0.3 bin fix 1 all nve/body fix 2 all enforce2d #compute 1 all body/local type 1 2 3 #dump 1 all local 100 dump.body index c_1[1] c_1[2] c_1[3] c_1[4] thermo 500 run 10000 github: lammps/examples/body/in.body User friendly Very hard for a programmer
  19. 19 Singleton Pattern 1. User implements subclass of Project class

    2. Register itself to Project Manager at its constructor ProjectA “ProjectA” ProjectB “ProjectB” ProjectC “ProjectC” ProjectManager hashkey Pointer to itself Regester 3. Find the project by the key, and call Project::Run method Mode=ProjectC Run ProjectA “ProjectA” ProjectB “ProjectB” ProjectManager ProjectC “ProjectC”
  20. 20 Design Policy Write a main function for each project

    Recompiling is unnecessary when a project is added. Implementation Implement ProjectManager class as singleton All projects are subclasses of the Project class. Project::Run is the actual main function (user implements it). Pros/Cons Easy for implementation, relatively loose coupling Inconvenient for other users ※ Maybe I am an only user...
  21. 21 At least two module require "argc" and "argv". •

    MPI requires them for initialization • Parameter class requires it to obtain an input filename. How should I pass them? Tight Loose Main func passes the arguments to manager class. Main func creates two instances of classes which require the arguments and pass two instances to manager class. Design Changed plan B plan A
  22. 22 Plan A Create instances of MPI management class (MPIInfo)

    Parameter management class (Parameter) Pass "argc" and "argv" to their constructors Pass MPIInfo and Parameter to Calculation Management class argc,argv main MPIInfo Parameter argc,argv argc,argv MDManager I adopted Plan A for Ver 1.0.
  23. 23 Plan B • Main func passes the arguments to

    manager class. • Manager class creates MPIInfo and Parameter by using them. argc,argv main MPI_Init Parameter argc,argv argc,argv MDManager int main(int argc, char *argv[]){ MDManager mdm(argc, argv); } I think Plan A is more beautiful, but I adopted Plan B for Ver 2.0.
  24. 24 Why did you put everything in the calculation management

    class? Don't you think that the design is cleaner if you divide it? Q A Because I thought there was not much profit to divide... • Instances of MPIInfo and Parameter share their lifetime with MDManager. • Should MDManager know its rank? or MPIInfo should? • Simple main function is preferable.
  25. 25 Tight Loose • The faster the force calculation, the

    better. • Different architectures require different optimizations. • I want to deal with various types of interactions. Implement Force calculation directly Implement virtual function for each architecture Implement virtual function for each interaction Adopted
  26. 26 Virtual functions for each interaction • Appropriate routine is

    automatically chosen. • The most beautiful design. • Desperately slow Virtual functions for each architecture ForceCalc_SPARC64 is inherited from ForceCalc, etc... →Highly inefficient codes Since some compiler cannot apply IPO for calling virtual functions. ※ IPO (Interprocedural optimization)
  27. 27 void ForceCalculator::CalculateForce(Variables *vars, ...) { #ifdef FX10 // CalculateForceReactlessSIMD(vars,mesh,sinfo);

    CalculateForceReactlessSIMD_errsafe(vars, mesh, sinfo); #elif AVX2 CalculateForceAVX2(vars, mesh, sinfo); #else CalculateForceNext(vars, mesh, sinfo); //CalculateForceBruteforce(vars,sinfo); //CalculateForceSorted(vars,mesh,sinfo); //CalculateForcePair(vars,mesh,sinfo); //CalculateForceUnroll(vars,mesh,sinfo); #endif } After all, I adopted #ifdef switching...
  28. 28 Design for Parallelization

  29. 29 MDUnit: Responsible for calculation of divided region Simulation Box

    MDUnit MDUnit MDUnit MDUnit How should I implement communication?
  30. 30 Plan A MDUnit MDUnit MDUnit MDUnit MDUnits communicate each

    other. MDUnit knows who lives in the neighborhood. Updates divided reagion (Local Information) Communicate to each other (Grobal Information) I adopted Plan A for Ver 1.0.
  31. 31 Plan B Manager class to manage MDUnits MDUnit MDUnit

    MDUnit MDUnit MDManager MDUnit do not know who lives in the neighborhood. All communication are performed via MDManager. I adopted Plan B for Ver 2.0.
  32. 32 MDUnit MDUnit MDUnit MDUnit MDUnit MDUnit MDUnit MDUnit MDManager

    Then, which was better? Plan A Plan B I don't know....
  33. 33 0 1 4 5 2 3 6 7 8

    9 11 12 10 11 13 14 4 proccesses / node, 4 nodes calculation Who should be responsible for this map?
  34. 34 Plan A MDUnit MPIInfo Calculation Domain Prepare MPIInfo class

    (Responsible for the "map") MDUnit has MPIInfo. Loose coupling Plan B MDUnit MDUnit MDUnit MDUnit MDManager MDManager knows map. Tight coupling I adopted Plan A for Ver 1.0, but changed to Plan B for Ver 2.0.
  35. 35 Why did you put everything to management class? Don't

    you think that the design is cleaner if you divide it? Q A Because I thought there was not much profit to divide... MDManager calls MPI_Init in its constructor and MPI_Finalize in its destructor. So MPI-related functions are not separated ...
  36. 36 There is no perfect answer to design. • It

    is difficult to achieve performance and maintainability simultaneously. • Rewrite codes if you feel uncomfortable. Parallel programs are cumbersome • I tried to hide the parallel structure. • But understanding parallel structure is mandatory for something complicated...
  37. 37 Don't think. Write codes.