Slide 1

Slide 1 text

MODERN DOCUMENTATION ACROSS LANGUAGES
ROHIT GOSWAMI
Created: 2021-07-16 Fri 19:08

Slide 2

Slide 2 text

BRIEF INTRODUCTION

Slide 3

Slide 3 text

HELLO!
Who? Rohit Goswami MInstP, Doctoral Researcher, University of Iceland, Faculty of Physical Sciences
Find me here: https://rgoswami.me

Slide 4

Slide 4 text

LOGISTICS
- All contents are hosted on GitHub
- Slides are in presentations/SERI2021
- Questions are welcome after the talk

Slide 5

Slide 5 text

THE RATIONALE

Slide 6

Slide 6 text

READING CODE I
    main:
            push    rbp
            mov     rbp, rsp
            mov     DWORD PTR [rbp-4], 3
            mov     eax, 0
            pop     rbp
            ret
    __static_initialization_and_destruction_0(int, int):
            push    rbp
            mov     rbp, rsp
            sub     rsp, 16
            mov     DWORD PTR [rbp-4], edi
            mov     DWORD PTR [rbp-8], esi
            cmp     DWORD PTR [rbp-4], 1
            jne     .L5
            cmp     DWORD PTR [rbp-8], 65535
            jne     .L5
            mov     edi, OFFSET FLAT:_ZStL8__ioinit
            call    std::ios_base::Init::Init() [complete object constructor]
            mov     edx, OFFSET FLAT:__dso_handle
            mov     esi, OFFSET FLAT:_ZStL8__ioini
            mov     edi, OFFSET FLAT:_ZNSt8ios_bas
            call    __cxa_atexit
    .L5:
            nop
            leave
            ret
    _GLOBAL__sub_I_main:
            push    rbp
            mov     rbp, rsp
            mov     esi, 65535
            mov     edi, 1
            call    __static_initialization_and_destruction_0(int, int)
            pop     rbp
            ret
But who writes assembly anyway?

Slide 7

Slide 7 text

READING CODE II
    int main ()
    {
      int D.48918;
      {
        int a;
        a = 3;
        D.48918 = 0;
        return D.48918;
      }
      D.48918 = 0;
      return D.48918;
    }

    void _GLOBAL__sub_I_main.cpp ()
    {
      __static_initialization_and_destruction_0 (1, 65535);
    }

    void __static_initialization_and_destruction_0 (int __initialize_p, int __priority)
    {
      if (__initialize_p == 1) goto ;
      :
      if (__priority == 65535) goto ;
      :
      std::ios_base::Init::Init (&__ioinit
      __cxxabiv1::__cxa_atexit (__dt_comp &__ioinit, &__dso_han
      goto ;
      :
      :
      goto ;
      :
      :
    }
GIMPLE is an internal gcc representation…

Slide 8

Slide 8 text

READING CODE III
    #include <iostream>
    int main() {
        int a=3;
        return 0;
    }
Better for most people, but still a bit lacking for novices.

Assigning an integer
    g++ main.cpp -o file
Produces a binary named file, which can be run as:
    ./file
Output: There is no output, but an assignment of an integer with value 3 takes place.

What about different languages?

Slide 9

Slide 9 text

READING CODE IV
Maybe gcc is just an ugly compiler…
    program main
        integer :: x = 3 + 6
        print *, x
    end program
lfortran has a nicer intermediate structure
    conda create -n lf
    conda activate lf
    conda install lfortran \
        -c conda-forge
    lfortran --show-asr consint.f90

Slide 10

Slide 10 text

PROJECT LAYOUTS

Slide 11

Slide 11 text

LANGUAGE AGNOSTIC BEGINNINGS
- Readme.{md,org}: Motivation, rationale, license, installation instructions
- LICENSE: Plain text, and preferably an open license; license-generator is pretty handy for this
- .gitignore: Lists files which do not need to be committed, typically generated files; gibo can be used to generate these

    $ git init # Inside project
    $ gibo macOS Windows Xcode Emacs \
        Vim Python C++ \
        CMake TeX > .gitignore
    $ touch readme.md
    $ license-generator MIT \
        --author "Person"
    $ tree -L 2
    .
    ├── LICENSE
    ├── docs
    │   └── pres
    └── readme.org

    2 directories, 2 files
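The three language-agnostic files above can also be checked mechanically, for example before a release. The following is a minimal sketch, not a feature of gibo or license-generator: the helper `missing_project_files` and the list of accepted readme spellings are assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical requirement list mirroring the three files on the slide.
# Each entry holds the acceptable names for one required file.
REQUIRED = [
    ("readme.md", "readme.org", "Readme.md", "Readme.org", "README.md", "README.org"),
    ("LICENSE",),
    (".gitignore",),
]


def missing_project_files(root):
    """Return the first acceptable name of each required file absent from root."""
    root = Path(root)
    missing = []
    for names in REQUIRED:
        # A requirement is met if any accepted spelling exists as a file.
        if not any((root / name).is_file() for name in names):
            missing.append(names[0])
    return missing
```

Calling `missing_project_files(".")` inside a project returns an empty list when all three files are present.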

Slide 12

Slide 12 text

LARGE PROJECT STRUCTURE
- Has a core
- With bindings for other languages
- Needs API documentation
- Also user documentation

    .
    ├── api-docs
    ├── dependencies
    ├── python-symengine-feedstock
    ├── symengine
    ├── symengine-bench
    ├── SymEngineBuilder
    ├── symengine.f90
    ├── symengine-feedstock
    ├── symengine.github.io
    ├── symengine.hs
    ├── SymEngine.jl
    ├── symengine-paper
    ├── symengine.py
    ├── symengine.R
    ├── symengine.rb
    ├── symengine.spkg
    └── symengine-wheels

Slide 13

Slide 13 text

DOCUMENTATION DISSEMINATION

Slide 14

Slide 14 text

MAN PAGES
- Great for terminal programs
- Not great for APIs

Slide 15

Slide 15 text

USER MANUALS
- Can be hard to manipulate
- C++ standard is ≈1800 pages

Slide 16

Slide 16 text

WEBSITES
- How many?
- Must provide metadata about the code
- Community building aspects

Slide 17

Slide 17 text


Slide 18

Slide 18 text

DOCUMENTATION INSERTION POINTS

Slide 19

Slide 19 text

USER PERSPECTIVE
- Tutorials
- Code-along

Slide 20

Slide 20 text

DEVELOPER PERSPECTIVE
- API documentation
- Code contribution guidelines

Slide 21

Slide 21 text

LANGUAGES
    Language            Package
    R                   pkgdown
    Python              Sphinx
    C++                 Doxygen + doxyYoda
    Julia               Documenter.jl
    Notebooks / MyST    Sphinx + myst + jupytext

Slide 22

Slide 22 text

R

Slide 23

Slide 23 text


Slide 24

Slide 24 text

JULIA

Slide 25

Slide 25 text

PYTHON

Slide 26

Slide 26 text

GENERIC
- Sphinx is reasonably good for code documentation
- Static sites can be leveraged for user documentation
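As an illustration of what Sphinx consumes for code documentation, here is a minimal sketch of a Python function documented with reStructuredText field lists, a form the `sphinx.ext.autodoc` extension can render. The function `scale` and its parameters are hypothetical, chosen only for illustration.

```python
def scale(values, factor=2.0):
    """Multiply every element of a sequence by a constant.

    :param values: numbers to scale
    :type values: list[float]
    :param factor: multiplier applied to each element
    :type factor: float
    :returns: a new list with each element multiplied by ``factor``
    :rtype: list[float]
    """
    # The hypothetical body is kept trivial; the docstring is the point.
    return [v * factor for v in values]
```

Assuming the function lived in a module named `mymodule` (a hypothetical name), a directive such as `.. autofunction:: mymodule.scale` in a reStructuredText page would pull this docstring into the rendered API documentation.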

Slide 27

Slide 27 text


Slide 28

Slide 28 text

C++

Slide 29

Slide 29 text

PROJECT FILES
    /**
     * @file add.h
     * @author SymEngine Developers
     * @date 2021-02-25
     * @brief Classes and functions relating to the binary operation of ad
     *
     * Created on: 2012-07-11
     *
     * This file contains the basic binary operations defined for symbolic
     * In particular the @ref Add class for representing addition is
     * @b declared here, along with the `add` and `substract` functions.
     */
    #ifndef SYMENGINE_ADD_H
    #define SYMENGINE_ADD_H

Slide 30

Slide 30 text

HEADER FILES
    /**
     * @brief Create an appropriate instance from dictionary quickly.
     * @pre The dictionary must be in canonical form.
     * @see `Mul` for how `Pow` gets returned.
     * @see `Basic` for the guarantees and expectations.
     * @param coef the numeric coefficient.
     * @param d the dictionary of the expression without the coefficien
     * @return `coef` if the dictionary is empty (size 0).
     * @return `Mul` if the dictionary has one element which is a `Mul`
     * @return `Integer` if the dictionary has one element which is a
     * `Integer`.
     * @return `Symbol` if the dictionary has one element which is a `S
     * @return `Pow` if the dictionary has one element which is a `Pow`
     * @return `Add` if the size of the dictionary is greater than 1.
     */
    static RCP<const Basic> from_dict(const RCP<const Number> &coef,
                                      umap_basic_num &&d);

Slide 31

Slide 31 text

SOURCE FILES
    /**
     * @details This function ensures that each term in *dict* is in canoni
     * form. The implementation in the form of an exclusion list (defaults
     * true).
     *
     * @note **Canonical form** requires the existence of both `coef` and
     * `dict`, so `null` coefficients and purely numerical (empty dictiona
     * are also not considered to be in canonical form. Also, the ordering
     * important, it must be `(coeff, dict)` and **not** `(dict, coeff)`.
     *
     * Some **non-canonical** forms are:
     * - @f$0 + x@f$.
     * - @f$0 + 2x@f$.
     * - @f$ 2 \times 3 @f$.
     * - @f$ x \times 0 @f$.
     * - @f$ 1 \times x @f$ has the wrong order.
     * - @f$ 3x \times 2 @f$ is actually just @f$6x@f$.
     */
    bool Add::is_canonical(const RCP<const Number> &coef,
                           const umap_basic_num &dict) const

Slide 32

Slide 32 text

BASE DOXYGEN
- Is ugly
- Not mobile friendly

Slide 33

Slide 33 text

EXHALE
- Cannot include source code

Slide 34

Slide 34 text

DOXYREST
- Includes more structure than exhale
- Can be extended to other source languages
- Has a rather complicated setup

Slide 35

Slide 35 text

DOXYYODA

Slide 36

Slide 36 text

TRANSLATIONS
- At the user level, e.g. with docusaurus

    cat irhpc.github.io/i18n/is/docusaurus-plugin-content-docs/current/intro

Slide 37

Slide 37 text


Slide 38

Slide 38 text

REVIEWING DOCUMENTATION

Slide 39

Slide 39 text

DOCUMENTED FALLACIES
    """
    This function adds two numbers
    """
    def sum(a,b):
        return a*b

Slide 40

Slide 40 text

INVALIDATE OFTEN
- Documentation cannot typically be tested (Julia aside)
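Where examples are embedded in docstrings, Python's standard-library `doctest` module offers one way to invalidate stale documentation. The sketch below reuses the mismatched function from the previous slide under the hypothetical name `add_numbers`, with an example added to its docstring; the helper `count_doc_failures` is likewise an assumption for illustration.

```python
import doctest


def add_numbers(a, b):
    """Add two numbers.

    >>> add_numbers(2, 3)
    5
    """
    return a * b  # Deliberate bug: multiplies instead of adding.


def count_doc_failures(func):
    """Run the docstring examples of func and return how many of them fail."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failures = 0
    # The function is passed in globs so its own examples can call it by name.
    for test in finder.find(func, name=func.__name__,
                            globs={func.__name__: func}):
        # Discard the textual failure report; only the counts matter here.
        result = runner.run(test, out=lambda text: None)
        failures += result.failed
    return failures
```

Here `count_doc_failures(add_numbers)` reports one failing example, because the documented result no longer matches what the code returns.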

Slide 41

Slide 41 text


Slide 42

Slide 42 text

CONCLUSIONS

Slide 43

Slide 43 text

OMITTED TOPICS
- Web development and design: including frameworks and UX
- Continuous integration: how to ensure documentation is coupled to working code
- Benchmarking: demonstrating code superiority
- Code review practices: Scrum and teamwork
- Multi-language API: where code from different languages is called together

Slide 44

Slide 44 text

FURTHER RESOURCES
- SymEngine and the Season of Docs: Describes the present SOTA for documentation practices in the context of a large multi-language project
- d-SEAMS [goswamiDSEAMSDeferredStructural2020]: A large scientific code project designed with a user- wiki

Slide 45

Slide 45 text

KEY TAKEAWAYS
- Document at every level
- Use the best tools for the job
- Internationalize only where necessary (user level)
- Ensure documentation expires
- Keep provenance
- Ensure a documentation style guide is present
- Lint automatically
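One way to act on "Lint automatically" is a docstring-coverage check that runs in continuous integration. The following is a minimal sketch using only the standard-library `inspect` module; the helper `undocumented` and the throwaway `demo` module are hypothetical names used for illustration.

```python
import inspect
import types


def undocumented(module):
    """Return sorted names of public functions in module lacking a docstring."""
    names = []
    for name, obj in vars(module).items():
        # Skip private names and anything that is not a plain function.
        if name.startswith("_") or not inspect.isfunction(obj):
            continue
        if not inspect.getdoc(obj):
            names.append(name)
    return sorted(names)


# Build a throwaway module so the sketch is self-contained.
demo = types.ModuleType("demo")
exec(
    "def documented():\n"
    "    '''Say what this does.'''\n"
    "def bare():\n"
    "    pass\n",
    demo.__dict__,
)
print(undocumented(demo))
```

A CI job could fail the build whenever the returned list is non-empty, which keeps the "document at every level" rule enforceable rather than aspirational.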

Slide 46

Slide 46 text

THE END

Slide 47

Slide 47 text

BIBLIOGRAPHY
[goswamiDSEAMSDeferredStructural2020] Goswami, Goswami & Singh, D-SEAMS: Deferred Structural Elucidation Analysis for Molecular Simulations, Journal of Chemical Information and Modeling, 60(4), 2169-2177.

Slide 48

Slide 48 text

THANKS!