of Applied Mathematics & Center for Interdisciplinary Scientific Computation Illinois Institute of Technology hickernell@iit.edu mypages.iit.edu/~hickernell with input from Mark Klinchin, the GAIL an QMCPy teams, and friends partially supported by SigOpt (Intel) and an Illinois Tech alum Join us Tuesday, July 19, at 12:45 PM for lunch at Bella Casa to discuss the future of MCQMC software Thanks to the organizers as we meet again in person Slides at speakerdeck.com/fjhickernell/challenges-of-developing-mcqmc-software MCQMC 2022, Linz, July 18, 2022
Ecosystem Should Be Correct, e.g., not omit the zeroth point in a low discrepancy sequence (Owen 2022; SciPy Developers 2020) Complete—contain the components or easily access components in other libraries to solve real, complex problems Accessible—downloadable; tutorials, demos, discussion forums, etc. for (new) users; written in a language that potential users speak; provide a consistent user interface Efficient—in terms of compute time and memory Current—include the latest and best algorithms Sustainable—have a sufficient user base and developer community for updates and maintenance Scalable—take advantage of advanced computer architectures for speed and to solve large problems 2/20
Available MCQMC Software I L’Ecuyer (2017) provides a history of random number generation Notorious RANDU from the 1960s (Wikipedia n.d.), which fails the spectral test VEGAS (Lepage 1978; Lepage 2021) importance sampling Monte Carlo algorithm favored by physicists ACM low discrepancy point generators (Bratley and Fox 1988; Bratley, Fox, and Niederreiter 1992; Hong and H. 2003) FinDer (Paskov and Traub 1995; Papageorgiou n.d.) and BRODA (Kucherenko n.d.) targeting quantitative finance Korobov cubature and scrambled Sobol’ generators in NAG (The Numerical Algorithms Group 2021) for decades Scrambled Sobol’ and Halton generators in MATLAB (The MathWorks, Inc. 2022) since 2008, and fixed a few years later 3/20
Available MCQMC Software II BUGS (Lunn et al. 2012; MRC Biostatistic Unit n.d.) and Stan (Stan Development Team 2022) for Markov Chain Monte Carlo (MCMC) LatMRG (L’Ecuyer and Couture 1997), RNGStreams (L’Ecuyer, Simard, et al. 2002), SSJ (L’Ecuyer 2002; L’Ecuyer n.d.), and LatNetBuilder (Darmon et al. 2018) Talk M 11 Low discrepancy generators (Friedel and Keller 2002; Friedel and Keller 2001), SamplePack (Kollig and Keller 2002), Grünschloß (n.d.), and MatBuilder (Paulin et al. 2022) Talk M 12, Tu 12 Sobol’ direction numbers (Joe and Kuo 2003; Joe and Kuo 2008; Joe and Kuo n.d.) Fast CBC, Magic Point Shop, QMC4PDE and other code since 2004 at Nuyens (n.d.) Talk Tu 10:30 Multi-level software (Giles n.d.[a]; Giles n.d.[b]) Talk M 10:30 4/20
Available MCQMC Software III Data-driven error bounds and stopping criteria in GAIL (Choi, Ding, et al. 2021) and QMCPy (Choi, H., Jagadeeswaran, M. McCourt, et al. 2021; Choi, H., Jagadeeswaran, M. J. McCourt, et al. 2022) Talk Tu 11 Uncertainty quantification libraries Dakota (Adams et al. 2022), UQTk (Debusschere et al. 2004; Debusschere n.d.), and MUQ (Parno et al. 2014) Talk Tu 11:30 have some basic level low discrepancy sampling QMC framework in Julia since 2019 (SciML QuasiMonteCarlo.jl n.d.) by Robbe and others Talk M 17 Scrambled Sobol’ in SciPy (Virtanen et al. 2020) and PyTorch (Paszke et al. 2019) since several years ago TensorFlow QMC framework (TF Quant Finance Contributors 2021) since a year or so ago We have stand-alone algorithms, small libraries, and pieces of larger libraries 5/20
Software? Theory Explains, justifies, and leads to algorithms Assumptions =⇒ success can be viewed as failure =⇒ what went wrong Software Makes algorithms practical Solves (new) applications Eliminates do-it-yourself Applications Make societal impact Inspire new theory 6/20
Software? Theory Explains, justifies, and leads to algorithms Assumptions =⇒ success can be viewed as failure =⇒ what went wrong Software Makes algorithms practical Solves (new) applications Eliminates do-it-yourself Applications Make societal impact Inspire new theory The US Department of Energy is investing in studying how to develop great scientific software, including a recent Workshop on the Science of Scientific-Software Development and Use (Heroux 2019; U.S. Department of Energy, Office of Advanced Scientific Computing Research 2021) 6/20
T g(t) λ(t) dt = · · · = E[f(X)] expectation = [0,1]d f(x) dx integration ≈ 1 n n i=1 f(Xi) =: ^ µn Low Discrepancy Generator producing {X1, X2, . . . } that mimics the uniform distribution True Measure that defines the original integral, e.g., Lebesgue; embodies the transformation t = Ψ(x) Integrand g, which defines the original integral, plus the transformed version, f, to fit the LD generator Stopping Criterion based on a data-driven error bound, which determines how large n should be to ensure that |µ − ^ µn| ⩽ ε 7/20
T g(t) λ(t) dt = · · · = E[f(X)] expectation = [0,1]d f(x) dx integration ≈ 1 n n i=1 f(Xi) =: ^ µn Low Discrepancy Generator producing {X1, X2, . . . } that mimics the uniform distribution True Measure that defines the original integral, e.g., Lebesgue; embodies the transformation t = Ψ(x) Integrand g, which defines the original integral, plus the transformed version, f, to fit the LD generator Stopping Criterion based on a data-driven error bound, which determines how large n should be to ensure that |µ − ^ µn| ⩽ ε May we agree on a common software definition for these these objects like we have for floating point numbers or basic mathematical functions? 7/20
or Environment Language/library choices of the users are driven by familiarity and speed Favorites aligned with various communities, but no one dominates all Difference between prototyping environments and ninja written production code Growth of multi-processor environments Provides opportunity Complicates software development; where does parallelism really help? 8/20
or Environment Language/library choices of the users are driven by familiarity and speed Favorites aligned with various communities, but no one dominates all Difference between prototyping environments and ninja written production code Growth of multi-processor environments Provides opportunity Complicates software development; where does parallelism really help? Recommendations if you want to contribute software Realize your new idea as a demo in your favorite library Add a significant idea as a feature to other libraries Move basic features of your library to larger library Write wrappers or demos to connect other libraries to your library 8/20
(Logg et al. 2012) Want to demonstrate how our QMC software could work with a popular DE solver for DEs with random coefficients Partial success − d dx a(x, W) d dx u(x, W) = 200(2x − 1) 0 ⩽ x ⩽ 1, u(0) = u(1) = 0 a(x, W) = 1 + 0.6 d k=1 Wk Tk(2x − 1) k2 W ∼ U[−1, 1]d E[u(0.25, W)] = ? 9/20
(Logg et al. 2012) Want to demonstrate how our QMC software could work with a popular DE solver for DEs with random coefficients Partial success − d dx a(x, W) d dx u(x, W) = 200(2x − 1) 0 ⩽ x ⩽ 1, u(0) = u(1) = 0 a(x, W) = 1 + 0.6 d k=1 Wk Tk(2x − 1) k2 W ∼ U[−1, 1]d E[u(0.25, W)] = −1.7462 10−3 10−2 10−1 Error Tolerance 10−1 100 101 102 103 Computation Time (seconds) Comparison of IID and Sobol Points IID Points Sobol Points 9/20
(Logg et al. 2012) Want to demonstrate how our QMC software could work with a popular DE solver for DEs with random coefficients Partial success Why? Even with FEniCS/Dolfin’s extensive documentation we had difficulties Did not understand the latest version, so reverted to an older version Had to learn how to change the random instance of the coefficient without tiggering the JIT compiler Do not yet know how to express the random coefficient in terms of the covariance kernel But It works If we can overcome the challenges, will post a blog and demo at Choi, Hickernell, et al. (2020) 9/20
Library Requires A repository where Your library can be easily downloaded There is documentation Issues or queries can be posted There are updates Tests are run regularly to mitigate against bugs (Elementary) demos on how to use the software and highlighting its advantages Demos connecting your library with others Recommending your students and postdocs to use your library Welcoming instructions on how to contribute an algorithm or a demo 10/20
for Your Library Many senior researchers don’t code, and rely on transient team members (students, postdocs) Encourage a sense of ownership that lasts beyond the time your team members are part of your lab Encourage those outside your own group to contribute Those of us who understand theory and appreciate software need to encourage and steer popular software to ensure that it is implemented well. Recognize good software as valuable scholarly output Recognize research software engineer as a valued vocation; see Research Software Engineers International https://researchsoftware.org, Research Software Engineers are people who combine professional software expertise with an understanding of research. They go by various job titles but the term Research Software Engineer (RSE) is fast gaining international recognition. 11/20
theoretical developments are not yet implemented in readily available software? What potential MCQMC user communities are ill-served by existing software libraries? Should those of us interested in software gather regularly for show-and-tell to foster more collaboration? Join us for a lunch discussion at Bella Casa tomorrow, July 19, at 12:45 PM 12/20
M. et al. (May 2022). Dakota, A Multilevel Parallel Object-Oriented Framework for Design Optimization, Parameter Estimation, Uncertainty Quantification, and Sensitivity Analysis: Version 6.16 User’s Manual. Sandia National Laboratories. Bratley, P. and B. L. Fox (1988). “Algorithm 659: Implementing Sobol’s Quasirandom Sequence Generator”. In: ACM Trans. Math. Software 14, pp. 88–100. Bratley, P., B. L. Fox, and H. Niederreiter (1992). “Implementation and Tests of Low-Discrepancy Sequences”. In: ACM Trans. Model. Comput. Simul. 2, pp. 195–213. Choi, S.-C. T., F. J. H., R. Jagadeeswaran, M. McCourt, et al. (2021). QMCPy: A quasi-Monte Carlo Python Library. doi: 10.5281/zenodo.3964489. url: https://qmcsoftware.github.io/QMCSoftware/. Choi, S.-C. T., F. J. H., R. Jagadeeswaran, M. J. McCourt, et al. (2022). “Quasi-Monte Carlo Software”. In: Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Oxford, England, August 2020. Ed. by A. Keller. Springer Proceedings in Mathematics and Statistics. https://arxiv.org/abs/2102.07833. Springer, Cham, pp. 23–50. 13/20
T., Y. Ding, et al. (2021). GAIL: Guaranteed Automatic Integration Library (Versions 1.0–2.3.2). MATLAB software, http://gailgithub.github.io/GAIL_Dev/. doi: 10.5281/zenodo.4018189. Choi, S.-C. T., F. J. Hickernell, et al. (2020). QMCPy Blogs. url: https://qmcpy.wpcomstaging.com. Darmon, Y. et al. (2018). LatNet Builder. url: https://github.com/umontreal-simul/latnetbuilder. Debusschere, B. (n.d.). UQ Toolkit. url: https://www.sandia.gov/uqtoolkit/. Debusschere, B. et al. (2004). “Numerical Challenges in the Use of Polynomial Chaos Representations for Stochastic Processes”. In: SIAM J. Sci. Comput., pp. 698–719. url: 10.1137/S1064827503427741. Friedel, I. and A. Keller (2001). libseq – low discrepancy sequence library. url: http://www.multires.caltech.edu/software/libseq/. — (2002). “Fast Generation of Randomized Low-Discrepancy Point Sets”. In: Monte Carlo and Quasi-Monte Carlo Methods 2000. Ed. by K. T. Fang, F. J. H., and H. Niederreiter. Springer-Verlag, Berlin, pp. 257–273. 14/20
(n.d.[a]). Multi-Level Monte Carlo Software. url: https://people.maths.ox.ac.uk/gilesm/mlmc/. — (n.d.[b]). Multi-Level Quasi-Monte Carlo Software. url: http://people.maths.ox.ac.uk/~gilesm/mlqmc/. Grünschloß, L. (n.d.). Leonhard Grünschloß’s webpage. url: https://gruenschloss.org. Heroux, M. (2019). Research Software Science: A Scientific Approach to Understanding and Improving How We Develop and Use Software for Research. url: https://bssw.io/blog_posts/research-software-science-a-scientific-approach-to- understanding-and-improving-how-we-develop-and-use-software-for-research (visited on 2019). Hong, H. S. and F. J. H. (2003). “Algorithm 823: Implementing Scrambled Digital Nets”. In: ACM Trans. Math. Software 29, pp. 95–109. doi: 10.1145/779359.779360. Joe, S. and F. Y. Kuo (2003). “Remark on Algorithm 659: Implementing Sobol’s quasirandom sequence generator”. In: ACM Trans. Math. Software 29, pp. 49–57. 15/20
and F. Y. Kuo (2008). “Constructing Sobol sequences with better two-dimensional projections”. In: SIAM J. Sci. Comput. 30, pp. 2635–2654. — (n.d.). Sobol Sequence Generator. url: https://web.maths.unsw.edu.au/~fkuo/sobol/. Kollig, T. and A. Keller (2002). SamplePack. url: https://www.uni-kl.de/AG-Heinrich/SamplePack.html. Kucherenko, S. (n.d.). BRODA. url: https://www.broda.co.uk/index.html. L’Ecuyer, P. (2002). “SSJ: A Framework for Stochastic Simulation in Java”. In: 2002 Winter Simulation Conference. IEEE Press, pp. 234–242. — (2017). “History of uniform random number generation”. In: Proceedings of the 2017 Winter Simulation Conference. Ed. by W. K. V. Chan et al. doi: 10.1109/WSC.2017.8247790. — (n.d.). SSJ: Stochastic Simulation in Java. url: https://github.com/umontreal-simul/ssj. 16/20
and R. Couture (1997). “An Implementation of the Lattice and Spectral Tests for Multiple Recursive Linear Random Number Generators”. In: INFORMS J. Comput. 9, pp. 206–217. L’Ecuyer, P., R. Simard, et al. (2002). “An Objected-Oriented Random-Number Package with Many Long Streams and Substreams”. In: Oper. Res. 50, pp. 1073–1075. doi: 10.1287/opre.50.6.1073.358. Lepage, G. P. (1978). “A new algorithm for adaptive multidimensional integration”. In: J. Comput. Phys. 27, pp. 192–203. — (2021). “Adaptive multidimensional integration: VEGAS enhanced”. In: J. Comput. Phys. 439, p. 110386. Logg, A., G. N. Wells, and J. Hake (2012). “DOLFIN: a C++/Python Finite Element Library”. In: Automated Solution of Differential Equations by the Finite Element Method. Ed. by K. M. A. Logg and G. N. Wells. Vol. 84. Lecture Notes in Computational Science and Engineering. Springer. Chap. 10. Lunn, D. et al. (2012). The BUGS Book: A Practical Introduction to Bayesian Analysis. Chapman & Hall. 17/20
Unit (n.d.). The BUGS Project. url: https://www.mrc-bsu.cam.ac.uk/software/bugs/. Nuyens, D. (n.d.). Dirk’s Homepage. url: https://people.cs.kuleuven.be/~dirk.nuyens/. Owen, A. B. (2022). “On dropping the first Sobol’ point”. In: Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Oxford, England, August 2020, pp. 71–86. url: https://arxiv.org/pdf/2008.08051.pdf. Papageorgiou, A. (n.d.). FinDer. url: http://www.cs.columbia.edu/~ap/html/finder.html. Parno, M. et al. (2014). MIT uncertainty quantification (MUQ) library. url: https://mituq.bitbucket.io/source/_site/index.html. Paskov, S. and J. Traub (1995). “Faster Valuation of Financial Derivatives”. In: J. Portfolio Management 22, pp. 113–120. Paszke, A. et al. (2019). “PyTorch: An imperative style, high-performance deep learning library”. In: Advances in neural information processing systems 32, pp. 8026–8037. 18/20
et al. (Aug. 2022). “MatBuilder: Mastering Sampling Uniformity Over Projections”. In: ACM Transactions on Graphics (Proceedings of SIGGRAPH) 41.4. doi: https://doi.org/10.1145/3528223.3530063. SciML QuasiMonteCarlo.jl (n.d.). url: https://github.com/SciML/QuasiMonteCarlo.jl. SciPy Developers (2020). scipy discussion of Sobol’ sequence implementation. url: https://github.com/scipy/scipy/pull/10844. Stan Development Team (2022). Stan Modeling Language Users Guide and Reference Manual, version 2.30. url: http://mc-stan.org. TF Quant Finance Contributors (2021). Quasi Monte-Carlo Methods. url: https://github.com/google/tf-quant-finance/math/qmc. The MathWorks, Inc. (2022). MATLAB R2022b. Natick, MA. The Numerical Algorithms Group (2021). The NAG Library. Mark 27. Oxford. U.S. Department of Energy, Office of Advanced Scientific Computing Research (2021). Workshop on the Science of Scientific-Software Development and Use. url: https://web.cvent.com/event/1b7d7c3a-e9b4-409d-ae2b-284779cfe72f/summary. 19/20
et al. (2020). “SciPy 1.0: fundamental algorithms for scientific computing in Python”. In: Nature Methods 17.3, pp. 261–272. Wikipedia (n.d.). RANDU. url: https://en.wikipedia.org/wiki/RANDU. 20/20