Slide 1

Slide 1 text

Data-Driven Error Bounds for Monte Carlo
Fred J. Hickernell, Illinois Institute of Technology
April 10, 2025
Thanks to
• Lulu Kang, Devon Lin, and IMSI
• My collaborators
• US National Science Foundation #2316011
Slides at speakerdeck.com/fjhickernell/imsi-short-talk-2025-04-10

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

Next 25 minutes

Slide 5

Slide 5 text

Next 25 minutes
• What must you assume to obtain data-driven error bounds?

Slide 6

Slide 6 text

Next 25 minutes
• What must you assume to obtain data-driven error bounds?
• approach

Slide 7

Slide 7 text

Next 25 minutes
• What must you assume to obtain data-driven error bounds?
• approach
• approach

Slide 8

Slide 8 text

Next 25 minutes
• What must you assume to obtain data-driven error bounds?
• approach
• approach
• IID Monte Carlo

Slide 9

Slide 9 text

Next 25 minutes
• What must you assume to obtain data-driven error bounds?
• approach
• approach
• IID Monte Carlo
• Quasi-Monte Carlo

Slide 10

Slide 10 text

Next 25 minutes
• What must you assume to obtain data-driven error bounds?
• approach
• approach
• IID Monte Carlo
• Quasi-Monte Carlo
• Discussion and future work

Slide 11

Slide 11 text

Monte Carlo

Slide 12

Slide 12 text

Monte Carlo
• You want to know some property of a random variable, $Y$, e.g., its mean, probability density, or quartile

Slide 13

Slide 13 text

Monte Carlo
• You want to know some property of a random variable, $Y$, e.g., its mean, probability density, or quartile
• You generate samples $Y_1, Y_2, \ldots, Y_n$, which you use to approximate the property of $Y$

Slide 14

Slide 14 text

Monte Carlo
• You want to know some property of a random variable, $Y$, e.g., its mean, probability density, or quartile
• You generate samples $Y_1, Y_2, \ldots, Y_n$, which you use to approximate the property of $Y$
• How do you generate samples?
• How do you approximate the mean, density, quartile, etc.?
• How many samples do you need and/or what error bound can you confidently assert?

Slide 15

Slide 15 text

Examples of Monte Carlo
$Y = f(X)$, where $X \in \mathbb{R}^d$ is easy to sample and $d$ may be dozens or hundreds
• $Y = f(X) =$ option payoff, and the property sought is the option price
• $Y = f(X) =$ underground water pressure with random rock porosity, and the property sought is the average water pressure
• $Y = f(X) =$ pixel intensity from a random ray, and the property sought is the average pixel intensity
In each case the property sought is $\mu := \mathbb{E}(Y)$

Slide 16

Slide 16 text

IID Monte Carlo Confidence Intervals

Slide 17

Slide 17 text

IID Monte Carlo Confidence Intervals
• Monte Carlo approximates
$$\mu := \mathbb{E}(Y) = \mathbb{E}\bigl[f(\underbrace{X}_{\sim\,\mathcal{U}[0,1]^d})\bigr] = \int_{[0,1]^d} f(x)\,\mathrm{d}x$$
by $\hat{\mu}_n := \frac{1}{n}\sum_{i=1}^{n} f(x_i)$, for $\{x_i\}_{i=1}^{n} \overset{\text{IID}}{\sim} \mathcal{U}[0,1]^d$

Slide 18

Slide 18 text

IID Monte Carlo Confidence Intervals
• Monte Carlo approximates
$$\mu := \mathbb{E}(Y) = \mathbb{E}\bigl[f(\underbrace{X}_{\sim\,\mathcal{U}[0,1]^d})\bigr] = \int_{[0,1]^d} f(x)\,\mathrm{d}x$$
by $\hat{\mu}_n := \frac{1}{n}\sum_{i=1}^{n} f(x_i)$, for $\{x_i\}_{i=1}^{n} \overset{\text{IID}}{\sim} \mathcal{U}[0,1]^d$
• Chebyshev: $\mathbb{P}\bigl[\,|\mu - \hat{\mu}_n| \le \sigma/\sqrt{\alpha n}\,\bigr] \ge 1 - \alpha$ for $f \in \{f : \operatorname{Std}(f) \le \sigma\}$

Slide 19

Slide 19 text

IID Monte Carlo Confidence Intervals
• Monte Carlo approximates
$$\mu := \mathbb{E}(Y) = \mathbb{E}\bigl[f(\underbrace{X}_{\sim\,\mathcal{U}[0,1]^d})\bigr] = \int_{[0,1]^d} f(x)\,\mathrm{d}x$$
by $\hat{\mu}_n := \frac{1}{n}\sum_{i=1}^{n} f(x_i)$, for $\{x_i\}_{i=1}^{n} \overset{\text{IID}}{\sim} \mathcal{U}[0,1]^d$
• Chebyshev: $\mathbb{P}\bigl[\,|\mu - \hat{\mu}_n| \le \sigma/\sqrt{\alpha n}\,\bigr] \ge 1 - \alpha$ for $f \in \{f : \operatorname{Std}(f) \le \sigma\}$
• Empirical Bernstein [MP09]: $\mathbb{P}\Bigl[\,|\mu - \hat{\mu}_n| \le \sqrt{\tfrac{2\hat{\sigma}_n^2 \log(4/\alpha)}{n}} + \tfrac{7\log(4/\alpha)}{3(n-1)}\,\Bigr] \ge 1 - \alpha$ for $f \in \{f \in \mathcal{L}^2[0,1]^d : 0 \le f(x) \le 1\}$, where $\hat{\sigma}_n^2$ is the sample variance
• See [WSR24] for betting confidence intervals
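
The two intervals above can be computed directly from the sampled values. Below is a minimal sketch in Python, assuming an integrand bounded in $[0,1]$; the function `f` is a hypothetical stand-in, not one from the talk.

```python
# Half-widths of the Chebyshev and empirical-Bernstein intervals from n IID samples.
import numpy as np

rng = np.random.default_rng(7)

def f(x):
    # hypothetical integrand on [0,1]^d with values in [0,1]
    return np.prod(0.5 + 0.5 * np.cos(2 * np.pi * x), axis=1)

def chebyshev_halfwidth(sigma, n, alpha):
    # P[|mu - mu_hat| <= sigma / sqrt(alpha * n)] >= 1 - alpha when Std(f) <= sigma
    return sigma / np.sqrt(alpha * n)

def empirical_bernstein_halfwidth(y, alpha):
    # Maurer-Pontil [MP09] half-width for 0 <= f <= 1, using the sample variance
    n = len(y)
    var_hat = np.var(y, ddof=1)
    return np.sqrt(2 * var_hat * np.log(4 / alpha) / n) + 7 * np.log(4 / alpha) / (3 * (n - 1))

d, n, alpha = 6, 2 ** 14, 0.05
y = f(rng.random((n, d)))
print("mu_hat                    =", y.mean())
print("Chebyshev half-width      =", chebyshev_halfwidth(0.5, n, alpha))  # Std(f) <= 0.5 when 0 <= f <= 1
print("empirical Bernstein width =", empirical_bernstein_halfwidth(y, alpha))
```

Because the empirical-Bernstein half-width uses the sample variance, it is usually much tighter than Chebyshev when the integrand's variance is small relative to its range.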

Slide 20

Slide 20 text

IID Monte Carlo Confidence Intervals

Slide 21

Slide 21 text

IID Monte Carlo Confidence Intervals
• Monte Carlo approximates
$$\mu := \mathbb{E}(Y) = \mathbb{E}\bigl[f(\underbrace{X}_{\sim\,\mathcal{U}[0,1]^d})\bigr] = \int_{[0,1]^d} f(x)\,\mathrm{d}x$$
by $\hat{\mu}_n := \frac{1}{n}\sum_{i=1}^{n} f(x_i)$, for $\{x_i\}_{i=1}^{n} \overset{\text{IID}}{\sim} \mathcal{U}[0,1]^d$

Slide 22

Slide 22 text

IID Monte Carlo Confidence Intervals
• Monte Carlo approximates
$$\mu := \mathbb{E}(Y) = \mathbb{E}\bigl[f(\underbrace{X}_{\sim\,\mathcal{U}[0,1]^d})\bigr] = \int_{[0,1]^d} f(x)\,\mathrm{d}x$$
by $\hat{\mu}_n := \frac{1}{n}\sum_{i=1}^{n} f(x_i)$, for $\{x_i\}_{i=1}^{n} \overset{\text{IID}}{\sim} \mathcal{U}[0,1]^d$
• Central Limit Theorem: compute the sample standard deviation, $\hat{\sigma}_n$; then $\mathbb{P}\bigl[\,|\mu - \hat{\mu}_n| \le 1.96\,\mathrm{fdg}(n)\,\hat{\sigma}_n/\sqrt{n}\,\bigr] \ge 95\%$ for $f$ in a cone of nice functions

Slide 23

Slide 23 text

IID Monte Carlo Confidence Intervals
• Monte Carlo approximates
$$\mu := \mathbb{E}(Y) = \mathbb{E}\bigl[f(\underbrace{X}_{\sim\,\mathcal{U}[0,1]^d})\bigr] = \int_{[0,1]^d} f(x)\,\mathrm{d}x$$
by $\hat{\mu}_n := \frac{1}{n}\sum_{i=1}^{n} f(x_i)$, for $\{x_i\}_{i=1}^{n} \overset{\text{IID}}{\sim} \mathcal{U}[0,1]^d$
• Central Limit Theorem: compute the sample standard deviation, $\hat{\sigma}_n$; then $\mathbb{P}\bigl[\,|\mu - \hat{\mu}_n| \le 1.96\,\mathrm{fdg}(n)\,\hat{\sigma}_n/\sqrt{n}\,\bigr] \ge 95\%$ for $f$ in a cone of nice functions
• [HJLO13] gives a rigorous approach using Berry-Esseen inequalities

Slide 24

Slide 24 text

IID Monte Carlo Confidence Intervals
• Monte Carlo approximates
$$\mu := \mathbb{E}(Y) = \mathbb{E}\bigl[f(\underbrace{X}_{\sim\,\mathcal{U}[0,1]^d})\bigr] = \int_{[0,1]^d} f(x)\,\mathrm{d}x$$
by $\hat{\mu}_n := \frac{1}{n}\sum_{i=1}^{n} f(x_i)$, for $\{x_i\}_{i=1}^{n} \overset{\text{IID}}{\sim} \mathcal{U}[0,1]^d$
• Central Limit Theorem: compute the sample standard deviation, $\hat{\sigma}_n$; then $\mathbb{P}\bigl[\,|\mu - \hat{\mu}_n| \le 1.96\,\mathrm{fdg}(n)\,\hat{\sigma}_n/\sqrt{n}\,\bigr] \ge 95\%$ for $f$ in a cone of nice functions
• [HJLO13] gives a rigorous approach using Berry-Esseen inequalities
• Bahadur and Savage [BS56] explain why the cone should be non-convex
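
Below is a minimal sketch of the plain CLT-style interval on this slide, with the inflation factor $\mathrm{fdg}(n)$ simply set to 1 and without the Berry-Esseen machinery of [HJLO13]; the integrand is again a hypothetical stand-in.

```python
# The plain CLT-style interval: sample mean plus/minus 1.96 * sigma_hat / sqrt(n).
import numpy as np

rng = np.random.default_rng(11)

def f(x):
    # hypothetical smooth integrand on [0,1]^d
    return np.cos(np.linalg.norm(x, axis=1))

d, n = 6, 2 ** 16
y = f(rng.random((n, d)))
mu_hat = y.mean()
halfwidth = 1.96 * y.std(ddof=1) / np.sqrt(n)   # approximate 95% confidence half-width
print(f"mu_hat = {mu_hat:.5f} +/- {halfwidth:.5f} (approximate 95% CI)")
```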

Slide 25

Slide 25 text

Low discrepancy (LD) sequences can beat grids and IID for moderate to large $d$
Quasi-Monte Carlo (QMC) methods [HKS25]
[Figure: four scatter plots of the first two coordinates, $x_{i1}$ versus $x_{i2}$, for $d = 6$: 64 IID points, 64 grid points, 64 Sobol' points, and 64 lattice points]

Slide 26

Slide 26 text

Low discrepancy (LD) sequences can beat grids and IID for moderate to large $d$
Quasi-Monte Carlo methods
[Figure: relative error $|(\mu - \hat{\mu}_n)/\mu|$ versus sample size $n$ for $\mu = \int_{\mathbb{R}^6} \cos(\lVert x \rVert) \exp(-\lVert x \rVert^2)\,\mathrm{d}x$ [Kei96]: grid $\mathcal{O}(n^{-1/5})$, IID MC median $\mathcal{O}(n^{-1/2})$, LD median $\mathcal{O}(n^{-1})$]
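
The sketch below shows the kind of comparison in the plot, assuming the Gaussian change of variables indicated in the comments and SciPy's scrambled Sobol' generator; this is not the code behind the figure, and the seeds and sample sizes are arbitrary choices.

```python
# Compare IID Monte Carlo and scrambled Sobol' points on the Keister integral
# mu = int_{R^6} cos(||x||) exp(-||x||^2) dx [Kei96].
import numpy as np
from scipy.stats import norm, qmc

d = 6

def keister_from_uniform(u):
    # cos(||x||) exp(-||x||^2) integrated over R^d equals
    # pi^(d/2) * E[cos(||Z|| / sqrt(2))] with Z ~ N(0, I_d);
    # map u in [0,1]^d to Z through the inverse normal CDF.
    z = norm.ppf(u)
    return np.pi ** (u.shape[1] / 2) * np.cos(np.linalg.norm(z, axis=1) / np.sqrt(2))

rng = np.random.default_rng(0)
for m in range(8, 17, 2):
    n = 2 ** m
    iid_est = keister_from_uniform(rng.random((n, d))).mean()
    ld_points = qmc.Sobol(d, scramble=True, seed=0).random_base2(m)
    sobol_est = keister_from_uniform(ld_points).mean()
    print(f"n = 2^{m:2d}: IID estimate = {iid_est:.5f}, Sobol' estimate = {sobol_est:.5f}")
```

The Sobol' estimates typically settle down far faster with $n$ than the IID ones, which is the behavior the log-log plot on this slide illustrates.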

Slide 27

Slide 27 text

Discrepancy measures the quality of $x_0, x_1, \ldots$ [H00]
$$\mu := \underbrace{\mathbb{E}\bigl[f(X)\bigr]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n$$
$$|\mu - \hat{\mu}_n| \le \underbrace{\operatorname{discrepancy}(\{x_i\}_{i=1}^{n})}_{\text{norm of the error functional}} \times \underbrace{\operatorname{variation}(f)}_{\text{semi-norm}} \quad \text{(a tight bound)}$$

Slide 28

Slide 28 text

Discrepancy measures the quality of $x_0, x_1, \ldots$ [H00]
$$\mu := \underbrace{\mathbb{E}\bigl[f(X)\bigr]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n$$
$$|\mu - \hat{\mu}_n| \le \underbrace{\operatorname{discrepancy}(\{x_i\}_{i=1}^{n})}_{\text{norm of the error functional}} \times \underbrace{\operatorname{variation}(f)}_{\text{semi-norm}} \quad \text{(a tight bound)}$$
If $f$ is in a Hilbert space $\mathcal{H}$ with reproducing kernel $K$, then
$$\operatorname{discrepancy}^2(\{x_i\}_{i=1}^{n}) = \int_{[0,1]^d \times [0,1]^d} K(t,x)\,\mathrm{d}t\,\mathrm{d}x - \frac{2}{n}\sum_{i=1}^{n}\int_{[0,1]^d} K(t,x_i)\,\mathrm{d}t + \frac{1}{n^2}\sum_{i,j=1}^{n} K(x_i,x_j) \quad (\mathcal{O}(dn^2)\ \text{operations})$$
$$\operatorname{variation}(f) = \inf_{c \in \mathbb{R}} \lVert f - c \rVert_{\mathcal{H}} \quad (\text{infeasible to compute})$$
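
The kernel double-sum formula above is straightforward to evaluate in $\mathcal{O}(dn^2)$ operations once a kernel is fixed. Here is a sketch assuming the kernel $K(t,x) = \prod_j \bigl(1 - \max(t_j, x_j)\bigr)$, chosen only because its integrals have closed forms; this choice reproduces the classical $L^2$-star discrepancy.

```python
# Kernel double-sum discrepancy in O(d n^2) operations.
import numpy as np
from scipy.stats import qmc

def l2_star_discrepancy(x):
    # x: (n, d) points in [0,1]^d; kernel K(t, x) = prod_j (1 - max(t_j, x_j))
    n, d = x.shape
    term1 = (1.0 / 3.0) ** d                                         # int int K(t, x) dt dx
    term2 = (2.0 / n) * np.prod((1.0 - x ** 2) / 2.0, axis=1).sum()  # (2/n) sum_i int K(t, x_i) dt
    pairwise_max = np.maximum(x[:, None, :], x[None, :, :])          # shape (n, n, d)
    term3 = np.prod(1.0 - pairwise_max, axis=2).sum() / n ** 2       # (1/n^2) sum_{i,j} K(x_i, x_j)
    return np.sqrt(term1 - term2 + term3)

rng = np.random.default_rng(3)
print("64 IID points    :", l2_star_discrepancy(rng.random((64, 6))))
print("64 Sobol' points :", l2_star_discrepancy(qmc.Sobol(6, scramble=True, seed=3).random_base2(6)))
```

The Sobol' point set should report a noticeably smaller discrepancy than the IID one, matching the point-set comparison two slides earlier.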

Slide 29

Slide 29 text

Deterministic stopping rules for QMC [HJ16, JH16, SJ24]
$$\mu := \underbrace{\mathbb{E}\bigl[f(X)\bigr]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n, \quad \text{want } |\mu - \hat{\mu}_n| \le \varepsilon$$
• For lattices, $\hat{f}(k) := \int_{[0,1]^d} f(x)\,\mathrm{e}^{-2\pi\sqrt{-1}\,k^T x}\,\mathrm{d}x$, $f(x) = \sum_{k \in \mathbb{Z}^d} \hat{f}(k)\,\mathrm{e}^{2\pi\sqrt{-1}\,k^T x}$, $\mu = \hat{f}(0)$
• $\mu - \hat{\mu}_n$ is due to aliasing ($\mathrm{e}^{2\pi\sqrt{-1}\,k^T x_i}$ constant for all $i$)
• Assume that $f$ lies in a cone of functions whose Fourier coefficients decay reasonably
• Approximate $\hat{f}(k)$ by a one-dimensional FFT of $\{f(x_i)\}_{i=0}^{n-1}$
• FFT approximations provide a rigorous data-driven bound on $\mu - \hat{\mu}_n$ with $\mathcal{O}(n \log n)$ work
• Similarly for digital nets

Slide 30

Slide 30 text

Deterministic stopping rules for QMC [HJ16, JH16, SJ24]
$$\mu := \underbrace{\mathbb{E}\bigl[f(X)\bigr]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n, \quad \text{want } |\mu - \hat{\mu}_n| \le \varepsilon$$
• For lattices, $\hat{f}(k) := \int_{[0,1]^d} f(x)\,\mathrm{e}^{-2\pi\sqrt{-1}\,k^T x}\,\mathrm{d}x$, $f(x) = \sum_{k \in \mathbb{Z}^d} \hat{f}(k)\,\mathrm{e}^{2\pi\sqrt{-1}\,k^T x}$, $\mu = \hat{f}(0)$
• $\mu - \hat{\mu}_n$ is due to aliasing ($\mathrm{e}^{2\pi\sqrt{-1}\,k^T x_i}$ constant for all $i$)
• Assume that $f$ lies in a cone of functions whose Fourier coefficients decay reasonably
• Approximate $\hat{f}(k)$ by a one-dimensional FFT of $\{f(x_i)\}_{i=0}^{n-1}$
• FFT approximations provide a rigorous data-driven bound on $\mu - \hat{\mu}_n$ with $\mathcal{O}(n \log n)$ work
• Similarly for digital nets
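
The following is only a rough sketch of the flavor of the lattice-plus-FFT idea; it is not the algorithm, the error bound, or the cone conditions of [HJ16, JH16, SJ24], and the generating vector below is a hypothetical, untuned choice.

```python
# Rank-1 lattice points, the FFT of the data, and a crude high-frequency
# coefficient sum used as an error proxy.
import numpy as np

def rank1_lattice(n, g):
    # x_i = frac(i * g / n), i = 0, ..., n - 1
    i = np.arange(n)[:, None]
    return np.mod(i * np.asarray(g)[None, :] / n, 1.0)

def f(x):
    # hypothetical smooth periodic integrand; its true integral is 1
    return np.prod(1.0 + 0.5 * np.cos(2 * np.pi * x), axis=1)

n = 2 ** 12
g = [1, 1595, 2127, 3009, 1405, 2741]   # hypothetical generating vector, not a tuned one
y = f(rank1_lattice(n, g))
yhat = np.fft.fft(y) / n                # discrete Fourier coefficients of the data
mu_hat = yhat[0].real                   # equals the sample mean
# Indices near n/2 carry the highest frequencies (index k and n - k alias the
# same |frequency|); if the coefficients decay, a small high-frequency sum
# suggests that little aliasing error remains.
high_freq_sum = np.abs(yhat[n // 4 : 3 * n // 4]).sum()
print("mu_hat =", mu_hat, " high-frequency coefficient sum =", high_freq_sum)
```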

Slide 31

Slide 31 text

Bayesian stopping rules for QMC [JH19, JH22]
$$\mu := \underbrace{\mathbb{E}\bigl[f(X)\bigr]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n, \quad \text{want } |\mu - \hat{\mu}_n| \le \varepsilon$$
• Assume $f$ is a Gaussian stochastic process with covariance kernel $K$, where the hyper-parameters are properly tuned
• Construct a Bayesian credible interval for $\mu$
• If $K$ is chosen to fit the lattice/digital sequences, then the $\mathcal{O}(n^3)$ computation normally required reduces to $\mathcal{O}(n \log n)$
• Works for $f$ in …
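
Below is a generic sketch of Bayesian cubature with a zero-mean Gaussian-process prior, assuming the untuned kernel $K(t,x) = \prod_j \bigl(1 + \min(t_j, x_j)\bigr)$ so that the needed integrals are closed-form. This is the naive $\mathcal{O}(n^3)$ version; the matched lattice/digital-net kernels and hyper-parameter tuning of [JH19, JH22] are what reduce the cost to $\mathcal{O}(n \log n)$.

```python
# Generic O(n^3) Bayesian cubature sketch: credible interval for mu = int f.
import numpy as np

def kernel(a, b):
    # K(a_i, b_j) = prod_k (1 + min(a_ik, b_jk))
    return np.prod(1.0 + np.minimum(a[:, None, :], b[None, :, :]), axis=2)

def bayesian_cubature(x, y, jitter=1e-10):
    n, d = x.shape
    gram = kernel(x, x) + jitter * np.eye(n)
    k_int = np.prod(1.0 + x - 0.5 * x ** 2, axis=1)   # int_[0,1]^d K(t, x_i) dt
    kk_int = (4.0 / 3.0) ** d                         # int int K(t, x) dt dx
    w = np.linalg.solve(gram, k_int)
    post_mean = w @ y                                 # posterior mean of mu
    post_var = max(kk_int - k_int @ w, 0.0)           # posterior variance of mu
    return post_mean, 1.96 * np.sqrt(post_var)        # ~95% credible half-width

rng = np.random.default_rng(5)
x = rng.random((128, 6))
y = np.cos(np.linalg.norm(x, axis=1))                 # hypothetical integrand
mu_post, halfwidth = bayesian_cubature(x, y)
print(f"posterior mean = {mu_post:.5f} +/- {halfwidth:.5f}")
```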

Slide 32

Slide 32 text

Keister example sample size and time
$$\mu = \int_{\mathbb{R}^6} \cos(\lVert x \rVert)\exp(-\lVert x \rVert^2)\,\mathrm{d}x \quad \text{[Kei96]}$$
[Figure: two plots versus the relative error tolerance $\varepsilon$, one of the sample size $n$ and one of the computation time (sec), for the Sobol' deterministic, lattice deterministic, Sobol' Bayesian, and lattice Bayesian stopping rules]

Slide 33

Slide 33 text

Discussion and future work
• Adaptive 1-D integration
• Adaptive (quasi-)Monte Carlo
• Adaptive multi-level (quasi-)Monte Carlo
• Adaptive FEM for PDEs w/ a posteriori error bounds
• Adaptive multi-level (quasi-)Monte Carlo with FEM for PDEs with randomness
• For replications of QMC for bounded $f$, see [JO25]
• For another view, see [LENOT24]

Slide 34

Slide 34 text

MCM2025Chicago.org, July 28 – August 1
Plenary Speakers
• Nicholas Chopin, ENSAE
• Peter Glynn, Stanford U
• Roshan Joseph, Georgia Tech
• Christiane Lemieux, U Waterloo
• Matt Pharr, NVIDIA
• Veronika Rockova, U Chicago
• Uros Seljak, U California, Berkeley
• Michaela Szölgyenyi, U Klagenfurt

Slide 35

Slide 35 text

References
[BS56] R. R. Bahadur and L. J. Savage, The nonexistence of certain statistical procedures in nonparametric problems, Ann. Math. Stat. 27 (1956), 1115–1122.
[Gil15] M. B. Giles, Multilevel Monte Carlo methods, Acta Numer. 24 (2015), 259–328.
[H00] F. J. Hickernell, What affects the accuracy of quasi-Monte Carlo quadrature?, Monte Carlo and Quasi-Monte Carlo Methods 1998 (H. Niederreiter and J. Spanier, eds.), Springer-Verlag, Berlin, 2000, pp. 16–55.
[HJLO13] F. J. Hickernell, L. Jiang, Y. Liu, and A. B. Owen, Guaranteed conservative fixed width confidence intervals via Monte Carlo sampling, Monte Carlo and Quasi-Monte Carlo Methods 2012 (J. Dick, F. Y. Kuo, G. W. Peters, and I. H. Sloan, eds.), Springer Proceedings in Mathematics and Statistics, vol. 65, Springer-Verlag, Berlin, 2013, pp. 105–128.
[HJ16] F. J. Hickernell and Ll. A. Jiménez Rugama, Reliable adaptive cubature using digital sequences, Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Leuven, Belgium, April 2014 (R. Cools and D. Nuyens, eds.), Springer Proceedings in Mathematics and Statistics, vol. 163, Springer-Verlag, Berlin, 2016, pp. 367–383.

Slide 36

Slide 36 text

References
[HKS25] F. J. Hickernell, N. Kirk, and A. G. Sorokin, Quasi-Monte Carlo methods: What, why, and how?, submitted to MCQMC 2024 proceedings, https://doi.org/10.48550/arXiv.2502.03644, 2025+.
[JH19] R. Jagadeeswaran and F. J. Hickernell, Fast automatic Bayesian cubature using lattice sampling, Stat. Comput. 29 (2019), 1215–1229.
[JH22] R. Jagadeeswaran and F. J. Hickernell, Fast automatic Bayesian cubature using Sobol' sampling, Advances in Modeling and Simulation: Festschrift in Honour of Pierre L'Ecuyer (Z. Botev, A. Keller, C. Lemieux, and B. Tuffin, eds.), Springer, Cham, 2022, pp. 301–318.
[JO25] A. Jain and A. B. Owen, Empirical Bernstein and betting confidence intervals for randomized quasi-Monte Carlo, in preparation to be submitted for publication, 2025+.
[JH16] Ll. A. Jiménez Rugama and F. J. Hickernell, Adaptive multidimensional integration based on rank-1 lattices, Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Leuven, Belgium, April 2014 (R. Cools and D. Nuyens, eds.), Springer Proceedings in Mathematics and Statistics, vol. 163, Springer-Verlag, Berlin, 2016, pp. 407–422.

Slide 37

Slide 37 text

References
[Kei96] B. D. Keister, Multidimensional quadrature algorithms, Computers in Physics 10 (1996), 119–122.
[LENOT24] P. L'Ecuyer, M. K. Nakayama, A. B. Owen, and B. Tuffin, Confidence intervals for randomized quasi-Monte Carlo estimators, WSC '23: Proceedings of the Winter Simulation Conference, 2024, pp. 445–456.
[MP09] A. Maurer and M. Pontil, Empirical Bernstein bounds and sample variance penalization, Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009, pp. 1–9.
[SJ24] A. G. Sorokin and R. Jagadeeswaran, On bounding and approximating functions of multiple expectations using quasi-Monte Carlo, Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Linz, Austria, July 2022 (A. Hinrichs, P. Kritzer, and F. Pillichshammer, eds.), Springer Proceedings in Mathematics and Statistics, Springer, Cham, 2024, pp. 583–589.
[WSR24] I. Waudby-Smith and A. Ramdas, Estimating means of bounded random variables by betting, J. Roy. Statist. Soc. B 86 (2024), 1–27.

Slide 38

Slide 38 text

Multilevel methods reduce computational cost [Gil15]
$$\mu := \underbrace{\mathbb{E}\bigl[f(X)\bigr]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n$$
• The cost to evaluate $f(x_i)$ is typically $\mathcal{O}(d)$, so the cost to obtain $|\mu - \hat{\mu}_n| \le \varepsilon$ is typically $\mathcal{O}(d\varepsilon^{-1-\delta})$
• If one can approximate $f$ by lower dimensional approximations, $f_s : [0,1]^s \to \mathbb{R}$, then
$$\mu = \underbrace{\mathbb{E}\bigl[f_{s_1}(X_{1:s_1})\bigr]}_{\mu^{(1)}} + \underbrace{\mathbb{E}\bigl[f_{s_2}(X_{1:s_2}) - f_{s_1}(X_{1:s_1})\bigr]}_{\mu^{(2)}} + \cdots + \underbrace{\mathbb{E}\bigl[f(X_{1:d}) - f_{s_{L-1}}(X_{1:s_{L-1}})\bigr]}_{\mu^{(L)}}$$
• Balance the $\mathcal{O}(n_l s_l)$ cost to approximate each $\mu^{(l)}$ well, and the total cost to obtain $|\mu - \hat{\mu}_n| \le \varepsilon$ may be as small as $\mathcal{O}(\varepsilon^{-1-\delta})$ as $d, \varepsilon^{-1} \to \infty$
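
Here is a two-level toy sketch of the telescoping idea, with a hypothetical product integrand whose later coordinates matter less and less; the cost model and the level/sample-size balancing of [Gil15] are not implemented.

```python
# Two-level sketch: mu = E[f_s(X_{1:s})] + E[f(X_{1:d}) - f_s(X_{1:s})],
# with many cheap samples on the coarse term and few on the correction.
import numpy as np

rng = np.random.default_rng(1)
d, s = 64, 4
c = 1.0 / (1.0 + np.arange(d)) ** 2            # later coordinates matter less

def f(x, ncoord):
    # hypothetical product integrand using only the first `ncoord` coordinates
    return np.prod(1.0 + c[:ncoord] * (x[:, :ncoord] - 0.5), axis=1)

n_coarse, n_fine = 2 ** 18, 2 ** 12            # many cheap samples, few expensive ones
x_coarse = rng.random((n_coarse, s))
level0 = f(x_coarse, s).mean()                 # estimates E[f_s]
x_fine = rng.random((n_fine, d))               # same samples used for both terms below
level1 = (f(x_fine, d) - f(x_fine, s)).mean()  # estimates E[f - f_s], small variance
print("two-level estimate:", level0 + level1, "(exact mean is 1)")
```

The correction term has small variance because the low-dimensional approximation already captures most of the variation, so few of the expensive full-dimensional evaluations are needed.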