Fred J. Hickernell
October 27, 2021

# Dagstuhl Bayesian Cubature LD QMCPy


## Transcript

1. Bayesian Cubature with Low Discrepancy Sequences in QMCPy
Fred J. Hickernell, Jagadeeswaran Rathinavel & Aleksei Sorokin
Department of Applied Mathematics & Center for Interdisciplinary Scientific Computation
Illinois Institute of Technology [email protected] mypages.iit.edu/~hickernell
Thanks to the GAIL and QMCPy teams
Thanks to the organizers, especially during these unusual times
Disclosure: I am multicultural (computational mathematician & statistician)
Slides at speakerdeck.com/fjhickernell/dagstuhl-bayesian-cubature-ld-qmcpy
Probabilistic Numerics @ Dagstuhl, October 27, 2021


2.–5. Our take on Bayesian Quadrature/Cubature [3–6]

$$\mu := \int_{[0,1]^d} f(x)\, dx = \,?, \qquad \hat{\mu}_n = \text{function of } f(x_1), \ldots, f(x_n), \qquad \text{want } P\bigl[|\mu - \hat{\mu}_n| \leqslant \varepsilon\bigr] \geqslant 99\%$$

Assume we are handed $f \sim \mathcal{GP}(m, s^2 C_\theta)$, so $E[f(x)] = m$ and $E[\{f(t) - m\}\{f(x) - m\}] = s^2 C_\theta(t, x)$
Choose $C_\theta$ and $x_1, x_2, \ldots$ for a fast eigenvector-eigenvalue decomposition of the Gram matrix $\mathsf{C}_\theta = \bigl(C_\theta(x_i, x_j)\bigr)_{i,j=1}^n$; this is a practical choice, rather than informed by $f$ (see also Kanagawa, Schäfer, Wenger)
Sample $f$ at $x_1, \ldots, x_n$
Tune the hyperparameters by empirical Bayes so that $f$ is not an outlier
Construct a credible interval for $\mu$ in terms of the posterior mean $\hat{\mu}_n$
Increase $n$ until the half-width is no greater than $\varepsilon$ (a naive sketch of this loop follows below)
Implement in software: GAIL [1] and QMCPy [2] (see also Naslidnyk, Pleiss, ProbNum)
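To make the recipe concrete, here is a naive $O(n^3)$ sketch of the loop with dense linear algebra; the matched kernels and designs on the following slides reduce the cost to $O(n \log n)$. The one-dimensional rank-1 lattice, the generator $g = 19$, the $r = 1$ Bernoulli kernel with fixed $\eta$, and the toy integrand are illustrative assumptions, and only the mean $m$ and scale $s^2$ are fitted here rather than the full hyperparameter vector.

```python
# A naive sketch of the Bayesian cubature loop above (assumed 1-d lattice and
# fixed Bernoulli kernel with r = 1): double n until the 99% credible
# interval's half-width meets the tolerance. O(n^3) per step.
import numpy as np

def kernel(t, eta=1.0):
    """Slide 7's kernel with r = 1: K(t) = 1 + eta * B_2(t mod 1); integral of K over [0,1] is 1."""
    t = t % 1.0
    return 1.0 + eta * (t**2 - t + 1.0 / 6.0)

def lattice(n, g=19):
    """1-d rank-1 lattice x_i = i*g/n mod 1 (generator g is an assumption)."""
    return (np.arange(n) * g / n) % 1.0

def bayes_cubature(f, eps, n=64, n_max=2**15):
    while n <= n_max:
        x = lattice(n)
        y = f(x)
        C = kernel(x[:, None] - x[None, :])        # Gram matrix of the matched kernel
        c_inv_1 = np.linalg.solve(C, np.ones(n))
        m_hat = (c_inv_1 @ y) / c_inv_1.sum()      # empirical Bayes mean; also E[mu | data]
        r = y - m_hat
        s2_hat = (r @ np.linalg.solve(C, r)) / n   # empirical Bayes scale
        half_width = 2.58 * np.sqrt(s2_hat * max(1.0 - c_inv_1.sum(), 0.0))
        if half_width <= eps:                      # 99% credible interval is tight enough
            return m_hat, half_width, n
        n *= 2                                     # otherwise double the sample size
    raise RuntimeError("sample budget exceeded")

print(bayes_cubature(lambda t: np.cos(2 * np.pi * t) + 2, eps=1e-3))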

6. Low Discrepancy (LD) Sequences for Quasi-Monte Carlo (QMC) Methods
IID sequences have clusters and gaps
[Figure: IID uniform points in $[0,1]^2$ ($x_{i,1}$ vs. $x_{i,2}$) for $n = 64, 128, 256$]

7. Low Discrepancy (LD) Sequences for Quasi-Monte Carlo (QMC) Methods
LD lattice sequences [7–9] are more even than IID and give faster convergence
[Figure: Lattice points in $[0,1]^2$ ($x_{i,1}$ vs. $x_{i,2}$) for $n = 64, 128, 256$]

$$C_\theta(t, x) = K_\theta\bigl((t - x) \bmod 1\bigr), \quad \text{e.g.,} \quad K_\theta(t) = \prod_{k=1}^{d} \bigl[1 + \eta_k (-1)^{r+1} B_{2r}(t_k)\bigr], \qquad \theta = (r, \eta)$$

where $B_{2r}$ denotes the Bernoulli polynomial of degree $2r$.

8. Low Discrepancy (LD) Sequences for Quasi-Monte Carlo (QMC) Methods
LD digital sequences [9–11] (e.g., Sobol’) are more even than IID and give faster convergence
[Figure: Sobol' points in $[0,1]^2$ ($x_{i,1}$ vs. $x_{i,2}$) for $n = 64, 128, 256$]

$$C_\theta(t, x) = K_\theta(t \ominus x), \quad \text{e.g.,} \quad K_\theta(t) = \prod_{k=1}^{d} \bigl[1 + \eta_k\, \omega_r(t_k)\bigr], \qquad \theta = (r, \eta)$$

where $\ominus$ denotes digitwise subtraction and $\omega_r$ has an explicit form.
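As a quick illustration (not on the slides), the three point sets pictured above can be drawn with QMCPy's point-set classes; `gen_samples` is assumed to follow the current QMCPy API.

```python
# Minimal sketch: generate the IID, lattice, and Sobol' point sets shown above.
from qmcpy import IIDStdUniform, Lattice, Sobol

for n in (64, 128, 256):
    pts_iid = IIDStdUniform(dimension=2).gen_samples(n)  # clusters and gaps
    pts_lat = Lattice(dimension=2).gen_samples(n)        # randomly shifted rank-1 lattice
    pts_sob = Sobol(dimension=2).gen_samples(n)          # scrambled digital sequence
    print(n, pts_iid.shape, pts_lat.shape, pts_sob.shape)
```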

9. Eigenvector-Eigenvalue Decomposition of the Gram Matrix [3–6]
For these matched data sites and covariance kernels,

$$\int_{[0,1]^d} C_\theta(t, x)\, dt = 1 \ \ \forall x \in [0,1]^d, \qquad \mathsf{C}_\theta = \bigl(C_\theta(x_i, x_j)\bigr)_{i,j=1}^n = (c_{\theta,1}, \ldots, c_{\theta,n}) = \frac{1}{n} V \Lambda_\theta V^H$$

$V$ is explicit (complex exponential for lattices / Hadamard for digital nets), and its first column is the vector of ones
$b \mapsto \tilde{b} := V^H b$ is an FFT/FWHT costing $O(n \log n)$; $\lambda_\theta := \operatorname{diag}(\Lambda_\theta) = V^H c_{\theta,1} = \tilde{c}_{\theta,1}$
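The following sketch checks this numerically in one dimension: for a shift-invariant kernel sampled on a rank-1 lattice in its natural order, the Gram matrix is circulant, so its eigenvalues are the FFT of its first column. The generator, kernel, and parameter values are illustrative assumptions.

```python
# Eigenvalues of the Gram matrix in O(n log n): FFT of the first column.
import numpy as np

n, g, eta = 64, 19, 1.0                      # lattice size, generator, kernel weight (assumed)
x = (np.arange(n) * g / n) % 1.0             # 1-d rank-1 lattice, natural order
K = lambda t: 1.0 + eta * ((t % 1.0)**2 - (t % 1.0) + 1.0 / 6.0)  # r = 1 Bernoulli kernel

C = K(x[:, None] - x[None, :])               # circulant (hence symmetric) Gram matrix
lam_fft = np.fft.fft(C[:, 0]).real           # eigenvalues via FFT of the first column
lam_dense = np.linalg.eigvalsh(C)            # O(n^3) reference computation
assert np.allclose(np.sort(lam_fft), np.sort(lam_dense))
print(lam_fft[:4])
```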


10.–11. Tuning Hyperparameters in $\mathcal{GP}(m, s^2 C_\theta)$ and the Credible Interval [3–6]

For these matched data sites and covariance kernels, the decomposition above leads to the approximation and credible interval

$$\hat{\mu} = \frac{\tilde{y}_1}{n} = \frac{1}{n} \sum_{i=1}^{n} f(x_i) = \text{sample mean}, \qquad y = \bigl(f(x_1), \ldots, f(x_n)\bigr)^T, \quad \tilde{y} = V^H y$$

$$P\bigl[\,|\mu - \hat{\mu}| \leqslant \text{err}\,\bigr] = 99\%, \qquad \text{err} = \frac{2.58}{n} \sqrt{\Biggl(\sum_{i=2}^{n} \frac{|\tilde{y}_i|^2}{\lambda_{\theta,i}}\Biggr)\Bigl(1 - \frac{n}{\lambda_{\theta,1}}\Bigr)}$$

with the hyperparameters tuned by empirical Bayes:

$$\theta_{\mathrm{opt}} = \operatorname*{argmin}_{\theta}\; \Biggl[\log \Biggl(\sum_{i=2}^{n} \frac{|\tilde{y}_i|^2}{\lambda_{\theta,i}}\Biggr) + \frac{1}{n} \sum_{i=1}^{n} \log(\lambda_{\theta,i})\Biggr]$$
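Continuing the one-dimensional sketch, the posterior mean and half-width above can be evaluated entirely with FFTs. Here $\theta$ is held fixed rather than optimized by empirical Bayes, and the lattice, kernel, and toy integrand are the same illustrative assumptions as before.

```python
# Posterior mean and 99% credible half-width via FFTs (fixed theta, 1-d sketch).
import numpy as np

n, g, eta = 64, 19, 1.0
x = (np.arange(n) * g / n) % 1.0
K = lambda t: 1.0 + eta * ((t % 1.0)**2 - (t % 1.0) + 1.0 / 6.0)

lam = np.fft.fft(K(x)).real                       # lambda_theta = FFT of Gram matrix's first column
y_tilde = np.fft.fft(np.cos(2 * np.pi * x) + 2)   # y_tilde = V^H y for a toy integrand (true mu = 2)

mu_hat = y_tilde[0].real / n                      # posterior mean = sample mean
err = (2.58 / n) * np.sqrt(
    np.sum(np.abs(y_tilde[1:]) ** 2 / lam[1:]) * (1.0 - n / lam[0]))
print(mu_hat, err)                                # mu_hat should be within err of 2
```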

12. Our Bayesian Stopping Criteria in Software

Our research on stopping criteria (Bayesian and otherwise) has been implemented in
GAIL [1], a MATLAB library (born in 2013)
QMCPy [2], a Python library (born in 2019)
We aim to provide software that
Includes the best—mostly quasi-Monte Carlo (QMC)—algorithms available
Is well-tested
Is an easy on-ramp for practitioners
Provides use cases for algorithm developers
Plays well with other packages

13. Keister Example in QMCPy (computations in Google Colab notebook here)
$$\mu = \int_{\mathbb{R}^d} \cos(\|t\|)\, \exp(-\|t\|^2)\, dt = \int_{[0,1]^d} f(x)\, dx = \,? \qquad \text{Keister [12]}$$
```python
from qmcpy import Lattice, Keister, CubBayesLatticeG

d = 5; tol = 1e-2
lattice = Lattice(dimension=d, order="linear")
integrand = Keister(lattice)
stop_crit = CubBayesLatticeG(integrand, abs_tol=tol, ptransform="none")
solution, data = stop_crit.integrate()
print(data)
```
```
solution        1.136
error_bound     0.009
n_total         1024
time_integrate  0.725
abs_tol         0.010
rel_tol         0
n_init          2^(8)
n_max           2^(22)
```
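A digital-net variant of the same example (reported later on the Slow Speed slide) only swaps the point set and stopping criterion. This sketch assumes QMCPy's `CubBayesNetG` class; check the current API.

```python
# Minimal sketch: the Sobol'/digital-net analogue of the lattice example above.
from qmcpy import Sobol, Keister, CubBayesNetG

integrand = Keister(Sobol(dimension=5))
stop_crit = CubBayesNetG(integrand, abs_tol=1e-2)  # Walsh-kernel Bayesian cubature
solution, data = stop_crit.integrate()
print(data)
```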

14. Asian Option Pricing Example in QMCPy (computations here)
$$\text{option price} = \mu = \int_{\mathbb{R}^d} \text{payoff}(z)\, \text{(discrete BM PDF)}(z)\, dz = \int_{[0,1]^d} f(x)\, dx = \,? \qquad \text{Glasserman [13]}$$
```python
from qmcpy import Lattice, AsianOption, CubBayesLatticeG

volatility = .5; interest_rate = .01
start_price = 30; strike_price = 25
t_final = 1; call_put = "call"
mean_type = "arithmetic"
d = 13; abs_tol = .01

integrand = AsianOption(Lattice(d, order="linear"), volatility, start_price,
                        strike_price, interest_rate, t_final, call_put, mean_type)
stop_crit = CubBayesLatticeG(integrand, abs_tol=abs_tol, ptransform="Baker")
stop_crit.one_theta = True  # a single kernel hyperparameter across dimensions (cf. the Slow Speed slide)
solution, data = stop_crit.integrate()
print(data)
```
```
solution        6.261
error_bound     0.004
n_total         2048
time_integrate  0.128
abs_tol         0.010
time_vec        [0.077 0.154 0.231 ... 0.846 0.923 1.   ]
covariance      [[0.077 0.077 0.077 ... 0.077 0.077 0.077]
                 [0.077 0.154 0.154 ... 0.154 0.154 0.154]
                 [0.077 0.154 0.231 ... 0.231 0.231 0.231]
                 ...]
decomp_type     PCA
```

15. Slow Speed (computations here)
| Example | Stopping Criterion | d | Absolute Tolerance | n | Time |
|---|---|---|---|---|---|
| Keister | Bayesian Lattice [4] | 5 | 0.01 | 2^10 | 0.725 s |
| Keister | Bayesian Sobol' [6] | 5 | 0.01 | 2^12 | 0.102 s |
| Keister | Track Fourier Coef. Lattice [14] | 5 | 0.01 | 2^13 | 0.024 s |
| Keister | Track Fourier Coef. Sobol' [15] | 5 | 0.01 | 2^13 | 0.012 s |
| Option Pricing | Bayesian Lattice [4] | 13 | 0.01 | 2^11 | 0.128 s |
| Option Pricing | Track Fourier Coef. Lattice [14] | 13 | 0.01 | 2^12 | 0.034 s |
The computational load is in optimizing the kernel hyperparameters
May need to reduce the number of hyperparameters to ensure a correct answer
Inefficient code
Speed may not be a problem if function evaluation is expensive (a timing sketch for reproducing one table row follows below)
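One table row can be reproduced with a sketch like the following; timings vary by machine, and `data.n_total` is assumed to match the fields in QMCPy's printed output.

```python
# Minimal timing sketch for the Keister / Bayesian-lattice row of the table.
from time import time
from qmcpy import Lattice, Keister, CubBayesLatticeG

integrand = Keister(Lattice(dimension=5, order="linear"))
stop_crit = CubBayesLatticeG(integrand, abs_tol=0.01, ptransform="none")
t0 = time()
solution, data = stop_crit.integrate()
print(data.n_total, round(time() - t0, 3), "s")
```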

16. Checking the Gaussian Process Assumption
If $\bigl(f(x_1), \ldots, f(x_n)\bigr)^T$ has a known multivariate Gaussian distribution, then the correct transformation of this data has a standard univariate Gaussian distribution.
Here are QQ plots for the Keister example with different randomized lattice data sites and $n = 64$ (a sketch of the idea follows below).
[Figure: Three QQ plots of data quantiles vs. standard Gaussian quantiles, each with $d = 3$, $n = 64$, and $(r_{\mathrm{opt}}, \theta_{\mathrm{opt}}) = (5.28, 0.89)$, $(5.10, 0.93)$, $(5.18, 0.95)$]

Should we be concerned?
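A minimal sketch of the idea behind these QQ plots (not the talk's exact transformation): if $y \sim \mathcal{N}(m\mathbf{1}, s^2 \mathsf{C})$, then whitening $y$ with a Cholesky factor of $\mathsf{C}$ should produce approximately iid Gaussians; a normality test stands in for the plot here. The lattice, kernel, and toy integrand are the same illustrative assumptions as in the earlier sketches.

```python
# Whiten the data under the fitted GP and test the Gaussian assumption.
import numpy as np
from scipy import linalg, stats

n, g, eta = 64, 19, 1.0
x = (np.arange(n) * g / n) % 1.0
K = lambda t: 1.0 + eta * ((t % 1.0)**2 - (t % 1.0) + 1.0 / 6.0)
C = K(x[:, None] - x[None, :])                      # Gram matrix

y = np.cos(2 * np.pi * x) + 2                       # toy integrand values
L = np.linalg.cholesky(C)                           # C = L L^T
z = linalg.solve_triangular(L, y - y.mean(), lower=True)
z /= z.std()                                        # standardized; ideally ~ iid N(0, 1)
print(stats.shapiro(z))                             # small p-value => GP model suspect
```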

17. Summary
Bayesian cubature using credible intervals as stopping criteria often gives correct answers
Matching sampling sites (designs) with covariance kernels expedites the computation
Challenges of speeding up the computation and finding the right kernels or hyperparameters still remain

18. Thank you
These slides are available at
speakerdeck.com/fjhickernell/dagstuhl-bayesian-cubature-ld-qmcpy

19. References I
1. Choi, S.-C. T. et al. GAIL: Guaranteed Automatic Integration Library (Versions 1.0–2.3.2).
MATLAB software, http://gailgithub.github.io/GAIL_Dev/. 2021.
2. Choi, S.-C. T., Hickernell, F. J., Jagadeeswaran, R., McCourt, M. & Sorokin, A. QMCPy: A Quasi-Monte Carlo Python Library. 2020. https://qmcsoftware.github.io/QMCSoftware/.
3. Hickernell, F. J. & Jagadeeswaran, R. Comment on “Probabilistic Integration: A Role in Statistical Computation?” Statist. Sci. 34, 23–28 (2019).
4. Jagadeeswaran, R. & Hickernell, F. J. Fast Automatic Bayesian Cubature Using Lattice Sampling. Stat. Comput. 29, 1215–1229 (2019).
5. Jagadeeswaran, R. Fast Automatic Bayesian Cubature Using Matching Kernels and Designs. PhD thesis (Illinois Institute of Technology, 2019).

20. References II
6. Jagadeeswaran, R. & Hickernell, F. J. Fast Automatic Bayesian Cubature Using Sobol’ Sampling. In preparation for submission for publication. 2021+.
7. Sloan, I. H. & Joe, S. Lattice Methods for Multiple Integration. (Oxford University Press,
Oxford, 1994).
8. Lemieux, C. Monte Carlo and Quasi-Monte Carlo Sampling. (Springer Science+Business Media, Inc., New York, 2009).
9. Dick, J., Kuo, F. & Sloan, I. H. High dimensional integration — the Quasi-Monte Carlo way.
Acta Numer. 22, 133–288 (2013).
10. Niederreiter, H. Random Number Generation and Quasi-Monte Carlo Methods. (SIAM, Philadelphia, 1992).
11. Dick, J. & Pillichshammer, F. Digital Nets and Sequences: Discrepancy Theory and
Quasi-Monte Carlo Integration. (Cambridge University Press, Cambridge, 2010).

21. References III
12. Keister, B. D. Multidimensional Quadrature Algorithms. Computers in Physics 10, 119–122
(1996).
13. Glasserman, P. Monte Carlo Methods in Financial Engineering. (Springer-Verlag, New
York, 2004).
14. Jiménez Rugama, L. A. & Hickernell, F. J. Adaptive Multidimensional Integration Based on Rank-1 Lattices. In Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Leuven, Belgium, April 2014 (eds Cools, R. & Nuyens, D.) vol. 163, 407–422. arXiv:1411.1966 (Springer-Verlag, Berlin, 2016).
15. Hickernell, F. J. & Jiménez Rugama, L. A. Reliable Adaptive Cubature Using Digital Sequences. In Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Leuven, Belgium, April 2014 (eds Cools, R. & Nuyens, D.) vol. 163, 367–383. arXiv:1410.8615 [math.NA] (Springer-Verlag, Berlin, 2016).