Hickernell Department of Applied Mathematics Center for Interdisciplinary Scientific Computation Illinois Institute of Technology [email protected] mypages.iit.edu/~hickernell with Jagadeeswaran R. and the GAIL team partially supported by NSF-DMS-1522687 and NSF-DMS-1638521 (SAMSI) Thanks to the special session organizers SIAM-UQ 2020, March 2020
References Fast Bayesian Transforms in General Find a kernel Cθ to match the design X so that Cθ = 1 n VΛθ VH, VH = nV−1, V = (v1 , . . . , vn )T = (V1 , . . . , Vn ) known analytically v1 = V1 = 1, cθ = 1, b := VHb requires only O(n log(n)) operations ∀b. Cθ is a fast Bayesian transform kernel and b → VHb a fast Bayesian transform (FBT) Then by empirical Bayes y = FBT of function data y, λθ = diag(Λθ) = (λθ,1 , . . . , λθ,n )T = Cθ,1 = FBT of first column of Cθ θEB = argmin θ log n i=2 |yi |2 λθ,i + 1 n n i=1 log(λθ,i ) ALG(f, ε) = y1 n = 1 n n i=1 yi = sample mean when 2.58 n n i=2 |yi |2 λθ,i 1 − n λθ,1 ε Cost is O(n log(n)) times the number of iterations for optimizing θ 5/15
References Flexible FBT Kernel for Lattice Node Designs Cθ(t, x) = Kθ(x t) must be positive definite, where ⊕ = addition mod1. Common example is Kθ(x) = d j=1 [1 + a B2r (xj )] = d j=1 [1 + aκr (xj )], θ = (a, r) ∈ (0, ∞) × (1, ∞), r = 2r κr (x) := |k| 1 exp(2π √ −1kx) |k|r B2r closed form, but r ∈ N; κr defined for r > 1, but infinite sum 7/15
References Flexible FBT Kernel for Lattice Node Designs Cθ(t, x) = Kθ(x t) must be positive definite, where ⊕ = addition mod1. Common example is Kθ(x) = d j=1 [1 + a B2r (xj )] = d j=1 [1 + aκr (xj )], θ = (a, r) ∈ (0, ∞) × (1, ∞), r = 2r κr (x) := |k| 1 exp(2π √ −1kx) |k|r B2r closed form, but r ∈ N; κr defined for r > 1, but infinite sum But all we need to compute the integral and credible interval is y and λθ = Cθ,1 , where Cθ,1 = Kθ(xi x1 ) n i=1 , which only depends on κr ( /n) for = 0, . . . , n − 1 7/15
References Flexible FBT Kernel for Lattice Node Designs Cθ(t, x) = Kθ(x t) must be positive definite, where ⊕ = addition mod1. Common example is Kθ(x) = d j=1 [1 + a B2r (xj )] = d j=1 [1 + aκr (xj )], θ = (a, r) ∈ (0, ∞) × (1, ∞), r = 2r κr (x) := |k| 1 exp(2π √ −1kx) |k|r B2r closed form, but r ∈ N; κr defined for r > 1, but infinite sum But all we need to compute the integral and credible interval is y and λθ = Cθ,1 , where Cθ,1 = Kθ(xi x1 ) n i=1 , which only depends on κr ( /n) for = 0, . . . , n − 1 Moreover, κr ( /n) n−1 =0 can be computed using one FFT details 7/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian Random function fitting kernel with given r, θ f(x) = f0 + k∈{1,...,N}d fc (k) cos(2πkTx) + fs (k) sin(2πkTx) f0 , fc (k), fs (k) IID ∼ N 0, a k 0 bd− k 0 kj=0 k−r j θ = a/b, N = 256 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian Random function fitting kernel with given r, θ f(x) = f0 + k∈{1,...,N}d fc (k) cos(2πkTx) + fs (k) sin(2πkTx) f0 , fc (k), fs (k) IID ∼ N 0, a k 0 bd− k 0 kj=0 k−r j θ = a/b, N = 256 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian Random function fitting kernel with given r, θ f(x) = f0 + k∈{1,...,N}d fc (k) cos(2πkTx) + fs (k) sin(2πkTx) f0 , fc (k), fs (k) IID ∼ N 0, a k 0 bd− k 0 kj=0 k−r j θ = a/b, N = 256 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian Random function fitting kernel with given r, θ f(x) = f0 + k∈{1,...,N}d fc (k) cos(2πkTx) + fs (k) sin(2πkTx) f0 , fc (k), fs (k) IID ∼ N 0, a k 0 bd− k 0 kj=0 k−r j θ = a/b, N = 256 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian Random function fitting kernel with given r, θ f(x) = f0 + k∈{1,...,N}d fc (k) cos(2πkTx) + fs (k) sin(2πkTx) f0 , fc (k), fs (k) IID ∼ N 0, a k 0 bd− k 0 kj=0 k−r j θ = a/b, N = 256 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian Random function fitting kernel with given r, θ f(x) = f0 + k∈{1,...,N}d fc (k) cos(2πkTx) + fs (k) sin(2πkTx) f0 , fc (k), fs (k) IID ∼ N 0, a k 0 bd− k 0 kj=0 k−r j θ = a/b, N = 256 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian Random function fitting kernel with given r, θ f(x) = f0 + k∈{1,...,N}d fc (k) cos(2πkTx) + fs (k) sin(2πkTx) f0 , fc (k), fs (k) IID ∼ N 0, a k 0 bd− k 0 kj=0 k−r j θ = a/b, N = 256 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian f from Keister’s example True r and θ unknown 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian f from Keister’s example True r and θ unknown 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian f from Keister’s example True r and θ unknown 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian f from Keister’s example True r and θ unknown 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian f from Keister’s example True r and θ unknown 10/15
References Is f a Typical Instance of a Gaussian Process? If f ∈ GP(m, s2Cθ), then Z := s−1 n−1VΛ−1/2 θ VH C−1/2 θ (f − m1) ∼ N(0, I). Generate the data z = 1 nsEB VΛ−1/2 θEB VH(y − mEB 1) = n i=2 |yi |2 λθEB ,i −1 VΛ−1/2 θEB (y − y1 e1 ) Q-Q plots of z vs. standard Gaussian f from Keister’s example True r and θ unknown 10/15
References The Successes of Bayesian Cubature f Credible intervals lead to adaptive cubature algorithms that provide answers to the desired tolerances Low discrepancy sampling provides cubatures requiring fewer samples that IID sampling Empirical Bayes and other methods can infer reasonable Gaussian process parameters for a family of covairance kernels—choosing a reasonable urn for f Covariance kernels that match the low discrepancy sampling facilitate fast Bayeian transforms for fitting the Gaussian process parameters, computing the cubature, and constructing the credible intervals Extra effort provides a richer family of covariance kernels 11/15
References The Challenges of Bayesian Cubature f Inferring the Gaussian process parameters and allowing a larger family of kernels takes extra computational time Overfitting as well as underfitting the kernel are possible The Gaussian assumption is not always justified by the data Which periodizing transformations of f are appropriate for lattice sampling and its matching kernels? 12/15
References References Jagadeeswaran, R. & H., F. J. Fast Automatic Bayesian Cubature Using Lattice Sampling. Stat. Comput. 29, 1215–1229 (2019). Choi, S.-C. T. et al. GAIL: Guaranteed Automatic Integration Library (Versions 1.0–2.2). MATLAB software. 2013–2017. http://gailgithub.github.io/GAIL_Dev/. Choi, S.-C. T., H., F. J., McCourt, M. & Sorokin, A. QMCPy: A quasi-Monte Carlo Python Library. 2020+. https://github.com/QMCSoftware/QMCSoftware. Keister, B. D. Multidimensional Quadrature Algorithms. Computers in Physics 10, 119–122 (1996). Genz, A. Comparison of Methods for the Computation of Multivariate Normal Probabilities. Computing Science and Statistics 25, 400–405 (1993). 14/15