Dagstuhl Bayesian Cubature LD QMCPy

Fred J. Hickernell

October 27, 2021

Transcript

  1. Bayesian Cubature with Low Discrepancy Sequences in QMCPy
    Fred J. Hickernell, Jagadeeswaran Rathinavel & Aleksei Sorokin
    Department of Applied Mathematics Center for Interdisciplinary Scientific Computation
    Illinois Institute of Technology [email protected] mypages.iit.edu/~hickernell
    Thanks to the GAIL and QMCPy teams
    Thanks to the organizers, especially during these unusual times
    Disclosure: I am multicultural (computational mathematician & statistician)
    Slides at speakerdeck.com/fjhickernell/dagstuhl-bayesian-cubature-ld-qmcpy
    Probabilistic Numerics @ Dagstuhl, October 27, 2021


  2. Our take on Bayesian Quadrature/Cubature [3–6]
    $\mu := \int_{[0,1]^d} f(x)\,dx = {}?$, $\hat\mu_n$ = a function of $f(x_1), \dots, f(x_n)$; want $P[|\mu - \hat\mu_n| \leqslant \varepsilon] \geqslant 99\%$
    Assume we are handed $f \sim \mathcal{GP}(m, s^2 C_\theta)$, so $E[f(x)] = m$ and $E[\{f(t) - m\}\{f(x) - m\}] = s^2 C_\theta(t, x)$
    Choose $C_\theta$ and $x_1, x_2, \dots$ for a fast eigenvector-eigenvalue decomposition of the Gram matrix $\mathsf{C}_\theta = \bigl(C_\theta(x_i, x_j)\bigr)_{i,j=1}^n$; this is a practical choice, rather than informed by $f$ (see also Kanagawa, Schäfer, Wenger)
    Sample $f$ at $x_1, \dots, x_n$
    Tune the hyperparameters by empirical Bayes so that $f$ is not an outlier
    Construct a credible interval for $\mu$ in terms of the posterior mean $\hat\mu_n$
    Increase $n$ until the half-width is no greater than $\varepsilon$ (a sketch of this loop appears below)
    Implement in software: GAIL [1] and QMCPy [2] (see also Naslidnyk, Pleiss, ProbNum)
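
    A minimal sketch of the adaptive loop above, assuming hypothetical stand-ins gen_points, tune_theta, and half_width for the matched-design generator, the empirical Bayes tuning, and the credible-interval half-width; this is not QMCPy's implementation:

      # Sketch of the adaptive Bayesian cubature loop described above.
      # gen_points, tune_theta, and half_width are hypothetical stand-ins,
      # not QMCPy's internals.
      def bayesian_cubature(f, gen_points, tune_theta, half_width, eps,
                            n_init=2**8, n_max=2**22):
          n = n_init
          while n <= n_max:
              x = gen_points(n)                   # first n points of the LD sequence
              y = f(x)                            # sample f at x_1, ..., x_n
              theta = tune_theta(x, y)            # empirical Bayes hyperparameters
              mu_hat = y.mean()                   # posterior mean = sample mean here
              if half_width(x, y, theta) <= eps:  # credible-interval half-width
                  return mu_hat
              n *= 2                              # double n and try again
          raise RuntimeError("n_max reached without meeting the tolerance")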

  3. Low Discrepancy (LD) Sequences for Quasi-Monte Carlo (QMC) Methods
    IID sequences have clusters and gaps
    [Figure: scatter plots of IID uniform points $(x_{i,1}, x_{i,2})$ in the unit square for n = 64, 128, 256]

  4. Low Discrepancy (LD) Sequences for Quasi-Monte Carlo (QMC) Methods
    LD lattice sequences [7–9] are more even than IID and give faster convergence
    [Figure: scatter plots of lattice points $(x_{i,1}, x_{i,2})$ in the unit square for n = 64, 128, 256]
    $C_\theta(t, x) = K_\theta\bigl((t - x) \bmod 1\bigr)$, e.g., $K_\theta(t) = \prod_{k=1}^{d} \bigl[1 + \eta_k (-1)^{r+1} B_{2r}(t_k)\bigr]$, $\theta = (r, \eta)$, where $B_{2r}$ is the Bernoulli polynomial of degree $2r$

  5. Low Discrepancy (LD) Sequences for Quasi-Monte Carlo (QMC) Methods
    LD digital sequences [9–11] (e.g., Sobol') are more even than IID and give faster convergence
    [Figure: scatter plots of Sobol' points $(x_{i,1}, x_{i,2})$ in the unit square for n = 64, 128, 256]
    $C_\theta(t, x) = K_\theta(t \ominus x)$, where $\ominus$ is digitwise subtraction, e.g., $K_\theta(t) = \prod_{k=1}^{d} \bigl[1 + \eta_k \, \omega_r(t_k)\bigr]$ with $\omega_r$ explicit, $\theta = (r, \eta)$
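
    To make the three point sets above concrete, here is a minimal sketch of generating them with QMCPy; the class names are QMCPy's, but constructor details may vary across versions:

      from qmcpy import IIDStdUniform, Lattice, Sobol

      # Generate the three kinds of point sets plotted above (n = 64 here).
      n, d = 64, 2
      x_iid = IIDStdUniform(dimension=d, seed=7).gen_samples(n)  # clusters and gaps
      x_lat = Lattice(dimension=d, seed=7).gen_samples(n)        # randomized lattice
      x_sob = Sobol(dimension=d, seed=7).gen_samples(n)          # randomized Sobol'
      print(x_lat[:4])                                           # first four points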

  6. Eigenvector-Eigenvalue Decomposition of the Gram Matrix [3–6]
    For these matched data sites and covariance kernels,
    $\int_{[0,1]^d} C_\theta(t, x)\,dt = 1 \ \ \forall x \in [0,1]^d, \qquad \mathsf{C}_\theta = \bigl(C_\theta(x_i, x_j)\bigr)_{i,j=1}^n = (C_{\theta,1}, \dots, C_{\theta,n}) = \frac{1}{n} V \Lambda_\theta V^H$
    $V$ explicit (complex exponential/Hadamard), $V\mathbf{1} = \mathbf{1}$
    $b \mapsto \tilde b := V^H b$ is an FFT/FWHT, $O(n \log n)$; $\lambda_\theta := \mathrm{diag}(\Lambda_\theta) = V^H C_{\theta,1} = \tilde C_{\theta,1}$, where $C_{\theta,1}$ is the first column of $\mathsf{C}_\theta$

  7. Tuning Hyperparameters in $\mathcal{GP}(m, s^2 C_\theta)$ and the Credible Interval [3–6]
    For these matched data sites and covariance kernels, the decomposition on the previous slide leads to the approximation and credible interval:
    $\hat\mu = \frac{\tilde y_1}{n} = \frac{1}{n}\sum_{i=1}^n f(x_i) = $ the sample mean, where $y = \bigl(f(x_1), \dots, f(x_n)\bigr)^T$ and $\tilde y = V^H y$
    $P\bigl[|\mu - \hat\mu| \leqslant \mathrm{err}\bigr] = 99\%, \qquad \mathrm{err} = \frac{2.58}{n}\sqrt{\Bigl(1 - \frac{n}{\lambda_{\theta,1}}\Bigr)\sum_{i=2}^n \frac{|\tilde y_i|^2}{\lambda_{\theta,i}}}$
    $\theta_{\mathrm{opt}} = \operatorname{argmin}_\theta \Bigl[\log\Bigl(\sum_{i=2}^n \frac{|\tilde y_i|^2}{\lambda_{\theta,i}}\Bigr) + \frac{1}{n}\sum_{i=1}^n \log(\lambda_{\theta,i})\Bigr]$
    (A numerical sketch of these formulas follows below.)
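
    To make these formulas concrete, here is a self-contained numerical sketch (plain NumPy, not QMCPy's implementation) for a rank-1 lattice in natural order with the $r = 1$ Bernoulli kernel, where the Gram matrix is circulant and hence diagonalized by the FFT; the generating vector and test integrand are arbitrary choices for illustration:

      import numpy as np

      # Rank-1 lattice in natural order: the Gram matrix of a shift-invariant
      # kernel is circulant, so V^H b is a plain FFT.
      n, d = 64, 2
      g = np.array([1, 27])                    # hypothetical generating vector
      x = (np.outer(np.arange(n), g) % n) / n  # lattice points in [0,1)^d

      eta, r = 1.0, 1                          # hyperparameters theta = (r, eta)
      B2 = lambda t: t**2 - t + 1/6            # Bernoulli polynomial B_2
      K = lambda t: np.prod(1 + eta * (-1)**(r + 1) * B2(t), axis=-1)

      f = lambda x: np.cos(2 * np.pi * x).sum(axis=-1) + 2.0  # toy integrand, mu = 2
      y = f(x)

      lam = np.fft.fft(K((x - x[0]) % 1)).real  # lambda_theta = V^H C_{theta,1}
      y_t = np.fft.fft(y)                       # tilde y = V^H y

      mu_hat = y_t[0].real / n                  # posterior mean = sample mean
      err = 2.58 / n * np.sqrt((1 - n / lam[0]) * np.sum(np.abs(y_t[1:])**2 / lam[1:]))

      def eb_objective(lam, y_t):               # empirical Bayes objective above,
          return (np.log(np.sum(np.abs(y_t[1:])**2 / lam[1:]))  # to be minimized
                  + np.log(lam).mean())                         # over theta

      print(mu_hat, err)                        # mu_hat ~ 2, err = half-width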

  8. Our Bayesian Stopping Criteria in Software
    Our research on stopping criteria (Bayesian and otherwise) has been implemented in
    GAIL [1], a MATLAB library (born in 2013)
    QMCPy [2], a Python library (born in 2019)
    We aim to provide software that
    Includes the best—mostly quasi-Monte Carlo (QMC)—algorithms available
    Is well-tested
    Is an easy on-ramp for practitioners
    Provides use cases for algorithm developers
    Plays well with other packages

  9. Keister Example in QMCPy (computations in a Google Colab notebook here)
    $\mu = \int_{\mathbb{R}^d} \cos(\|t\|) \exp(-\|t\|^2)\,dt = \int_{[0,1]^d} f(x)\,dx = {}?$ Keister [12]

      from qmcpy import Lattice, Keister, CubBayesLatticeG

      d = 5; tol = 1e-2
      lattice = Lattice(dimension=d, order="linear")
      integrand = Keister(lattice)
      stop_crit = CubBayesLatticeG(integrand, abs_tol=tol, ptransform="none")
      solution, data = stop_crit.integrate()
      print(data)

    Output (abridged):
      solution        1.136
      error_bound     0.009
      n_total         1024
      time_integrate  0.725
      abs_tol         0.010
      rel_tol         0
      n_init          2^(8)
      n_max           2^(22)

  10. Asian Option Pricing Example in QMCPy (computations here)
    option price = $\mu = \int_{\mathbb{R}^d} \mathrm{payoff}(z) \cdot \text{discrete BM PDF}(z)\,dz = \int_{[0,1]^d} f(x)\,dx = {}?$ Glasserman [13]

      from qmcpy import Lattice, AsianOption, CubBayesLatticeG

      volatility = .5; interest_rate = .01
      start_price = 30; strike_price = 25
      t_final = 1; call_put = "call"
      mean_type = "arithmetic"
      d = 13; abs_tol = .01
      integrand = AsianOption(Lattice(d, order="linear"), volatility,
          start_price, strike_price, interest_rate, t_final, call_put, mean_type)
      stop_crit = CubBayesLatticeG(integrand, abs_tol=abs_tol, ptransform="Baker")
      stop_crit.one_theta = True
      solution, data = stop_crit.integrate()
      print(data)

    Output (abridged):
      solution        6.261
      error_bound     0.004
      n_total         2048
      time_integrate  0.128
      abs_tol         0.010
      time_vec        [0.077 0.154 0.231 ... 0.846 0.923 1.   ]
      covariance      [[0.077 0.077 0.077 ... 0.077 0.077 0.077]
                       [0.077 0.154 0.154 ... 0.154 0.154 0.154]
                       [0.077 0.154 0.231 ... 0.231 0.231 0.231]
                       ...
      decomp_type     PCA

  11. Slow Speed (computations here)
    Example          Stopping Criterion                 d   Abs. Tol.  n     Time
    Keister          Bayesian Lattice [4]               5   0.01       2^10  0.725 s
    Keister          Bayesian Sobol' [6]                5   0.01       2^12  0.102 s
    Keister          Track Fourier Coef. Lattice [14]   5   0.01       2^13  0.024 s
    Keister          Track Fourier Coef. Sobol' [15]    5   0.01       2^13  0.012 s
    Option Pricing   Bayesian Lattice [4]               13  0.01       2^11  0.128 s
    Option Pricing   Track Fourier Coef. Lattice [14]   13  0.01       2^12  0.034 s
    The computational load lies in optimizing the kernel hyperparameters
    May need to reduce the number of hyperparameters to ensure a correct answer
    Inefficient code
    Speed may not be a problem if function evaluation is expensive
    (A sketch comparing two of these stopping criteria follows below.)
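
    A minimal sketch comparing two of the stopping criteria in the table on the same Keister integrand; CubQMCSobolG is QMCPy's digital-sequence criterion of [15], and the argument names are believed correct but may vary by QMCPy version:

      from qmcpy import Lattice, Sobol, Keister, CubBayesLatticeG, CubQMCSobolG

      # Run the Bayesian lattice criterion [4] and the Fourier-coefficient
      # tracking Sobol' criterion [15] on the Keister integrand with d = 5.
      d, tol = 5, 1e-2
      for stop_crit in [
              CubBayesLatticeG(Keister(Lattice(d, order="linear")), abs_tol=tol),
              CubQMCSobolG(Keister(Sobol(d)), abs_tol=tol)]:
          solution, data = stop_crit.integrate()
          print(type(stop_crit).__name__, solution, data.time_integrate)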

  12. Checking the Gaussian Process Assumption
    If $\bigl(f(x_1), \dots, f(x_n)\bigr)^T$ has a known multivariate Gaussian distribution, then the correctly transformed data follow a standard univariate Gaussian distribution (a sketch of this check follows below)
    Here are QQ plots for the Keister example with different randomized lattice data sites and n = 64
    [Figure: three QQ plots of data quantiles vs. standard Gaussian quantiles, each with d = 3, n = 64, and (r_opt, θ_opt) = (5.28, 0.89), (5.10, 0.93), (5.18, 0.95)]
    Should we be concerned?
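
    A minimal sketch of this check: draw data from an assumed GP, whiten it with the Cholesky factor of the Gram matrix, and QQ-plot the result against a standard Gaussian; the kernel and data here are synthetic stand-ins rather than the Keister setup:

      import numpy as np
      from scipy import stats
      import matplotlib.pyplot as plt

      rng = np.random.default_rng(7)
      n = 64
      x = np.sort(rng.random(n))
      # Synthetic Gram matrix (squared-exponential kernel, jitter for stability)
      C = np.exp(-30 * (x[:, None] - x[None, :])**2) + 1e-8 * np.eye(n)
      L = np.linalg.cholesky(C)                  # C = L L^T
      m, s = 0.0, 1.0
      y = m + s * L @ rng.standard_normal(n)     # y ~ N(m 1, s^2 C) by construction

      # If the GP assumption holds, the whitened data is iid standard Gaussian
      z = np.linalg.solve(L, y - m) / s
      stats.probplot(z, dist="norm", plot=plt)   # QQ plot vs. standard Gaussian
      plt.show()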

  13. Summary
    Bayesian cubature using credible intervals as stopping criteria often gives correct answers
    Matching sampling sites (designs) with covariance kernels expedites the computation
    Tuning the hyperparameters can add significant computational load
    Challenges remain in speeding up the computation and finding the right kernels or hyperparameters

  14. Thank you
    These slides are available at
    speakerdeck.com/fjhickernell/dagstuhl-bayesian-cubature-ld-qmcpy

  15. References I
    1. Choi, S.-C. T. et al. GAIL: Guaranteed Automatic Integration Library (Versions 1.0–2.3.2). MATLAB software, http://gailgithub.github.io/GAIL_Dev/. 2021.
    2. Choi, S.-C. T., Hickernell, F. J., Jagadeeswaran, R., McCourt, M. & Sorokin, A. QMCPy: A quasi-Monte Carlo Python Library. 2020. https://qmcsoftware.github.io/QMCSoftware/.
    3. Hickernell, F. J. & Jagadeeswaran, R. Comment on "Probabilistic Integration: A Role in Statistical Computation?" Statist. Sci. 34, 23–28 (2019).
    4. Jagadeeswaran, R. & Hickernell, F. J. Fast Automatic Bayesian Cubature Using Lattice Sampling. Stat. Comput. 29, 1215–1229 (2019).
    5. Jagadeeswaran, R. Fast Automatic Bayesian Cubature Using Matching Kernels and Designs. PhD thesis (Illinois Institute of Technology, 2019).

  16. References II
    6. Jagadeeswaran, R. & Hickernell, F. J. Fast Automatic Bayesian Cubature Using Sobol' Sampling. In preparation for publication. 2021+.
    7. Sloan, I. H. & Joe, S. Lattice Methods for Multiple Integration (Oxford University Press, Oxford, 1994).
    8. Lemieux, C. Monte Carlo and quasi-Monte Carlo Sampling (Springer Science+Business Media, Inc., New York, 2009).
    9. Dick, J., Kuo, F. & Sloan, I. H. High-dimensional integration: the quasi-Monte Carlo way. Acta Numer. 22, 133–288 (2013).
    10. Niederreiter, H. Random Number Generation and Quasi-Monte Carlo Methods (SIAM, Philadelphia, 1992).
    11. Dick, J. & Pillichshammer, F. Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte Carlo Integration (Cambridge University Press, Cambridge, 2010).

  17. References III
    12. Keister, B. D. Multidimensional Quadrature Algorithms. Computers in Physics 10, 119–122 (1996).
    13. Glasserman, P. Monte Carlo Methods in Financial Engineering (Springer-Verlag, New York, 2004).
    14. Jiménez Rugama, L. A. & Hickernell, F. J. Adaptive Multidimensional Integration Based on Rank-1 Lattices. in Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Leuven, Belgium, April 2014 (eds Cools, R. & Nuyens, D.) 163. arXiv:1411.1966 (Springer-Verlag, Berlin, 2016), 407–422.
    15. Hickernell, F. J. & Jiménez Rugama, L. A. Reliable Adaptive Cubature Using Digital Sequences. in Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Leuven, Belgium, April 2014 (eds Cools, R. & Nuyens, D.) 163. arXiv:1410.8615 [math.NA] (Springer-Verlag, Berlin, 2016), 367–383.