SAMSI-QMC WG 5-3 Research Problem

Ongoing work with Simon Mak for SAMSI-QMC

Fred J. Hickernell

March 07, 2018

Transcript

  1. Work in Progress:
    Function Approximation When Function Values Are Expensive
    Fred J. Hickernell
    Department of Applied Mathematics, Illinois Institute of Technology
    [email protected] mypages.iit.edu/~hickernell
    Supported by NSF-DMS-1522687 and DMS-1638521 (SAMSI)
    Working Group V.3, March 7, 2018


  2. Background Approx. by Fourier Known γ Inferred γ Approx. by Function Values References
    Prologue
    May 7–9 we have our SAMSI-QMC Transitions Workshop where we should report on our
    progress.
    These slides summarize ongoing work by Simon Mak and me. Comments are welcome.

  3. Approximating Functions When Function Values Are Expensive
    Interested in some \( f : \Omega \to \mathbb{R} \), where \( \Omega \subseteq \mathbb{R}^d \), e.g., the result of a climate model or a
    financial calculation
    d is a dozen, or dozens, or a few hundred
    $(f) = cost to evaluate f(x) for any x ∈ Ω = hours or days or $1M
    Want to construct a surrogate model, \( f_{\mathrm{app}} \), with \( f_{\mathrm{app}} \approx f \), such that $(f_app) = $0.000001, so
    that we may quickly explore (plot, integrate, optimize, search for sharp gradients of) f
    \( f_{\mathrm{app}} \) is constructed using n pieces of information about f, such as values of f or Fourier
    coefficients of f
    Want \( \lVert f - f_{\mathrm{app}} \rVert_\infty \le \varepsilon \) for \( n = O(d\,\varepsilon^{-q}) \) as d ↑ ∞ or ε ↓ 0
    Assume $(f) ≫ n^r for any practical n and any positive r

  4. Functions Expressed as Series
    Let F be a vector space of functions \( f : [a,b]^d \to \mathbb{R} \) that have \( L^2([a,b]^d, \varrho) \) orthogonal series
    expansions:
    \[ f(x) = \sum_{j \in \mathbb{N}_0^d} \hat{f}(j)\,\phi_j(x), \qquad \phi_j(x) = \phi_{j_1}(x_1) \cdots \phi_{j_d}(x_d), \]
    \[ \hat{f}(j) = \langle f, \phi_j \rangle = \int_{[a,b]^d} f(x)\,\phi_j(x)\,\varrho(x)\,dx. \]
    Legendre polynomials: \( \int_{-1}^{1} \phi_j(x)\,\phi_k(x)\,dx = \delta_{j,k} \)
    Chebyshev polynomials: \( \phi_j(x) = p_j \cos(j \arccos(x)) \), \( \int_{-1}^{1} \frac{\phi_j(x)\,\phi_k(x)}{\sqrt{1 - x^2}}\,dx = \delta_{j,k} \)
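The two orthonormality relations above can be checked numerically. The following is a minimal sketch, assuming the standard normalisations (Legendre scaled by sqrt((2j+1)/2), Chebyshev scaled by sqrt(1/π) for j = 0 and sqrt(2/π) otherwise); the function names are illustrative and do not come from the slides.

```python
import numpy as np
from numpy.polynomial import legendre as L

def legendre_orthonormal(j, x):
    """Legendre polynomial of degree j, scaled to unit L2 norm on [-1, 1]."""
    coef = np.zeros(j + 1)
    coef[j] = 1.0
    return np.sqrt((2 * j + 1) / 2.0) * L.legval(x, coef)

def chebyshev_orthonormal(j, x):
    """p_j cos(j arccos x), orthonormal for the weight 1/sqrt(1 - x^2)."""
    p_j = np.sqrt(1.0 / np.pi) if j == 0 else np.sqrt(2.0 / np.pi)
    return p_j * np.cos(j * np.arccos(x))

def gram_legendre(m, n_nodes=32):
    """Gram matrix of the first m Legendre functions via Gauss-Legendre quadrature."""
    x, w = L.leggauss(n_nodes)
    phi = np.array([legendre_orthonormal(j, x) for j in range(m)])
    return (phi * w) @ phi.T

def gram_chebyshev(m, n_nodes=32):
    """Gram matrix of the first m Chebyshev functions via Gauss-Chebyshev quadrature."""
    k = np.arange(1, n_nodes + 1)
    x = np.cos((2 * k - 1) * np.pi / (2 * n_nodes))   # nodes; all weights are pi / n_nodes
    phi = np.array([chebyshev_orthonormal(j, x) for j in range(m)])
    return (np.pi / n_nodes) * phi @ phi.T
```

Both Gram matrices should be close to the identity, which is the statement δ_{j,k} in matrix form.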

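  5. Approximation by Fourier Coefficients
    \[ f(x) = \sum_{j \in \mathbb{N}_0^d} \hat{f}(j)\,\phi_j(x), \qquad \hat{f}(j) = \langle f, \phi_j \rangle, \qquad p_j = \lVert \phi_j \rVert_\infty \]
    Suppose that we may observe the Fourier coefficients \( \hat{f}(j) \) at a cost of $1M each. (Eventually
    we want to consider the case of observing function values.) For any vector of non-negative
    constants, \( \gamma = (\gamma_j)_{j \in \mathbb{N}_0^d} \), define the family of quasi-norms on F:
    \[ \lVert f \rVert_{q,\gamma} = \Bigl\lVert \Bigl( \frac{\lvert \hat{f}(j) \rvert\, p_j}{\gamma_j} \Bigr)_{j \in \mathbb{N}_0^d} \Bigr\rVert_q, \qquad 0/0 = 0, \qquad \gamma_j = 0 \ \&\ \lVert f \rVert_{\infty,\gamma} < \infty \implies \hat{f}(j) = 0. \]
    Order the wavenumbers j such that \( \gamma_{j_1} \ge \gamma_{j_2} \ge \cdots \). The optimal approximation to f given n Fourier
    coefficients chosen optimally is
    \[ f_{\mathrm{app}}(x) = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i}(x), \]
    \[ \lVert f - f_{\mathrm{app}} \rVert_\infty = \Bigl\lVert \sum_{i=n+1}^{\infty} \hat{f}(j_i)\,\phi_{j_i} \Bigr\rVert_\infty \le \lVert f - f_{\mathrm{app}} \rVert_{1,1} \le \lVert f \rVert_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i}, \]
    where the last inequality is tight.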
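To make the error bound concrete, here is a small one-dimensional sketch. Everything in it is an illustrative assumption rather than part of the slides: the basis is the unnormalised Chebyshev family (so p_j = 1), the weights are γ_j = 0.6^j, the coefficients are (−0.5)^j, and the series is truncated at m = 40 terms.

```python
import numpy as np

def cheb(j, x):
    """Unnormalised Chebyshev polynomial, so p_j = sup |phi_j| = 1."""
    return np.cos(j * np.arccos(x))

m = 40                                   # hypothetical truncation of the series
j = np.arange(m)
gamma = 0.6 ** j                         # illustrative weights, already non-increasing
fhat = (-1.0) ** j * 0.5 ** j            # illustrative Fourier coefficients
p = np.ones(m)

# ||f||_{inf,gamma} = max_j |fhat(j)| p_j / gamma_j
norm_inf_gamma = np.max(np.abs(fhat) * p / gamma)

def sup_error_and_bound(n, n_grid=2001):
    """Sup-norm error of the n-term approximation (on a grid) and the bound from the slide."""
    x = np.linspace(-1.0, 1.0, n_grid)
    basis = np.array([cheb(k, x) for k in range(m)])
    f = fhat @ basis
    f_app = fhat[:n] @ basis[:n]
    sup_error = np.max(np.abs(f - f_app))
    bound = norm_inf_gamma * gamma[n:].sum()
    return sup_error, bound
```

Calling `sup_error_and_bound(n)` for a few values of n shows the observed error staying below the weighted tail sum.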

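  6. In What Sense Is This Optimal?
    Recall that, with \( \gamma_{j_1} \ge \gamma_{j_2} \ge \cdots \) and \( f_{\mathrm{app}} = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i} \),
    \[ \lVert f - f_{\mathrm{app}} \rVert_\infty \le \lVert f - f_{\mathrm{app}} \rVert_{1,1} \le \lVert f \rVert_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i}, \]
    and the last inequality is tight.
    For any other approximation, g, based on n Fourier coefficients, \( \hat{f}(j) \) with \( j \in J \) and J having
    cardinality n,
    \[ \lVert f - g \rVert_{1,1} = \sum_{j \in J} \lvert \hat{f}(j) - \hat{g}(j) \rvert\, p_j + \sum_{j \notin J} \lvert \hat{f}(j) - \hat{g}(j) \rvert\, p_j. \]
    The second sum involves only unobserved coefficients and is compared with
    \[ \lVert f + g \rVert_{\infty,\gamma} \sum_{j \notin J} \gamma_j \qquad \text{and} \qquad \lVert f \rVert_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i}, \]
    where \( \sum_{j \notin J} \gamma_j \ge \sum_{i=n+1}^{\infty} \gamma_{j_i} \) for every J of cardinality n, because the \( \gamma_{j_i} \) are in non-increasing order.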

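  7. How Quickly Does Error Decay?
    Recall that \( \lVert f - f_{\mathrm{app}} \rVert_\infty \le \lVert f - f_{\mathrm{app}} \rVert_{1,1} \le \lVert f \rVert_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i} \), with \( \gamma_{j_1} \ge \gamma_{j_2} \ge \cdots \).
    A trick that is often used (q > 0):
    \[ \gamma_{j_{n+1}} \le \Bigl[ \frac{1}{n} \bigl( \gamma_{j_1}^{1/q} + \cdots + \gamma_{j_n}^{1/q} \bigr) \Bigr]^q \le \frac{1}{n^q} \lVert \gamma \rVert_{1/q}, \qquad \lVert \gamma \rVert_{1/q} = \Bigl[ \sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} \Bigr]^q, \]
    \[ \sum_{i=n+1}^{\infty} \gamma_{j_i} \le \lVert \gamma \rVert_{1/q} \sum_{i=n}^{\infty} \frac{1}{i^q} \le \frac{\lVert \gamma \rVert_{1/q}}{(q-1)(n-1)^{q-1}}. \]
    The last bound requires q > 1 and n > 1.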
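The tail bound above is easy to test on a concrete weight sequence. This is a minimal sketch assuming a finite list of weights (treated as zero beyond its end) and illustrative weights i^{-3}; the helper name is hypothetical.

```python
import numpy as np

def tail_sum_and_bound(gamma, n, q):
    """Return sum_{i>n} gamma_{j_i} and the bound ||gamma||_{1/q} / ((q-1)(n-1)^(q-1))."""
    g = np.sort(np.asarray(gamma, dtype=float))[::-1]    # gamma_{j_1} >= gamma_{j_2} >= ...
    tail = g[n:].sum()
    gamma_norm = np.sum(g ** (1.0 / q)) ** q             # ||gamma||_{1/q}
    bound = gamma_norm / ((q - 1.0) * (n - 1.0) ** (q - 1.0))
    return tail, bound

gamma_example = 1.0 / np.arange(1, 2001) ** 3.0          # illustrative weights i^{-3}
```

For example, `tail_sum_and_bound(gamma_example, 10, 2.0)` returns a tail well below the bound; the bound is a worst-case statement, so a large gap on a specific sequence is expected.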

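  8. Recap
    \[ f(x) = \sum_{j \in \mathbb{N}_0^d} \hat{f}(j)\,\phi_j(x), \qquad \hat{f}(j) = \langle f, \phi_j \rangle, \qquad p_j = \lVert \phi_j \rVert_\infty, \qquad \lVert f \rVert_{q,\gamma} = \Bigl\lVert \Bigl( \frac{\lvert \hat{f}(j) \rvert\, p_j}{\gamma_j} \Bigr)_{j \in \mathbb{N}_0^d} \Bigr\rVert_q \]
    (dependence of f on d is hidden)
    \[ \gamma_{j_1} \ge \gamma_{j_2} \ge \cdots, \qquad f_{\mathrm{app}}(x) = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i}(x), \]
    \[ \lVert f - f_{\mathrm{app}} \rVert_\infty \le \lVert f - f_{\mathrm{app}} \rVert_{1,1} \le \lVert f \rVert_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i} \le \frac{\lVert f \rVert_{\infty,\gamma}\, \lVert \gamma \rVert_{1/q}}{(q-1)(n-1)^{q-1}} \]
    Want this bound to be \( \le \varepsilon \):
    \[ n = O\Bigl( \Bigl[ \frac{\lVert f \rVert_{\infty,\gamma}\, \lVert \gamma \rVert_{1/q}}{\varepsilon} \Bigr]^{1/(q-1)} \Bigr) \quad \text{is sufficient} \]
    To succeed with n = O(d), we need \( \lVert \gamma \rVert_{1/q} = O(d^{q-1}) \)
    Novak, E. & Woźniakowski, H. Tractability of Multivariate Problems Volume I: Linear Information. EMS Tracts in
    Mathematics 6 (European Mathematical Society, Zürich, 2008); Kühn, T. et al. Approximation numbers of Sobolev
    embeddings—Sharp constants and tractability. J. Complexity 30, 95–116 (2014).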
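Solving the last bound for n gives an explicit sufficient sample size. The helper below is a sketch of that arithmetic only; the function name and arguments are hypothetical, and the two norms are assumed to be known numbers.

```python
import math

def sufficient_n(f_norm, gamma_norm, eps, q):
    """An n for which f_norm * gamma_norm / ((q-1) (n-1)^(q-1)) is at most eps."""
    if q <= 1:
        raise ValueError("the bound requires q > 1")
    ratio = f_norm * gamma_norm / ((q - 1.0) * eps)
    return 1 + math.ceil(ratio ** (1.0 / (q - 1.0)))
```

With q = 2 the returned n grows linearly in `gamma_norm / eps`, which is why the slide asks for the weight norm to be O(d^{q−1}) when n = O(d) is the goal.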

  9. Product, Order, and Smoothness Dependent Weights
    \[ \sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} = O(d^{(q-1)/q}) \implies \lVert f - f_{\mathrm{app}} \rVert_\infty \le \varepsilon \ \text{for } n = O(d) \ \text{if } \lVert f \rVert_{\infty,\gamma} < \infty \]
    Experimental design assumes
    Effect sparsity: Only a small number of effects are important
    Effect hierarchy: Lower-order effects are more important than higher-order effects
    Effect heredity: Interaction is active only if both parent effects are also active
    Effect smoothness: Coarse horizontal scales are more important than fine horizontal scales
    Consider product, order and smoothness dependent weights:
    \[ \gamma_j = \Gamma_{\lVert j \rVert_0} \prod_{\ell=1}^{d} w_\ell\, s_{j_\ell}, \qquad \Gamma_0 = w_0 = s_0 = 1, \]
    where a coordinate with \( j_\ell = 0 \) contributes the factor 1, and
    \( w_\ell \) = coordinate importance
    \( \Gamma_r \) = order size
    \( s_j \) = smoothness degree
    Wu, C. F. J. & Hamada, M. Experiments: Planning, Analysis, and Parameter Design Optimization. (John Wiley
    & Sons, Inc., New York, 2000).

  10. Product, Order, and Smoothness Dependent Weights
    For the product, order and smoothness dependent weights \( \gamma_j = \Gamma_{\lVert j \rVert_0} \prod_{\ell=1}^{d} w_\ell\, s_{j_\ell} \), with \( \Gamma_0 = w_0 = s_0 = 1 \),
    \[ \sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} = \sum_{u \subseteq 1:d} \Bigl[ \Gamma_{|u|}^{1/q} \Bigl( \prod_{\ell \in u} w_\ell^{1/q} \Bigr) \Bigl( \sum_{j=1}^{\infty} s_j^{1/q} \Bigr)^{|u|} \Bigr]. \]
    If this sum is \( O(d^{(q-1)/q}) \), then \( \lVert f - f_{\mathrm{app}} \rVert_\infty \le \varepsilon \) for n = O(d) if \( \lVert f \rVert_{\infty,\gamma} < \infty \)
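The subset formula can be cross-checked against a brute-force sum over all wavenumbers on a small truncated problem. This sketch assumes that the coordinate weight w_ℓ enters only for active coordinates (j_ℓ > 0), which is the reading consistent with the sum over subsets u; the truncation level and the numerical values are illustrative.

```python
import itertools
import numpy as np

def gamma_weight(j, Gamma, w, s):
    """gamma_j = Gamma_{||j||_0} * prod over active coordinates of w_l * s_{j_l}."""
    active = [l for l, jl in enumerate(j) if jl > 0]
    val = Gamma[len(active)]
    for l in active:
        val *= w[l] * s[j[l]]
    return val

def brute_force_sum(Gamma, w, s, q):
    """Sum of gamma_j^(1/q) over all j with 0 <= j_l <= len(s) - 1."""
    d, m = len(w), len(s) - 1
    return sum(gamma_weight(j, Gamma, w, s) ** (1.0 / q)
               for j in itertools.product(range(m + 1), repeat=d))

def subset_formula_sum(Gamma, w, s, q):
    """The same sum, organised by the set u of active coordinates."""
    d = len(w)
    S = np.sum(np.asarray(s[1:], dtype=float) ** (1.0 / q))    # sum_{j>=1} s_j^(1/q)
    total = 0.0
    for r in range(d + 1):
        for u in itertools.combinations(range(d), r):
            prod_w = np.prod([w[l] ** (1.0 / q) for l in u])    # empty product is 1
            total += Gamma[r] ** (1.0 / q) * prod_w * S ** r
    return total

# Illustrative values: d = 3 coordinates, wavenumbers truncated at 4, s_0 = Gamma_0 = 1
Gamma_ex = [1.0, 0.5, 0.25, 0.125]
w_ex = [1.0, 0.5, 0.25]
s_ex = [1.0] + [k ** -2.0 for k in range(1, 5)]
```

The two functions agree to rounding error on this example, which is a useful sanity check before relying on the subset form for larger d.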

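  11. Special Cases of Weights
    Want
    \[ \sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} = \sum_{u \subseteq 1:d} \Bigl[ \Gamma_{|u|}^{1/q} \Bigl( \prod_{\ell \in u} w_\ell^{1/q} \Bigr) \Bigl( \sum_{j=1}^{\infty} s_j^{1/q} \Bigr)^{|u|} \Bigr] = O(d^{(q-1)/q}) \]
    \( \Gamma_r = w_\ell = 1 \): \( \sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} = \Bigl[ \sum_{j=0}^{\infty} s_j^{1/q} \Bigr]^d \) (Fail)
    \( w_\ell = \Gamma_1 = 1, \ \Gamma_r = 0 \ \forall r > 1 \): \( \sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} = 1 + d \sum_{j=1}^{\infty} s_j^{1/q} \) (Near Success)
    \( \Gamma_r = 1 \): \( \sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} \le \exp\Bigl( \sum_{k=1}^{\infty} w_k^{1/q} \sum_{j=1}^{\infty} s_j^{1/q} \Bigr) \) (Success)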
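The three special cases behave very differently as d grows. The sketch below evaluates each closed form; S stands for the sum of s_j^{1/q} over j ≥ 1 and is treated as a given number, and the decaying coordinate weights in the usage note are an illustrative assumption.

```python
import numpy as np

def weight_sum_all_ones(d, S):
    """Gamma_r = w_l = 1: the sum is (1 + S)^d, using s_0 = 1."""
    return (1.0 + S) ** d

def weight_sum_additive(d, S):
    """w_l = Gamma_1 = 1 and Gamma_r = 0 for r > 1: the sum is 1 + d * S."""
    return 1.0 + d * S

def weight_sum_product(w, S, q):
    """Gamma_r = 1: the sum is prod_l (1 + w_l^(1/q) S); also return its exponential bound."""
    wq = np.asarray(w, dtype=float) ** (1.0 / q)
    return np.prod(1.0 + wq * S), np.exp(S * wq.sum())
```

With, say, `w = 1.0 / np.arange(1, d + 1) ** 4.0` and q = 2, the product case stays bounded as d increases, the additive case grows linearly, and the all-ones case grows exponentially.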

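  12. Algorithm When Both γ and \( \lVert f \rVert_{\infty,\gamma} \) Are Known
    Require: γ = vector of weights with ordering \( \gamma_{j_1} \ge \gamma_{j_2} \ge \cdots \)
    \( \hat{f} \) = black-box Fourier coefficient generator
    \( \lVert f \rVert_{\infty,\gamma} \) = norm of the Fourier coefficients
    ε = positive absolute error tolerance
    Ensure: \( \lVert f - f_{\mathrm{app}} \rVert_\infty \le \varepsilon \)
    1: Let \( n = \min \Bigl\{ n' : \sum_{i=n'+1}^{\infty} \gamma_{j_i} \le \frac{\varepsilon}{\lVert f \rVert_{\infty,\gamma}} \Bigr\} \)
    2: Compute \( f_{\mathrm{app}} = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i} \)
    Computational cost is \( n = O\bigl( \bigl[ \varepsilon^{-1} \lVert f \rVert_{\infty,\gamma}\, \lVert \gamma \rVert_{1/q} \bigr]^{1/(q-1)} \bigr) \); γ determines the design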
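The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the weights are given as a finite list (so the infinite tail sum becomes a finite one), `fhat` is any callable returning a coefficient, and all function names are hypothetical.

```python
import numpy as np

def choose_n_known_norm(gamma_sorted, f_norm, eps):
    """Smallest n with sum_{i>n} gamma_{j_i} <= eps / f_norm, for a finite sorted weight list."""
    gamma_sorted = np.asarray(gamma_sorted, dtype=float)
    # tails[n] = sum of gamma_sorted[n:], for n = 0, ..., len(gamma_sorted)
    tails = np.concatenate([np.cumsum(gamma_sorted[::-1])[::-1], [0.0]])
    return int(np.argmax(tails <= eps / f_norm))      # index of the first True

def approximate(fhat, wavenumbers, gamma, f_norm, eps):
    """Step 1: pick n. Step 2: evaluate the n coefficients with the largest weights."""
    gamma = np.asarray(gamma, dtype=float)
    order = np.argsort(-gamma, kind="stable")         # indices by non-increasing gamma
    n = choose_n_known_norm(gamma[order], f_norm, eps)
    chosen = [wavenumbers[i] for i in order[:n]]
    return {j: fhat(j) for j in chosen}               # coefficients that define f_app

# Illustrative one-dimensional data: gamma_j = 0.5^j for j = 0, ..., 19
gamma_demo = 0.5 ** np.arange(20)
coef_demo = approximate(lambda j: 0.5 ** j, list(range(20)), gamma_demo, 1.0, 0.01)
```

Note that the set of sampled wavenumbers depends only on γ, the norm, and ε, not on the observed coefficients, which is the sense in which γ determines the design.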

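  13. Algorithm When γ Is Known and \( \lVert f \rVert_{\infty,\gamma} \) Is Inferred
    Require: γ = vector of weights with ordering \( \gamma_{j_1} \ge \gamma_{j_2} \ge \cdots \)
    \( n_0 \) = minimum number of wavenumbers
    C = inflation factor
    \( \hat{f} \) = black-box Fourier coefficient generator for the function of interest, f, where
    \( \lVert f \rVert_{\infty,\gamma} \le C\, \bigl\lVert \bigl( \hat{f}(j_i) \bigr)_{i=1}^{n_0} \bigr\rVert_{\infty,\gamma} \)
    ε = positive absolute error tolerance
    Ensure: \( \lVert f - f_{\mathrm{app}} \rVert_\infty \le \varepsilon \)
    1: Evaluate \( \hat{f}(j_1), \ldots, \hat{f}(j_{n_0}) \)
    2: Let \( n = \min \Bigl\{ n' > n_0 : \sum_{i=n'+1}^{\infty} \gamma_{j_i} \le \frac{\varepsilon}{C\, \lVert ( \hat{f}(j_i) )_{i=1}^{n_0} \rVert_{\infty,\gamma}} \Bigr\} \)
    3: Compute \( f_{\mathrm{app}} = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i} \)
    Computational cost is \( n = O\bigl( \bigl[ \varepsilon^{-1} C\, \lVert f \rVert_{\infty,\gamma}\, \lVert \gamma \rVert_{1/q} \bigr]^{1/(q-1)} \bigr) \); γ determines the design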
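A sketch of the three steps, again with a finite weight list standing in for the infinite sequence. The names `fhat`, `p`, and `approximate_inferred_norm` are hypothetical, and the wavenumbers are assumed to be supplied already sorted by non-increasing γ with positive weights on the first n0 of them.

```python
import numpy as np

def approximate_inferred_norm(fhat, p, wavenumbers, gamma, n0, C, eps):
    """Pilot sample of n0 coefficients, inflated norm estimate, then the final sample size."""
    gamma = np.asarray(gamma, dtype=float)
    # Step 1: evaluate the first n0 coefficients
    coef = {j: fhat(j) for j in wavenumbers[:n0]}
    # Norm of the pilot coefficients: max |fhat(j)| p_j / gamma_j
    pilot_norm = max(abs(coef[j]) * p(j) / g
                     for j, g in zip(wavenumbers[:n0], gamma[:n0]))
    # Step 2: smallest n > n0 whose weight tail is at most eps / (C * pilot_norm)
    tails = np.concatenate([np.cumsum(gamma[::-1])[::-1], [0.0]])
    ok = tails <= eps / (C * pilot_norm)
    ok[: n0 + 1] = False                              # enforce n > n0
    if not ok.any():
        raise ValueError("finite weight list too short for n0 and eps")
    n = int(np.argmax(ok))
    # Step 3: evaluate the remaining coefficients that define f_app
    for j in wavenumbers[n0:n]:
        coef[j] = fhat(j)
    return coef, n

# Illustrative data: gamma_j = 0.5^j, coefficients 0.4^j, p_j = 1
gamma_demo = 0.5 ** np.arange(20)
coef_demo, n_demo = approximate_inferred_norm(
    lambda j: 0.4 ** j, lambda j: 1.0, list(range(20)), gamma_demo, n0=3, C=1.5, eps=0.01)
```

The inflation factor C is what protects against the pilot sample underestimating the norm; a larger C costs more coefficients but makes the cone assumption easier to satisfy.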

  14. Algorithm When Both γ and \( \lVert f \rVert_{\infty,\gamma} \) Are Inferred
    The order of sampling the Fourier coefficients is determined by γ, but in practice the
    relative sizes of the Fourier coefficients are not known, and thus γ should be inferred. As a first
    step we try
    \[ \gamma_j = \Gamma_{\lVert j \rVert_0} \prod_{\ell=1}^{d} w_\ell\, s_{j_\ell}, \qquad \Gamma_0 = w_0 = s_0 = 1, \]
    \( w_\ell \) = coordinate importance
    \( \Gamma_r \) = order size
    \( s_j \) = smoothness degree
    with the \( \Gamma_r \) and \( s_j \) fixed, but the \( w_\ell \) inferred. We want to infer the relative importance of the
    different coordinates.

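  15. Algorithm When Both γ and \( \lVert f \rVert_{\infty,\gamma} \) Are Inferred
    Require: Γ = vector of order sizes
    s = vector of smoothness degrees
    \( w^* = \max_k w_k \)
    \( n_0 \) = minimum number of wavenumbers in each coordinate
    C = inflation factor
    \( \hat{f} \) = a black-box Fourier coefficient generator for the function of interest, f, where
    \( \lVert f \rVert_{\infty,\gamma} \le C\, \bigl\lVert \bigl( \hat{f}(j) \bigr)_{j \in J} \bigr\rVert_{\infty,\gamma} \), \( J := \{ (0, \ldots, 0, j, 0, \ldots, 0) : j = 0, \ldots, n_0 \} \), for all γ
    ε = positive absolute error tolerance
    Ensure: \( \lVert f - f_{\mathrm{app}} \rVert_\infty \le \varepsilon \)
    1: Evaluate \( \hat{f}(j) \) for \( j \in J \)
    2: Define \( w = \min \operatorname{argmin}_{w \le w^*} \bigl\lVert \bigl( \hat{f}(j) \bigr)_{j \in J} \bigr\rVert_{\infty,\gamma} \)
    3: Let \( n = \min \Bigl\{ n' : \sum_{i=n'+1}^{\infty} \gamma_{j_i} \le \frac{\varepsilon}{C\, \lVert ( \hat{f}(j) )_{j \in J} \rVert_{\infty,\gamma}} \Bigr\} \)
    4: Compute \( f_{\mathrm{app}} = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i} \)
    Computational cost is \( n = O\bigl( \bigl[ \varepsilon^{-1} C\, \lVert f \rVert_{\infty,\gamma}\, \lVert \gamma \rVert_{1/q} \bigr]^{1/(q-1)} \bigr) \)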
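Step 2 is the least explicit part of the algorithm, so here is one possible reading of it as code. The interpretation is an assumption: for the one-coordinate wavenumbers in J the weight is Γ_1 w_k s_j, the norm is non-increasing in each w_k (so w_k = w* attains the minimum), and "min argmin" is taken to mean the componentwise smallest weights that still attain that minimum. The function and argument names are hypothetical.

```python
import numpy as np

def infer_coordinate_weights(a0, a_axes, Gamma1, s, w_star):
    """Infer w from pilot data.

    a0            : |fhat(0)| p_0, the scaled constant coefficient
    a_axes[k, j-1]: |fhat(j e_k)| p_{j e_k}, scaled coefficients along coordinate k
    Gamma1, s     : fixed order size for one active coordinate and smoothness degrees s[0..n0]
    """
    a_axes = np.asarray(a_axes, dtype=float)
    s = np.asarray(s, dtype=float)
    n0 = a_axes.shape[1]
    # r[k]: what coordinate k would contribute to the norm if w_k were 1
    r = np.max(a_axes / (Gamma1 * s[1:n0 + 1]), axis=1)
    # The norm is smallest at w = w_star; this is its minimum value
    # (assumed positive, i.e. the pilot data are not all zero)
    norm_min = max(a0, r.max() / w_star)
    # Smallest weights that keep every coordinate's contribution at or below norm_min
    w = r / norm_min
    return w, norm_min

# Illustrative pilot data for d = 3 coordinates and n0 = 2
w_demo, norm_demo = infer_coordinate_weights(
    a0=1.0,
    a_axes=[[0.8, 0.2], [0.08, 0.02], [0.0, 0.0]],
    Gamma1=1.0, s=[1.0, 1.0, 0.25], w_star=1.0)
```

In this example the third coordinate shows no pilot signal and receives weight zero, which illustrates why slide 19 flags that w_k may later need to be increased if larger coefficients are observed.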

  16. A Gap Between Theory and Practice
    Theory: using Fourier coefficients
    Practice: using function values
    Photo Credit: Xinhua

  17. A Very Sparse Grid on [−1, 1]^d
    j                                   0        1        2        3        4        ···
    van der Corput t_j                  0        1/2      1/4      3/4      1/8      ···
    ψ(t_j) := 2(t_j + 1/3 mod 1) − 1    −1/3     2/3      1/6      −5/6     −1/12    ···
    ψ(t_j) := −cos(π(t_j + 1/3 mod 1))  −0.5     0.8660   0.2588   −0.9659  −0.1305  ···
    To estimate \( \hat{f}(j) \), \( j \in J \), use the design \( \{ (\psi(t_{j_1}), \ldots, \psi(t_{j_d})) : j \in J \} \). E.g., for
    J = {(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1), (2, 0, 0, 0), (3, 0, 0, 0), (1, 1, 0, 0)}
    [Figure: the resulting designs, with panels "Even Points" and "ArcCos Points"]
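The table and the design can be reproduced with a few lines of code. This is a minimal sketch; the function names are illustrative, and the base-2 van der Corput sequence is generated by the usual digit-reversal construction.

```python
import numpy as np

def van_der_corput(i, base=2):
    """i-th van der Corput point: reverse the base-b digits of i about the radix point."""
    t, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        t += digit / denom
    return t

def psi_even(t):
    """Shifted, evenly spread points: 2 (t + 1/3 mod 1) - 1."""
    return 2.0 * ((t + 1.0 / 3.0) % 1.0) - 1.0

def psi_arccos(t):
    """Shifted points mapped through the cosine: -cos(pi (t + 1/3 mod 1))."""
    return -np.cos(np.pi * ((t + 1.0 / 3.0) % 1.0))

def design(J, psi):
    """One design point (psi(t_{j_1}), ..., psi(t_{j_d})) per wavenumber j in J."""
    return np.array([[psi(van_der_corput(jl)) for jl in j] for j in J])

J_example = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0),
             (0, 0, 0, 1), (2, 0, 0, 0), (3, 0, 0, 0), (1, 1, 0, 0)]
X_even = design(J_example, psi_even)
X_arccos = design(J_example, psi_arccos)
```

Each wavenumber contributes exactly one design point, so the number of function evaluations equals the number of coefficients being estimated.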

  18. Algorithm Using Function Values When Both γ and \( \lVert f \rVert_{\infty,\gamma} \) Are Inferred
    Require: Γ = vector of order sizes
    s = vector of smoothness degrees
    \( w^* = \max_k w_k \)
    \( n_0 \) = minimum number of wavenumbers in each coordinate
    C = inflation factor
    f = a black-box function value generator
    ε = positive absolute error tolerance
    Ensure: \( \lVert f - f_{\mathrm{app}} \rVert_\infty \le \varepsilon \)
    1: Approximate \( \hat{f}(j) \) for \( j \in J := \{ (0, \ldots, 0, j, 0, \ldots, 0) : j = 1, \ldots, n_0 \} \) by interpolating the function data
    \( \{ (x, f(x)) : x = (\psi(t_{j_1}), \ldots, \psi(t_{j_d})), \ j \in J \} \)
    2: Define \( w = \min \operatorname{argmin}_{w \le w^*} \bigl\lVert \bigl( \hat{f}(j) \bigr)_{j \in J} \bigr\rVert_{\infty,\gamma} \)
    3: while \( C\, \bigl\lVert \bigl( \hat{f}(j) \bigr)_{j \in J} \bigr\rVert_{\infty,\gamma} \sum_{j \notin J} \gamma_j > \varepsilon \) do
    4: Add \( \operatorname{argmax}_{j \notin J} \gamma_j \) to J (the wavenumber with the next largest weight)
    5: Approximate \( \hat{f}(j) \) for \( j \in J \) by interpolating the function data \( \{ (x, f(x)) : x = (\psi(t_{j_1}), \ldots, \psi(t_{j_d})), \ j \in J \} \)
    6: end while
    7: Compute \( f_{\mathrm{app}} = \sum_{j \in J} \hat{f}(j)\,\phi_j \)
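The interpolation in steps 1 and 5 can be sketched as a small linear solve. The details here are assumptions for illustration: the basis is the product of unnormalised Chebyshev polynomials, the design uses the arccos-type ψ, and the system is solved by least squares; none of these choices is stated as the authors' implementation.

```python
import numpy as np

def van_der_corput(i, base=2):
    """i-th van der Corput point in the given base."""
    t, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        t += digit / denom
    return t

def psi_arccos(t):
    return -np.cos(np.pi * ((t + 1.0 / 3.0) % 1.0))

def basis(j, X):
    """phi_j(x) = prod_l cos(j_l arccos x_l), evaluated at each row of X."""
    return np.prod(np.cos(np.asarray(j) * np.arccos(X)), axis=-1)

def interpolate_coefficients(f, J):
    """Approximate fhat(j), j in J, from one function value per wavenumber."""
    X = np.array([[psi_arccos(van_der_corput(jl)) for jl in j] for j in J])
    Phi = np.array([basis(j, X) for j in J]).T        # Phi[i, k] = phi_{j^(k)}(x_i)
    y = np.array([f(x) for x in X])
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return dict(zip(J, coef))

J_example = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0),
             (0, 0, 0, 1), (2, 0, 0, 0), (3, 0, 0, 0), (1, 1, 0, 0)]

def f_test(x):
    """Hypothetical test function lying in the span of the basis functions indexed by J_example."""
    return 1.0 + 2.0 * x[0] - x[2] + 0.5 * (2.0 * x[0] ** 2 - 1.0) + 0.25 * x[0] * x[1]

coef_example = interpolate_coefficients(f_test, J_example)
```

For a function inside the span, as here, the coefficients are recovered exactly up to rounding; for a general f the interpolated values are only approximations of the true Fourier coefficients, which is the theory/practice gap the slides point to.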

  19. What Needs Attention
    Bridging the theory/practice gap
    Try some examples
    Bookkeeping on next largest \( \gamma_j \)
    If \( \hat{f}(j)/\gamma_j \) is observed to be too large, may need to increase \( w_k \) for some k
    May want to infer Γ or s

  20. Thank you
    These slides are under continuous development
    and are available at
    speakerdeck.com/fjhickernell/samsi-qmc-wg-5-3-research-problem-1


  21. References
    Novak, E. & Woźniakowski, H. Tractability of Multivariate Problems Volume I: Linear
    Information. EMS Tracts in Mathematics 6 (European Mathematical Society, Zürich, 2008).
    Kühn, T., Sickel, W. & Ullrich, T. Approximation numbers of Sobolev embeddings—Sharp
    constants and tractability. J. Complexity 30, 95–116 (2014).
    Wu, C. F. J. & Hamada, M. Experiments: Planning, Analysis, and Parameter Design
    Optimization. (John Wiley & Sons, Inc., New York, 2000).