Fred J. Hickernell
Department of Applied Mathematics, Illinois Institute of Technology
[email protected]
mypages.iit.edu/~hickernell
Supported by NSF-DMS-1522687 and DMS-1638521 (SAMSI)
Working Group V.3, March 7, 2018
Function Values References
Prologue
May 7–9 we have our SAMSI-QMC Transitions Workshop, where we should report on our progress. These slides summarize ongoing work by Simon Mak and me. Comments are welcome.
Approximating Functions When Function Values Are Expensive
Interested in some $f : \Omega \to \mathbb{R}$, where $\Omega \subseteq \mathbb{R}^d$, e.g., the result of a climate model or a financial calculation.
$d$ is a dozen, or dozens, or a few hundred.
$\$(f)$ = cost to evaluate $f(x)$ for any $x \in \Omega$ = hours or days or \$1M.
Want to construct a surrogate model, $f_{\mathrm{app}}$, with $f_{\mathrm{app}} \approx f$, such that $\$(f_{\mathrm{app}})$ = \$0.000001, so that we may quickly explore (plot, integrate, optimize, search for sharp gradients of) $f$.
$f_{\mathrm{app}}$ is constructed using $n$ pieces of information about $f$, such as values of $f$ or Fourier coefficients of $f$.
Want $\|f - f_{\mathrm{app}}\|_\infty \le \varepsilon$ for $n = O(d\,\varepsilon^{-q})$ as $d \uparrow \infty$ or $\varepsilon \downarrow 0$.
Assume $\$(f) \gg n^r$ for any practical $n$ and any positive $r$.
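Approximation by Fourier Coefficients
$$f(x) = \sum_{j \in \mathbb{N}_0^d} \hat f(j)\,\varphi_j(x), \qquad \hat f(j) = \langle f, \varphi_j \rangle, \qquad p_j = \|\varphi_j\|_\infty$$
Suppose that we may observe the Fourier coefficients $\hat f(j)$ at a cost of \$1M each. (Eventually we want to consider the case of observing function values.)
For any vector of non-negative constants, $\gamma = (\gamma_j)_{j \in \mathbb{N}_0^d}$, define the family of quasi-norms
$$\|\hat f\|_{q,\gamma} = \Bigl\| \Bigl( \frac{|\hat f(j)|\, p_j}{\gamma_j} \Bigr)_{j \in \mathbb{N}_0^d} \Bigr\|_q, \qquad 0/0 = 0, \qquad \gamma_j = 0 \ \&\ \|\hat f\|_{\infty,\gamma} < \infty \implies \hat f(j) = 0$$
Order the wavenumbers $j$ such that $\gamma_{j_1} \ge \gamma_{j_2} \ge \cdots$. The optimal approximation to $f$ given $n$ Fourier coefficients chosen optimally is
$$f_{\mathrm{app}}(x) = \sum_{i=1}^{n} \hat f(j_i)\,\varphi_{j_i}(x), \qquad \|f - f_{\mathrm{app}}\|_\infty = \Bigl\| \sum_{i=n+1}^{\infty} \hat f(j_i)\,\varphi_{j_i} \Bigr\|_\infty \le \|\hat f - \hat f_{\mathrm{app}}\|_{1,1} \overset{\text{tight}}{\le} \|\hat f\|_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i}$$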
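The tail bound above can be checked numerically on a toy case. The sketch below is only an illustration under assumptions that are not in the slides: a one-dimensional cosine basis $\varphi_j(x) = \cos(2\pi j x)$ (so $p_j = 1$), a series truncated at `M` terms, and made-up weights and coefficients chosen so that the bound is attained.

```python
import math

# Assumed 1-D setting: phi_j(x) = cos(2*pi*j*x), hence p_j = ||phi_j||_inf = 1.
M = 50                                            # finite truncation, for illustration only
gamma = [1.0 / (j + 1) ** 2 for j in range(M)]    # assumed weights, already non-increasing
f_hat = [0.5 * (-1) ** j * g for j, g in enumerate(gamma)]  # assumed coefficients

# ||f_hat||_{inf,gamma} = max_j |f_hat(j)| p_j / gamma_j
norm_f = max(abs(c) / g for c, g in zip(f_hat, gamma))

def tail_function(x, n):
    """f(x) - f_app(x): the terms left out when the first n coefficients are kept."""
    return sum(f_hat[j] * math.cos(2 * math.pi * j * x) for j in range(n, M))

n = 10
bound = norm_f * sum(gamma[n:])                   # ||f_hat||_{inf,gamma} * sum_{i>n} gamma_{j_i}
grid = [k / 200 for k in range(201)]
sup_error = max(abs(tail_function(x, n)) for x in grid)
```

With these particular coefficients every omitted term has the same sign at $x = 1/2$, so `sup_error` matches `bound` up to rounding, which is what "tight" refers to.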
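In What Sense Is This Optimal?
$$f(x) = \sum_{j \in \mathbb{N}_0^d} \hat f(j)\,\varphi_j(x), \qquad \hat f(j) = \langle f, \varphi_j \rangle, \qquad p_j = \|\varphi_j\|_\infty, \qquad \|\hat f\|_{q,\gamma} = \Bigl\| \Bigl( \frac{|\hat f(j)|\, p_j}{\gamma_j} \Bigr)_{j \in \mathbb{N}_0^d} \Bigr\|_q$$
$$\gamma_{j_1} \ge \gamma_{j_2} \ge \cdots, \qquad f_{\mathrm{app}}(x) = \sum_{i=1}^{n} \hat f(j_i)\,\varphi_{j_i}(x), \qquad \|f - f_{\mathrm{app}}\|_\infty \le \|\hat f - \hat f_{\mathrm{app}}\|_{1,1} \overset{\text{tight}}{\le} \|\hat f\|_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i}$$
For any other approximation, $g$, based on $n$ Fourier coefficients, $\hat f(j)$ with $j \in J$ and $J$ having cardinality $n$,
$$\|\hat f - \hat g\|_{1,1} = \sum_{j \in J} |\hat f(j) - \hat g(j)|\, p_j + \sum_{j \notin J} |\hat f(j) - \hat g(j)|\, p_j .$$
The second sum involves only unobserved coefficients, and because the weights are ordered, every index set $J$ of cardinality $n$ satisfies
$$\|\hat f\|_{\infty,\gamma} \sum_{j \notin J} \gamma_j \ \ge\ \|\hat f\|_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i} .$$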
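Recap
$$f(x) = \sum_{j \in \mathbb{N}_0^d} \hat f(j)\,\varphi_j(x), \qquad \hat f(j) = \langle f, \varphi_j \rangle, \qquad p_j = \|\varphi_j\|_\infty, \qquad \|\hat f\|_{q,\gamma} = \Bigl\| \Bigl( \frac{|\hat f(j)|\, p_j}{\gamma_j} \Bigr)_{j \in \mathbb{N}_0^d} \Bigr\|_q$$
(the dependence of $f$ on $d$ is hidden)
$$\gamma_{j_1} \ge \gamma_{j_2} \ge \cdots, \qquad f_{\mathrm{app}}(x) = \sum_{i=1}^{n} \hat f(j_i)\,\varphi_{j_i}(x)$$
$$\|f - f_{\mathrm{app}}\|_\infty \le \|\hat f - \hat f_{\mathrm{app}}\|_{1,1} \le \|\hat f\|_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i} \le \frac{\|\hat f\|_{\infty,\gamma}\, \|\gamma\|_{1/q}}{(q-1)(n-1)^{q-1}}$$
We want the right-hand side to be $\le \varepsilon$, so
$$n = O\Bigl( \Bigl[ \frac{\|\hat f\|_{\infty,\gamma}\, \|\gamma\|_{1/q}}{\varepsilon} \Bigr]^{1/(q-1)} \Bigr) \text{ is sufficient.}$$
To succeed with $n = O(d)$, we need $\|\gamma\|_{1/q} = O(d^{q-1})$.
Novak, E. & Woźniakowski, H. Tractability of Multivariate Problems Volume I: Linear Information. EMS Tracts in Mathematics 6 (European Mathematical Society, Zürich, 2008); Kühn, T. et al. Approximation numbers of Sobolev embeddings—Sharp constants and tractability. J. Complexity 30, 95–116 (2014).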
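The last inequality in the recap can be checked on a concrete weight sequence. The following is a minimal sketch, assuming a finite weight vector $\gamma_i = i^{-3}$ (a made-up choice), $q = 2$, and the reading $\|\gamma\|_{1/q} = (\sum_j \gamma_j^{1/q})^q$; the function names are illustrative only.

```python
import math

def gamma_quasi_norm(gamma, q):
    """||gamma||_{1/q} = (sum_j gamma_j^{1/q})^q."""
    return sum(g ** (1.0 / q) for g in gamma) ** q

def tail_bound(gamma, q, n):
    """||gamma||_{1/q} / ((q - 1) (n - 1)^{q - 1}), for q > 1 and n >= 2."""
    return gamma_quasi_norm(gamma, q) / ((q - 1) * (n - 1) ** (q - 1))

def sufficient_n(norm_f, gamma, q, eps):
    """An n large enough that norm_f * tail_bound(gamma, q, n) <= eps."""
    ratio = norm_f * gamma_quasi_norm(gamma, q) / ((q - 1) * eps)
    return max(2, math.ceil(1 + ratio ** (1.0 / (q - 1))))

q = 2
gamma = [i ** -3.0 for i in range(1, 2001)]   # assumed weights, sorted non-increasing
n = sufficient_n(1.0, gamma, q, eps=0.05)
```

For this sequence the actual tail `sum(gamma[n:])` is far below `tail_bound(gamma, q, n)`, so the resulting `n` is sufficient but can be quite conservative.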
Product, Order, and Smoothness Dependent Weights
$$f(x) = \sum_{j \in \mathbb{N}_0^d} \hat f(j)\,\varphi_j(x), \qquad \hat f(j) = \langle f, \varphi_j \rangle, \qquad p_j = \|\varphi_j\|_\infty, \qquad \|\hat f\|_{q,\gamma} = \Bigl\| \Bigl( \frac{|\hat f(j)|\, p_j}{\gamma_j} \Bigr)_{j \in \mathbb{N}_0^d} \Bigr\|_q$$
$$\sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} = O\bigl(d^{(q-1)/q}\bigr) \implies \|f - f_{\mathrm{app}}\|_\infty \le \varepsilon \text{ for } n = O(d) \text{ if } \|\hat f\|_{\infty,\gamma} < \infty$$
Experimental design assumes
Effect sparsity: Only a small number of effects are important
Effect hierarchy: Lower-order effects are more important than higher-order effects
Effect heredity: Interaction is active only if both parent effects are also active
Effect smoothness: Coarse horizontal scales are more important than fine horizontal scales
Consider product, order, and smoothness dependent weights:
$$\gamma_j = \Gamma_{\|j\|_0} \prod_{\substack{\ell = 1 \\ j_\ell > 0}}^{d} w_\ell\, s_{j_\ell}, \qquad \Gamma_0 = w_0 = s_0 = 1$$
$w_\ell$ = coordinate importance
$\Gamma_r$ = order size
$s_j$ = smoothness degree
Wu, C. F. J. & Hamada, M. Experiments: Planning, Analysis, and Parameter Design Optimization. (John Wiley & Sons, Inc., New York, 2000).
Product, Order, and Smoothness Dependent Weights
$$\gamma_j = \Gamma_{\|j\|_0} \prod_{\substack{\ell = 1 \\ j_\ell > 0}}^{d} w_\ell\, s_{j_\ell}, \qquad \Gamma_0 = w_0 = s_0 = 1$$
$w_\ell$ = coordinate importance
$\Gamma_r$ = order size
$s_j$ = smoothness degree
$$\sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} = \sum_{u \subseteq 1:d} \Gamma_{|u|}^{1/q} \Bigl[ \prod_{\ell \in u} w_\ell^{1/q} \Bigr] \Bigl[ \sum_{j=1}^{\infty} s_j^{1/q} \Bigr]^{|u|} = O\bigl(d^{(q-1)/q}\bigr) \implies \|f - f_{\mathrm{app}}\|_\infty \le \varepsilon \text{ for } n = O(d) \text{ if } \|\hat f\|_{\infty,\gamma} < \infty$$
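The sum over subsets $u \subseteq 1{:}d$ has $2^d$ terms, but it depends on $u$ only through $|u|$ and $\prod_{\ell \in u} w_\ell^{1/q}$, so it can be grouped by $|u|$ using elementary symmetric polynomials. The sketch below illustrates this with made-up values of $\Gamma$, $w$, and $S = \sum_{j \ge 1} s_j^{1/q}$; the function names are hypothetical, and a brute-force sum over subsets is included only as a cross-check.

```python
import math
from itertools import combinations

def weight_sum(Gamma, w, S, q):
    """sum_u Gamma_{|u|}^{1/q} * prod_{l in u} w_l^{1/q} * S^{|u|}, in O(d^2) operations."""
    d = len(w)
    e = [1.0] + [0.0] * d                 # e[r] = r-th elementary symmetric polynomial of w^{1/q}
    for a in (wl ** (1.0 / q) for wl in w):
        for r in range(d, 0, -1):         # update in place, highest order first
            e[r] += a * e[r - 1]
    return sum(Gamma[r] ** (1.0 / q) * e[r] * S ** r for r in range(d + 1))

def weight_sum_brute_force(Gamma, w, S, q):
    """Same sum, enumerating all 2^d subsets u explicitly."""
    d = len(w)
    total = 0.0
    for r in range(d + 1):
        for u in combinations(range(d), r):
            total += Gamma[r] ** (1.0 / q) * math.prod(w[l] ** (1.0 / q) for l in u) * S ** r
    return total

q = 2
Gamma = [1.0, 1.0, 0.5, 0.25, 0.125]   # assumed order sizes Gamma_0..Gamma_4
w = [1.0, 0.5, 0.25, 0.125]            # assumed coordinate importances, d = 4
S = math.pi ** 2 / 6                   # sum_{j>=1} s_j^{1/q} for the assumed choice s_j = j^{-4}
```

Evaluating `weight_sum` for increasing `d` is one way to see whether a candidate family of weights is compatible with the $O(d^{(q-1)/q})$ growth condition.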
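Algorithm When Both $\gamma$ and $\|\hat f\|_{\infty,\gamma}$ Are Known
Require:
  $\gamma$ = vector of weights with ordering $\gamma_{j_1} \ge \gamma_{j_2} \ge \cdots$
  $\hat f$ = black-box Fourier coefficient generator
  $\|\hat f\|_{\infty,\gamma}$ = norm of the Fourier coefficients
  $\varepsilon$ = positive absolute error tolerance
Ensure: $\|f - f_{\mathrm{app}}\|_\infty \le \varepsilon$
1: Let $n = \min\Bigl\{ n' : \sum_{i=n'+1}^{\infty} \gamma_{j_i} \le \dfrac{\varepsilon}{\|\hat f\|_{\infty,\gamma}} \Bigr\}$
2: Compute $f_{\mathrm{app}} = \sum_{i=1}^{n} \hat f(j_i)\,\varphi_{j_i}$
Computational cost is $n = O\Bigl( \bigl[ \varepsilon^{-1} \|\hat f\|_{\infty,\gamma}\, \|\gamma\|_{1/q} \bigr]^{1/(q-1)} \Bigr)$; $\gamma$ determines the design.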
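The two steps above can be sketched in a few lines. This is an illustration, not the authors' implementation: it assumes a finite dictionary of weights standing in for the infinite sequence, and `f_hat` is a hypothetical callable playing the role of the black-box coefficient generator.

```python
from itertools import accumulate

def algorithm_known_norm(gamma, f_hat, norm_f, eps):
    """gamma: dict wavenumber -> weight (finite truncation, an assumption).
    f_hat: callable returning the Fourier coefficient of a wavenumber.
    Returns the (wavenumber, coefficient) pairs that define f_app."""
    order = sorted(gamma, key=gamma.get, reverse=True)   # gamma_{j_1} >= gamma_{j_2} >= ...
    g = [gamma[j] for j in order]
    # tails[n] = sum_{i > n} gamma_{j_i}, accumulated starting from the smallest weights
    tails = list(accumulate(reversed(g)))[::-1] + [0.0]
    n = next(k for k, t in enumerate(tails) if t <= eps / norm_f)   # step 1
    return [(j, f_hat(j)) for j in order[:n]]                       # step 2

# Made-up example: 1-D wavenumbers 0..29 with weights 2^{-1}, ..., 2^{-30}.
gamma = {j: 2.0 ** -(j + 1) for j in range(30)}
terms = algorithm_known_norm(gamma, lambda j: (-1) ** j * gamma[j], norm_f=1.0, eps=0.01)
```

Note that the sample size is fixed by $\gamma$, $\varepsilon$, and the norm before any coefficient is requested, which is the sense in which $\gamma$ determines the design.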
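Algorithm When $\gamma$ Is Known and $\|\hat f\|_{\infty,\gamma}$ Is Inferred
Require:
  $\gamma$ = vector of weights with ordering $\gamma_{j_1} \ge \gamma_{j_2} \ge \cdots$
  $n_0$ = minimum number of wavenumbers
  $C$ = inflation factor
  $\hat f$ = black-box Fourier coefficient generator for the function of interest, $f$, where $\|\hat f\|_{\infty,\gamma} \le C \bigl\| \bigl( \hat f(j_i) \bigr)_{i=1}^{n_0} \bigr\|_{\infty,\gamma}$
  $\varepsilon$ = positive absolute error tolerance
Ensure: $\|f - f_{\mathrm{app}}\|_\infty \le \varepsilon$
1: Evaluate $\hat f(j_1), \ldots, \hat f(j_{n_0})$
2: Let $n = \min\Bigl\{ n' > n_0 : \sum_{i=n'+1}^{\infty} \gamma_{j_i} \le \dfrac{\varepsilon}{C \bigl\| \bigl( \hat f(j_i) \bigr)_{i=1}^{n_0} \bigr\|_{\infty,\gamma}} \Bigr\}$
3: Compute $f_{\mathrm{app}} = \sum_{i=1}^{n} \hat f(j_i)\,\varphi_{j_i}$
Computational cost is $n = O\Bigl( \bigl[ \varepsilon^{-1} C\, \|\hat f\|_{\infty,\gamma}\, \|\gamma\|_{1/q} \bigr]^{1/(q-1)} \Bigr)$; $\gamma$ determines the design.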
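A sketch of the pilot-sample variant follows, under the same illustrative assumptions as before: a finite dictionary of positive weights, $p_j = 1$, more than `n0` wavenumbers available, and a hypothetical callable `f_hat` as the coefficient generator.

```python
import math
from itertools import accumulate

def algorithm_inferred_norm(gamma, f_hat, n0, C, eps):
    """Returns the (wavenumber, coefficient) pairs defining f_app.
    Assumes len(gamma) > n0, positive weights, and p_j = 1."""
    order = sorted(gamma, key=gamma.get, reverse=True)
    g = [gamma[j] for j in order]
    coeffs = [f_hat(j) for j in order[:n0]]                     # step 1: pilot sample
    pilot_norm = max(abs(c) / gj for c, gj in zip(coeffs, g))   # norm of the pilot coefficients
    threshold = eps / (C * pilot_norm) if pilot_norm > 0 else math.inf
    tails = list(accumulate(reversed(g)))[::-1] + [0.0]         # tails[n] = sum_{i > n} gamma_{j_i}
    n = next(k for k in range(n0 + 1, len(tails)) if tails[k] <= threshold)  # step 2
    coeffs += [f_hat(j) for j in order[n0:n]]                   # step 3: remaining coefficients
    return list(zip(order[:n], coeffs))

# Made-up example: weights 2^{-1}, ..., 2^{-30}, coefficients equal to gamma_j / 4.
gamma = {j: 2.0 ** -(j + 1) for j in range(30)}
terms = algorithm_inferred_norm(gamma, lambda j: 0.25 * gamma[j], n0=3, C=2.0, eps=0.01)
```

Here the pilot norm is 0.25, the inflated bound is 0.5, and the tail must fall below 0.02, so six coefficients are used in total, three of them from the pilot sample.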
Algorithm When Both $\gamma$ and $\|\hat f\|_{\infty,\gamma}$ Are Inferred
The order of sampling the Fourier coefficients is determined by $\gamma$, but in practice the relative sizes of the Fourier coefficients are not known, and thus $\gamma$ should be inferred. As a first step we try
$$\gamma_j = \Gamma_{\|j\|_0} \prod_{\substack{\ell = 1 \\ j_\ell > 0}}^{d} w_\ell\, s_{j_\ell}, \qquad \Gamma_0 = w_0 = s_0 = 1$$
$w_\ell$ = coordinate importance
$\Gamma_r$ = order size
$s_j$ = smoothness degree
with the $\Gamma_r$ and $s_j$ fixed, but the $w_\ell$ inferred. We want to infer the relative importance of the different coordinates.
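Algorithm When Both $\gamma$ and $\|\hat f\|_{\infty,\gamma}$ Are Inferred
Require:
  $\Gamma$ = vector of order sizes
  $s$ = vector of smoothness degrees
  $w^* = \max_k w_k$
  $n_0$ = minimum number of wavenumbers in each coordinate
  $C$ = inflation factor
  $\hat f$ = a black-box Fourier coefficient generator for the function of interest, $f$, where $\|\hat f\|_{\infty,\gamma} \le C \bigl\| \bigl( \hat f(j) \bigr)_{j \in J} \bigr\|_{\infty,\gamma}$ for all $\gamma$, with $J := \{(0, \ldots, 0, j, 0, \ldots, 0) : j = 0, \ldots, n_0\}$
  $\varepsilon$ = positive absolute error tolerance
Ensure: $\|f - f_{\mathrm{app}}\|_\infty \le \varepsilon$
1: Evaluate $\hat f(j)$ for $j \in J$
2: Define $w = \min \operatorname{argmin}_{w \le w^*} \bigl\| \bigl( \hat f(j) \bigr)_{j \in J} \bigr\|_{\infty,\gamma}$
3: Let $n = \min\Bigl\{ n' : \sum_{i=n'+1}^{\infty} \gamma_{j_i} \le \dfrac{\varepsilon}{C \bigl\| \bigl( \hat f(j) \bigr)_{j \in J} \bigr\|_{\infty,\gamma}} \Bigr\}$
4: Compute $f_{\mathrm{app}} = \sum_{i=1}^{n} \hat f(j_i)\,\varphi_{j_i}$
Computational cost is $n = O\Bigl( \bigl[ \varepsilon^{-1} C\, \|\hat f\|_{\infty,\gamma}\, \|\gamma\|_{1/q} \bigr]^{1/(q-1)} \Bigr)$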
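Step 2 admits a closed form under one reading of the slide, sketched here as an assumption rather than as the authors' method. On the axis wavenumbers, $\gamma_j = \Gamma_1 w_\ell s_j$ for the $j$-th wavenumber in coordinate $\ell$, and $\gamma_0 = 1$. The norm is non-increasing in each $w_\ell$, so its minimum value $V$ over $w \le w^*$ is attained at $w_\ell = w^*$, and the componentwise smallest minimizer is $w_\ell = A_\ell / V$ with $A_\ell = \max_j |\hat f(j e_\ell)| / (\Gamma_1 s_j)$. The code assumes $p_j = 1$; all names are hypothetical.

```python
def infer_coordinate_weights(f_hat_0, f_hat_axis, Gamma1, s, w_star):
    """f_hat_0: coefficient at the zero wavenumber.
    f_hat_axis[l][j-1]: coefficient at wavenumber j in coordinate l, j = 1..n0.
    s: smoothness weights with s[0] = 1. Assumes p_j = 1.
    Returns (w, V): the inferred weights and the minimal norm value."""
    # A[l] = max_j |f_hat(j e_l)| / (Gamma_1 s_j): what w_l must compensate for
    A = [max(abs(c) / (Gamma1 * s[j]) for j, c in enumerate(coeffs, start=1))
         for coeffs in f_hat_axis]
    V = max(abs(f_hat_0), max(A) / w_star)     # norm value when every w_l equals w_star
    if V == 0:                                 # all pilot coefficients vanish
        return [0.0] * len(A), 0.0
    return [a / V for a in A], V               # smallest w that still attains the value V

def pilot_norm(f_hat_0, f_hat_axis, Gamma1, s, w):
    """Norm of the pilot coefficients for a given w (each w_l must be positive)."""
    axis = max(abs(c) / (Gamma1 * w[l] * s[j])
               for l, coeffs in enumerate(f_hat_axis)
               for j, c in enumerate(coeffs, start=1))
    return max(abs(f_hat_0), axis)

# Made-up pilot data for d = 2, n0 = 3, s_j = j^{-2}.
s = [1.0, 1.0, 0.25, 1.0 / 9.0]
f_hat_axis = [[0.8, 0.2, 0.05], [0.2, 0.05, 0.01]]
w, V = infer_coordinate_weights(0.1, f_hat_axis, Gamma1=1.0, s=s, w_star=1.0)
```

In this example the first coordinate receives the full weight $w^*$ and the second receives a quarter of it, reflecting coefficients that are four times smaller along that axis.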
Algorithm Using Function Values When Both $\gamma$ and $\|\hat f\|_{\infty,\gamma}$ Are Inferred
Require:
  $\Gamma$ = vector of order sizes
  $s$ = vector of smoothness degrees
  $w^* = \max_k w_k$
  $n_0$ = minimum number of wavenumbers in each coordinate
  $C$ = inflation factor
  $f$ = a black-box function value generator
  $\varepsilon$ = positive absolute error tolerance
Ensure: $\|f - f_{\mathrm{app}}\|_\infty \le \varepsilon$
1: Approximate $\hat f(j)$ for $j \in J := \{(0, \ldots, 0, j, 0, \ldots, 0) : j = 1, \ldots, n_0\}$ by interpolating the function data $\{(x, f(x)) : x = (\psi(t_{j_1}), \ldots, \psi(t_{j_d})),\ j \in J\}$
2: Define $w = \min \operatorname{argmin}_{w \le w^*} \bigl\| \bigl( \hat f(j) \bigr)_{j \in J} \bigr\|_{\infty,\gamma}$
3: while $C \bigl\| \bigl( \hat f(j) \bigr)_{j \in J} \bigr\|_{\infty,\gamma} \sum_{j \notin J} \gamma_j > \varepsilon$ do
4:   Add $\operatorname{argmax}_{j \notin J} \gamma_j$ to $J$
5:   Approximate $\hat f(j)$ for $j \in J$ by interpolating the function data $\{(x, f(x)) : x = (\psi(t_{j_1}), \ldots, \psi(t_{j_d})),\ j \in J\}$
6: end while
7: Compute $f_{\mathrm{app}} = \sum_{j \in J} \hat f(j)\,\varphi_j$
What Needs Attention
Bridging the theory/practice gap
Try some examples
Bookkeeping on next largest $\gamma_j$
If $\hat f(j)/\gamma_j$ is observed to be too large, may need to increase $w_k$ for some $k$
May want to infer $\Gamma$ or $s$
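One possible way to handle the bookkeeping on the next largest $\gamma_j$ is a best-first search with a priority queue, sketched below. It is an assumption of this sketch, not a claim from the slides, that the weights never increase when any single component of $j$ is incremented (for example, $s_j$ non-increasing and $\Gamma_{r+1} w_\ell s_1 \le \Gamma_r$); under that assumption the generator yields wavenumbers in non-increasing order of $\gamma_j$. The names `Gamma`, `w`, and `s` mirror the symbols in the slides, but their values here are made up.

```python
import heapq
from itertools import islice

def wavenumbers_by_weight(d, Gamma, w, s):
    """Yield (gamma_j, j) in non-increasing order of gamma_j.
    Gamma(r) and s(j) are callables, w is a list of length d.
    Assumes gamma_{j + e_l} <= gamma_j for every j and every coordinate l."""
    def gamma(j):
        active = [l for l in range(d) if j[l] > 0]
        g = Gamma(len(active))                 # order-dependent factor
        for l in active:
            g *= w[l] * s(j[l])                # coordinate and smoothness factors
        return g

    start = (0,) * d
    heap = [(-gamma(start), start)]            # max-heap via negated weights
    seen = {start}
    while heap:
        neg_g, j = heapq.heappop(heap)
        yield -neg_g, j
        for l in range(d):                     # push the d successors j + e_l
            child = j[:l] + (j[l] + 1,) + j[l + 1:]
            if child not in seen:
                seen.add(child)
                heapq.heappush(heap, (-gamma(child), child))

# Made-up weights for d = 2.
first = list(islice(
    wavenumbers_by_weight(2, lambda r: 0.5 ** r, [1.0, 0.5], lambda j: 1.0 if j == 0 else j ** -2.0),
    20))
```

Because the generator is lazy, an adaptive loop can pull one wavenumber at a time and stop as soon as its error criterion is met, without fixing a truncation in advance.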
References
Novak, E. & Woźniakowski, H. Tractability of Multivariate Problems Volume I: Linear Information. EMS Tracts in Mathematics 6 (European Mathematical Society, Zürich, 2008).
Kühn, T., Sickel, W. & Ullrich, T. Approximation numbers of Sobolev embeddings—Sharp constants and tractability. J. Complexity 30, 95–116 (2014).
Wu, C. F. J. & Hamada, M. Experiments: Planning, Analysis, and Parameter Design Optimization. (John Wiley & Sons, Inc., New York, 2000).