Fred J. Hickernell
March 07, 2018

# SAMSI-QMC WG 5-3 Research Problem

Ongoing work with Simon Mak for SAMSI-QMC


## Transcript

1. Work in Progress:
Function Approximation When Function Values Are Expensive
Fred J. Hickernell
Department of Applied Mathematics, Illinois Institute of Technology
mypages.iit.edu/~hickernell
Supported by NSF-DMS-1522687 and DMS-1638521 (SAMSI)
Working Group V.3, March 7, 2018

2. Background Approx. by Fourier Known γ Inferred γ Approx. by Function Values References
Prologue
On May 7–9 we have our SAMSI-QMC Transitions Workshop, where we should report on our progress.
These slides summarize ongoing work by Simon Mak and me. Comments are welcome.

3. Approximating Functions When Function Values Are Expensive
Interested in some $f : \Omega \to \mathbb{R}$, where $\Omega \subseteq \mathbb{R}^d$, e.g., the result of a climate model or a financial calculation
$d$ is a dozen, or dozens, or a few hundred
\$(f) = cost to evaluate $f(x)$ for any $x \in \Omega$ = hours or days or \$1M
Want to construct a surrogate model, $f_{\text{app}}$, with $f_{\text{app}} \approx f$, such that \$($f_{\text{app}}$) = \$0.000001, so that we may quickly explore (plot, integrate, optimize, search for sharp gradients of) $f$
$f_{\text{app}}$ is constructed using $n$ pieces of information about $f$, such as values of $f$ or Fourier coefficients of $f$
Want $\|f - f_{\text{app}}\|_\infty \le \varepsilon$ for $n = O(d\varepsilon^{-q})$ as $d \uparrow \infty$ or $\varepsilon \downarrow 0$
Assume $\$(f) \gg n^r$ for any practical $n$ and any positive $r$

4. Functions Expressed as Series
Let $\mathcal{F}$ be a vector space of functions $f : [a, b]^d \to \mathbb{R}$ that have $L^2([a, b]^d, \varrho)$ orthogonal series expansions:
$$f(x) = \sum_{j \in \mathbb{N}_0^d} \hat{f}(j)\,\phi_j(x), \qquad \phi_j(x) = \phi_{j_1}(x_1) \cdots \phi_{j_d}(x_d),$$
$$\hat{f}(j) = \langle f, \phi_j \rangle = \int_{[a,b]^d} f(x)\,\phi_j(x)\,\varrho(x)\,dx.$$
Legendre polynomials: $\int_{-1}^{1} \phi_j(x)\,\phi_k(x)\,dx = \delta_{j,k}$
Chebyshev polynomials: $\phi_j(x) = p_j \cos(j \arccos(x))$, $\int_{-1}^{1} \frac{\phi_j(x)\,\phi_k(x)}{\sqrt{1 - x^2}}\,dx = \delta_{j,k}$
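
The two orthonormality relations on this slide can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the normalising constants ($\sqrt{(2j+1)/2}$ for Legendre, $1/\sqrt{\pi}$ and $\sqrt{2/\pi}$ for Chebyshev) are assumptions chosen so that the stated integrals equal $\delta_{j,k}$ on $[-1, 1]$; the slide does not spell them out.

```python
import numpy as np
from numpy.polynomial import chebyshev, legendre


def legendre_phi(j, x):
    """Legendre polynomial of degree j, scaled to unit L2 norm on [-1, 1]."""
    coef = np.zeros(j + 1)
    coef[j] = 1.0
    return np.sqrt((2 * j + 1) / 2) * legendre.legval(x, coef)


def chebyshev_phi(j, x):
    """p_j * cos(j arccos x), orthonormal for the weight 1/sqrt(1 - x^2)."""
    p_j = np.sqrt(1 / np.pi) if j == 0 else np.sqrt(2 / np.pi)
    return p_j * np.cos(j * np.arccos(x))


def gram(phi, nodes, weights, m):
    """Gram matrix of phi_0, ..., phi_{m-1} under the given quadrature rule."""
    vals = np.array([phi(j, nodes) for j in range(m)])
    return (vals * weights) @ vals.T


m = 5
xl, wl = legendre.leggauss(20)    # Gauss-Legendre rule, weight 1
xc, wc = chebyshev.chebgauss(20)  # Gauss-Chebyshev rule, weight 1/sqrt(1 - x^2)
gram_legendre = gram(legendre_phi, xl, wl, m)
gram_chebyshev = gram(chebyshev_phi, xc, wc, m)
```

Both Gram matrices should come out as the identity up to rounding, since a 20-point Gauss rule integrates these low-degree products exactly.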

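5. Approximation by Fourier Coefficients
$$f(x) = \sum_{j \in \mathbb{N}_0^d} \hat{f}(j)\,\phi_j(x), \qquad \hat{f}(j) = \langle f, \phi_j \rangle, \qquad p_j = \|\phi_j\|_\infty$$
Suppose that we may observe the Fourier coefficients $\hat{f}(j)$ at a cost of \$1M each. (Eventually we want to consider the case of observing function values.) For any vector of non-negative constants, $\gamma = (\gamma_j)_{j \in \mathbb{N}_0^d}$, define the family of quasi-norms on $\mathcal{F}$:
$$\|f\|_{q,\gamma} = \left\| \left( \frac{|\hat{f}(j)|\,p_j}{\gamma_j} \right)_{j \in \mathbb{N}_0^d} \right\|_q, \qquad 0/0 = 0, \qquad \gamma_j = 0 \ \&\ \|f\|_{\infty,\gamma} < \infty \implies \hat{f}(j) = 0$$
Order the wavenumbers $j$ such that $\gamma_{j_1} \ge \gamma_{j_2} \ge \cdots$. The optimal approximation to $f$ given $n$ Fourier coefficients chosen optimally is
$$f_{\text{app}}(x) = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i}(x), \qquad \|f - f_{\text{app}}\|_\infty = \left\| \sum_{i=n+1}^{\infty} \hat{f}(j_i)\,\phi_{j_i} \right\|_\infty \le \|f - f_{\text{app}}\|_{1,1} \overset{\text{tight}}{\le} \|f\|_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i}$$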
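
A minimal sketch of the truncation rule and its error bound, under stated assumptions: the infinite index set is replaced by a finite candidate list, `fhat_scaled[i]` stands for $|\hat{f}(j)|\,p_j$ at the $i$-th candidate, and the function name and the example numbers are hypothetical.

```python
import numpy as np


def truncate_by_weights(gamma, fhat_scaled, n):
    """Keep the n wavenumbers with the largest gamma; return the kept indices,
    the error ||f - f_app||_{1,1}, and the bound ||f||_{inf,gamma} * tail sum."""
    gamma = np.asarray(gamma, dtype=float)
    fhat_scaled = np.asarray(fhat_scaled, dtype=float)
    order = np.argsort(-gamma, kind="stable")  # gamma_{j_1} >= gamma_{j_2} >= ...
    kept, dropped = order[:n], order[n:]
    # Convention 0/0 = 0; assumes fhat_scaled is zero wherever gamma is zero.
    ratios = np.divide(fhat_scaled, gamma, out=np.zeros_like(gamma), where=gamma > 0)
    norm_inf_gamma = ratios.max()
    err_11 = fhat_scaled[dropped].sum()
    bound = norm_inf_gamma * gamma[dropped].sum()
    return kept, err_11, bound


gamma = 2.0 ** -np.arange(10)                    # assumed geometric weights
fhat_scaled = gamma * np.linspace(1.0, 0.1, 10)  # made-up coefficient sizes
kept, err_11, bound = truncate_by_weights(gamma, fhat_scaled, n=3)
```

In this example the bound is not attained because the made-up coefficients decay faster than $\gamma$; it is attained when $|\hat{f}(j)|\,p_j = \|f\|_{\infty,\gamma}\,\gamma_j$ for every dropped $j$.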

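6. In What Sense Is This Optimal?
$$f(x) = \sum_{j \in \mathbb{N}_0^d} \hat{f}(j)\,\phi_j(x), \quad \hat{f}(j) = \langle f, \phi_j \rangle, \quad p_j = \|\phi_j\|_\infty, \quad \|f\|_{q,\gamma} = \left\| \left( \frac{|\hat{f}(j)|\,p_j}{\gamma_j} \right)_{j \in \mathbb{N}_0^d} \right\|_q$$
$$\gamma_{j_1} \ge \gamma_{j_2} \ge \cdots, \quad f_{\text{app}}(x) = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i}(x), \quad \|f - f_{\text{app}}\|_\infty \le \|f - f_{\text{app}}\|_{1,1} \overset{\text{tight}}{\le} \|f\|_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i}$$
For any other approximation, $g$, based on $n$ Fourier coefficients, $\hat{f}(j)$ with $j \in J$ and $J$ having cardinality $n$,
$$\|f - g\|_{1,1} = \sum_{j \in J} |\hat{f}(j) - \hat{g}(j)|\,p_j + \sum_{j \notin J} |\hat{f}(j) - \hat{g}(j)|\,p_j,$$
and the weights outside $J$ satisfy
$$\|f\|_{\infty,\gamma} \sum_{j \notin J} \gamma_j \ge \|f\|_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i},$$
because $j_1, \ldots, j_n$ carry the $n$ largest weights.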

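7. How Quickly Does Error Decay?
$$f(x) = \sum_{j \in \mathbb{N}_0^d} \hat{f}(j)\,\phi_j(x), \quad \hat{f}(j) = \langle f, \phi_j \rangle, \quad p_j = \|\phi_j\|_\infty, \quad \|f\|_{q,\gamma} = \left\| \left( \frac{|\hat{f}(j)|\,p_j}{\gamma_j} \right)_{j \in \mathbb{N}_0^d} \right\|_q$$
$$\gamma_{j_1} \ge \gamma_{j_2} \ge \cdots, \quad f_{\text{app}}(x) = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i}(x), \quad \|f - f_{\text{app}}\|_\infty \le \|f - f_{\text{app}}\|_{1,1} \le \|f\|_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i}$$
A trick that is often used ($q > 0$):
$$\gamma_{j_{n+1}} \le \left[ \frac{1}{n} \left( \gamma_{j_1}^{1/q} + \cdots + \gamma_{j_n}^{1/q} \right) \right]^q \le \frac{1}{n^q}\,\|\gamma\|_{1/q}, \qquad \|\gamma\|_{1/q} = \left[ \sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} \right]^q$$
$$\sum_{i=n+1}^{\infty} \gamma_{j_i} \le \|\gamma\|_{1/q} \sum_{i=n}^{\infty} \frac{1}{i^q} \le \frac{\|\gamma\|_{1/q}}{(q - 1)(n - 1)^{q-1}}$$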
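
The tail bound from this slide can be sanity-checked on a concrete weight sequence. This is a sketch under assumptions: the weights $\gamma_{j_i} = i^{-3}$, the truncation to 1000 terms, and the function name are all hypothetical, and the closed-form bound is only meaningful for $q > 1$ and $n \ge 2$.

```python
import numpy as np


def tail_and_bound(gamma, n, q):
    """Return the tail sum over i > n of the sorted weights, together with
    the bound ||gamma||_{1/q} / ((q - 1) (n - 1)^(q - 1))."""
    if q <= 1 or n < 2:
        raise ValueError("the closed-form bound needs q > 1 and n >= 2")
    g = np.sort(np.asarray(gamma, dtype=float))[::-1]  # non-increasing order
    tail = g[n:].sum()
    quasi_norm = np.sum(g ** (1.0 / q)) ** q           # ||gamma||_{1/q}
    return tail, quasi_norm / ((q - 1) * (n - 1) ** (q - 1))


gamma = np.arange(1, 1001, dtype=float) ** -3.0  # assumed weights, finite list
results = [tail_and_bound(gamma, n, q=2.0) for n in (2, 10, 100)]
```

For these weights the bound is loose, which is expected: the trick trades sharpness for a rate in $n$ that depends on $\gamma$ only through $\|\gamma\|_{1/q}$.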

8. Recap
$$f(x) = \sum_{j \in \mathbb{N}_0^d} \hat{f}(j)\,\phi_j(x), \quad \hat{f}(j) = \langle f, \phi_j \rangle, \quad p_j = \|\phi_j\|_\infty, \quad \|f\|_{q,\gamma} = \left\| \left( \frac{|\hat{f}(j)|\,p_j}{\gamma_j} \right)_{j \in \mathbb{N}_0^d} \right\|_q$$
(the dependence of $f$ on $d$ is hidden)
$$\gamma_{j_1} \ge \gamma_{j_2} \ge \cdots, \qquad f_{\text{app}}(x) = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i}(x),$$
$$\|f - f_{\text{app}}\|_\infty \le \|f - f_{\text{app}}\|_{1,1} \le \|f\|_{\infty,\gamma} \sum_{i=n+1}^{\infty} \gamma_{j_i} \le \frac{\|f\|_{\infty,\gamma}\,\|\gamma\|_{1/q}}{(q - 1)(n - 1)^{q-1}}$$
Want the right-hand side to be $\le \varepsilon$:
$$n = O\!\left( \left[ \frac{\|f\|_{\infty,\gamma}\,\|\gamma\|_{1/q}}{\varepsilon} \right]^{1/(q-1)} \right) \text{ is sufficient}$$
To succeed with $n = O(d)$, we need $\|\gamma\|_{1/q} = O(d^{q-1})$
Novak & Woźniakowski (2008); Kühn et al. (2014)

9. Product, Order, and Smoothness Dependent Weights
$$f(x) = \sum_{j \in \mathbb{N}_0^d} \hat{f}(j)\,\phi_j(x), \quad \hat{f}(j) = \langle f, \phi_j \rangle, \quad p_j = \|\phi_j\|_\infty, \quad \|f\|_{q,\gamma} = \left\| \left( \frac{|\hat{f}(j)|\,p_j}{\gamma_j} \right)_{j \in \mathbb{N}_0^d} \right\|_q$$
$$\sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} = O(d^{(q-1)/q}) \implies \|f - f_{\text{app}}\|_\infty \le \varepsilon \text{ for } n = O(d) \text{ if } \|f\|_{\infty,\gamma} < \infty$$
Experimental design assumes
Effect sparsity: Only a small number of effects are important
Effect hierarchy: Lower-order effects are more important than higher-order effects
Effect heredity: Interaction is active only if both parent effects are also active
Effect smoothness: Coarse horizontal scales are more important than fine horizontal scales
Consider product, order and smoothness dependent weights:
$$\gamma_j = \Gamma_{\|j\|_0} \prod_{\substack{\ell = 1 \\ j_\ell > 0}}^{d} w_\ell\, s_{j_\ell}, \qquad \Gamma_0 = w_0 = s_0 = 1,$$
$w_\ell$ = coordinate importance
$\Gamma_r$ = order size
$s_j$ = smoothness degree
Wu & Hamada (2000)

10. Product, Order, and Smoothness Dependent Weights
Effect sparsity: Only a small number of effects are important
Effect hierarchy: Lower-order effects are more important than higher-order effects
Effect heredity: Interaction is active only if both parent effects are also active
Effect smoothness: Coarse horizontal scales are more important than fine horizontal scales
Consider product, order and smoothness dependent weights:
$$\gamma_j = \Gamma_{\|j\|_0} \prod_{\substack{\ell = 1 \\ j_\ell > 0}}^{d} w_\ell\, s_{j_\ell}, \qquad \Gamma_0 = w_0 = s_0 = 1,$$
$w_\ell$ = coordinate importance
$\Gamma_r$ = order size
$s_j$ = smoothness degree
$$\sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} = \sum_{u \subseteq 1:d} \left[ \Gamma_{|u|}^{1/q} \left( \prod_{\ell \in u} w_\ell^{1/q} \right) \left( \sum_{j=1}^{\infty} s_j^{1/q} \right)^{|u|} \right] = O(d^{(q-1)/q})$$
$$\implies \|f - f_{\text{app}}\|_\infty \le \varepsilon \text{ for } n = O(d) \text{ if } \|f\|_{\infty,\gamma} < \infty$$
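
The subset-sum identity on this slide can be verified by brute force for a small case. The sketch below makes several assumptions that are not on the slide: the coordinate weight $w_\ell$ is applied only to active coordinates ($j_\ell > 0$), which is what the sum over $u$ requires; the smoothness weights are cut off after finitely many terms; and all names and numbers are hypothetical.

```python
import itertools

import numpy as np


def gamma_weight(j, Gamma, w, s):
    """gamma_j = Gamma_{||j||_0} * prod over active l of w_l * s_{j_l}."""
    active = [l for l, jl in enumerate(j) if jl > 0]
    val = Gamma[len(active)]
    for l in active:
        val *= w[l] * s[j[l]]
    return val


def sum_direct(Gamma, w, s, q):
    """Sum gamma_j^(1/q) over every j in {0, ..., len(s) - 1}^d."""
    d, m = len(w), len(s) - 1
    return sum(gamma_weight(j, Gamma, w, s) ** (1 / q)
               for j in itertools.product(range(m + 1), repeat=d))


def sum_by_subsets(Gamma, w, s, q):
    """The same sum, organised by the set u of active coordinates."""
    d = len(w)
    s_sum = sum(sj ** (1 / q) for sj in s[1:])
    total = 0.0
    for r in range(d + 1):
        for u in itertools.combinations(range(d), r):
            w_prod = np.prod([w[l] ** (1 / q) for l in u])  # empty product is 1
            total += Gamma[r] ** (1 / q) * w_prod * s_sum ** r
    return total


Gamma = [1.0, 1.0, 0.5, 0.25]   # Gamma_0, ..., Gamma_d with d = 3
w = [1.0, 0.5, 0.25]            # w_1, ..., w_d
s = [1.0, 0.5, 0.25, 0.125]     # s_0 = 1, s_1, s_2, s_3 (zero beyond)
direct = sum_direct(Gamma, w, s, q=2.0)
by_subsets = sum_by_subsets(Gamma, w, s, q=2.0)
```

The direct sum costs $(m+1)^d$ terms while the subset form costs $2^d$, so neither is practical for large $d$; the point here is only to confirm the algebra.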

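11. Special Cases of Weights
$$\sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} = \sum_{u \subseteq 1:d} \left[ \Gamma_{|u|}^{1/q} \left( \prod_{\ell \in u} w_\ell^{1/q} \right) \left( \sum_{j=1}^{\infty} s_j^{1/q} \right)^{|u|} \right], \qquad \text{want this to be } O(d^{(q-1)/q})$$
$\Gamma_r = w_\ell = 1$: $\displaystyle \sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} = \left[ \sum_{j=0}^{\infty} s_j^{1/q} \right]^d$ (Fail)
$w_\ell = \Gamma_1 = 1$, $\Gamma_r = 0 \ \forall r > 1$: $\displaystyle \sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} = 1 + d \sum_{j=1}^{\infty} s_j^{1/q}$ (Near Success)
$\Gamma_r = 1$: $\displaystyle \sum_{j \in \mathbb{N}_0^d} \gamma_j^{1/q} \le \exp\!\left( \sum_{k=1}^{\infty} w_k^{1/q} \sum_{j=1}^{\infty} s_j^{1/q} \right)$ (Success)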
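
To see how differently the three cases grow with $d$, the closed forms can be evaluated directly. This is a sketch with assumed inputs: $S$ denotes $\sum_{j \ge 1} s_j^{1/q}$ and is simply set to 1, the coordinate weights $w_k = k^{-4}$ are made up, and the function name is hypothetical.

```python
import math


def sum_gamma_special_cases(w, S, q):
    """Closed forms of sum_j gamma_j^(1/q) for the three cases on the slide.
    w: coordinate weights w_1..w_d; S: sum over j >= 1 of s_j^(1/q)."""
    d = len(w)
    all_ones = (1.0 + S) ** d                  # Gamma_r = w_l = 1
    order_one = 1.0 + d * S                    # w_l = Gamma_1 = 1, Gamma_r = 0 for r > 1
    product = math.prod(1.0 + wk ** (1 / q) * S for wk in w)  # Gamma_r = 1
    exp_bound = math.exp(S * sum(wk ** (1 / q) for wk in w))  # bound for Gamma_r = 1
    return all_ones, order_one, product, exp_bound


d = 20
w = [k ** -4.0 for k in range(1, d + 1)]  # assumed decaying coordinate weights
all_ones, order_one, product, exp_bound = sum_gamma_special_cases(w, S=1.0, q=2.0)
```

With these inputs the first case grows like $2^d$, the second linearly in $d$, and the third stays bounded as $d$ increases because $\sum_k w_k^{1/q}$ converges.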

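12. Algorithm When Both $\gamma$ and $\|f\|_{\infty,\gamma}$ Are Known
Require: $\gamma$ = vector of weights with ordering $\gamma_{j_1} \ge \gamma_{j_2} \ge \cdots$
$\hat{f}$ = black-box Fourier coefficient generator
$\|f\|_{\infty,\gamma}$ = norm of the Fourier coefficients
$\varepsilon$ = positive absolute error tolerance
Ensure: $\|f - f_{\text{app}}\|_\infty \le \varepsilon$
1: Let $n = \min\left\{ n' : \sum_{i=n'+1}^{\infty} \gamma_{j_i} \le \dfrac{\varepsilon}{\|f\|_{\infty,\gamma}} \right\}$
2: Compute $f_{\text{app}} = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i}$
Computational cost is $n = O\!\left( \left[ \varepsilon^{-1}\,\|f\|_{\infty,\gamma}\,\|\gamma\|_{1/q} \right]^{1/(q-1)} \right)$; $\gamma$ determines the design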
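
The two steps above can be sketched in one dimension. Everything concrete here is an assumption made for illustration: the weights $\gamma_j = 2^{-j}$, the finite list of 20 candidates, the generator `fhat`, and the use of unnormalised Chebyshev polynomials $T_j$ (so that $p_j = 1$ and $\|f\|_{\infty,\gamma} = 1$ for this example).

```python
import numpy as np
from numpy.polynomial import chebyshev


def choose_n(gamma_sorted, norm_f, eps):
    """Smallest n with sum_{i > n} gamma_{j_i} <= eps / norm_f (finite list)."""
    g = np.asarray(gamma_sorted, dtype=float)
    tails = np.append(np.cumsum(g[::-1])[::-1], 0.0)  # tails[n] = sum of g[n:]
    return int(np.argmax(tails <= eps / norm_f))      # index of the first True


def fhat(j):
    """Hypothetical black-box coefficient generator, relative to T_j."""
    return (-0.5) ** j


gamma = 0.5 ** np.arange(20)  # already in non-increasing order
n = choose_n(gamma, norm_f=1.0, eps=0.01)

coef_app = np.array([fhat(j) for j in range(n)])   # step 2: the n kept terms
coef_all = np.array([fhat(j) for j in range(20)])  # stands in for f itself
x = np.linspace(-1.0, 1.0, 201)
max_err = np.max(np.abs(chebyshev.chebval(x, coef_all)
                        - chebyshev.chebval(x, coef_app)))
```

The sample count `n` depends only on $\gamma$, $\|f\|_{\infty,\gamma}$, and $\varepsilon$, which is the sense in which $\gamma$ determines the design.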

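13. Algorithm When $\gamma$ Is Known and $\|f\|_{\infty,\gamma}$ Is Inferred
Require: $\gamma$ = vector of weights with ordering $\gamma_{j_1} \ge \gamma_{j_2} \ge \cdots$
$n_0$ = minimum number of wavenumbers
$C$ = inflation factor
$\hat{f}$ = black-box Fourier coefficient generator for the function of interest, $f$, where $\|f\|_{\infty,\gamma} \le C\,\big\| (\hat{f}(j_i))_{i=1}^{n_0} \big\|_{\infty,\gamma}$
$\varepsilon$ = positive absolute error tolerance
Ensure: $\|f - f_{\text{app}}\|_\infty \le \varepsilon$
1: Evaluate $\hat{f}(j_1), \ldots, \hat{f}(j_{n_0})$
2: Let $n = \min\left\{ n' > n_0 : \sum_{i=n'+1}^{\infty} \gamma_{j_i} \le \dfrac{\varepsilon}{C\,\big\| (\hat{f}(j_i))_{i=1}^{n_0} \big\|_{\infty,\gamma}} \right\}$
3: Compute $f_{\text{app}} = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i}$
Computational cost is $n = O\!\left( \left[ \varepsilon^{-1}\,C\,\|f\|_{\infty,\gamma}\,\|\gamma\|_{1/q} \right]^{1/(q-1)} \right)$; $\gamma$ determines the design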
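
A sketch of steps 1 and 2 when the norm is estimated from a pilot sample. Assumptions: the candidate list is finite, `fhat_scaled_pilot[i]` holds $|\hat{f}(j_i)|\,p_{j_i}$ for the first $n_0$ wavenumbers, the first $n_0$ weights are positive, and the function name and example numbers are hypothetical.

```python
import numpy as np


def choose_n_inferred(gamma_sorted, fhat_scaled_pilot, C, eps):
    """Return (n, pilot norm), where n is the smallest n' > n0 such that
    sum_{i > n'} gamma_{j_i} <= eps / (C * pilot norm)."""
    g = np.asarray(gamma_sorted, dtype=float)
    pilot = np.asarray(fhat_scaled_pilot, dtype=float)
    n0 = len(pilot)
    norm_pilot = np.max(pilot / g[:n0])               # assumes g[:n0] > 0
    tails = np.append(np.cumsum(g[::-1])[::-1], 0.0)  # tails[n] = sum of g[n:]
    ok = tails[n0 + 1:] <= eps / (C * norm_pilot)
    return n0 + 1 + int(np.argmax(ok)), norm_pilot


gamma = 0.5 ** np.arange(20)
pilot = 0.8 * gamma[:4]  # made-up pilot coefficient sizes, n0 = 4
n, norm_pilot = choose_n_inferred(gamma, pilot, C=1.25, eps=0.01)
```

The guarantee only holds for functions satisfying the cone condition $\|f\|_{\infty,\gamma} \le C\,\|(\hat{f}(j_i))_{i=1}^{n_0}\|_{\infty,\gamma}$; a function whose large coefficients all sit beyond the pilot sample would fool this rule.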

14. Algorithm When Both $\gamma$ and $\|f\|_{\infty,\gamma}$ Are Inferred
The order of sampling the Fourier coefficients is determined by $\gamma$, but in practice the relative sizes of the Fourier coefficients are not known, and thus $\gamma$ should be inferred. As a first step we try
$$\gamma_j = \Gamma_{\|j\|_0} \prod_{\substack{\ell = 1 \\ j_\ell > 0}}^{d} w_\ell\, s_{j_\ell}, \qquad \Gamma_0 = w_0 = s_0 = 1,$$
$w_\ell$ = coordinate importance
$\Gamma_r$ = order size
$s_j$ = smoothness degree
with the $\Gamma_r$ and $s_j$ fixed, but the $w_\ell$ inferred. We want to infer the relative importance of the different coordinates.

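15. Algorithm When Both $\gamma$ and $\|f\|_{\infty,\gamma}$ Are Inferred
Require: $\Gamma$ = vector of order sizes
$s$ = vector of smoothness degrees
$w^* = \max_k w_k$
$n_0$ = minimum number of wavenumbers in each coordinate
$C$ = inflation factor
$\hat{f}$ = a black-box Fourier coefficient generator for the function of interest, $f$, where $\|f\|_{\infty,\gamma} \le C\,\big\| (\hat{f}(j))_{j \in J} \big\|_{\infty,\gamma}$, $J := \{(0, \ldots, 0, j, 0, \ldots, 0) : j = 0, \ldots, n_0\}$, for all $\gamma$
$\varepsilon$ = positive absolute error tolerance
Ensure: $\|f - f_{\text{app}}\|_\infty \le \varepsilon$
1: Evaluate $\hat{f}(j)$ for $j \in J$
2: Define $w = \min \operatorname{argmin}_{w \le w^*} \big\| (\hat{f}(j))_{j \in J} \big\|_{\infty,\gamma}$
3: Let $n = \min\left\{ n' : \sum_{i=n'+1}^{\infty} \gamma_{j_i} \le \dfrac{\varepsilon}{C\,\big\| (\hat{f}(j))_{j \in J} \big\|_{\infty,\gamma}} \right\}$
4: Compute $f_{\text{app}} = \sum_{i=1}^{n} \hat{f}(j_i)\,\phi_{j_i}$
Computational cost is $n = O\!\left( \left[ \varepsilon^{-1}\,C\,\|f\|_{\infty,\gamma}\,\|\gamma\|_{1/q} \right]^{1/(q-1)} \right)$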

16. A Gap Between Theory and Practice
Theory: using Fourier coefficients
Practice: using function values
[Photo. Credit: Xinhua]

17. A Very Sparse Grid on $[-1, 1]^d$

| $j$ | 0 | 1 | 2 | 3 | 4 | $\cdots$ |
|---|---|---|---|---|---|---|
| van der Corput $t_j$ | 0 | 1/2 | 1/4 | 3/4 | 1/8 | $\cdots$ |
| $\psi(t_j) := 2\big((t_j + 1/3) \bmod 1\big) - 1$ | −1/3 | 2/3 | 1/6 | −5/6 | −1/12 | $\cdots$ |
| $\psi(t_j) := -\cos\big(\pi((t_j + 1/3) \bmod 1)\big)$ | −0.5 | 0.8660 | 0.2588 | −0.9659 | −0.1305 | $\cdots$ |

To estimate $\hat{f}(j)$, $j \in J$, use the design $\{(\psi(t_{j_1}), \ldots, \psi(t_{j_d})) : j \in J\}$. E.g., for
$J = \{(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1), (2, 0, 0, 0), (3, 0, 0, 0), (1, 1, 0, 0)\}$
[Figure: two panels, "Even Points" and "ArcCos Points"]
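
The table and the design can be reproduced with a few lines of code. This is a minimal sketch: the function names are hypothetical, and the base-2 radical inverse is assumed for the van der Corput sequence because it matches the tabulated values.

```python
import math


def van_der_corput(i, base=2):
    """Radical inverse of the non-negative integer i in the given base."""
    t, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        t += digit / denom
    return t


def psi_even(t):
    """Evenly spaced variant: 2((t + 1/3) mod 1) - 1."""
    return 2.0 * ((t + 1.0 / 3.0) % 1.0) - 1.0


def psi_arccos(t):
    """Arc-cosine variant: -cos(pi ((t + 1/3) mod 1))."""
    return -math.cos(math.pi * ((t + 1.0 / 3.0) % 1.0))


def design(J, psi):
    """One design point (psi(t_{j_1}), ..., psi(t_{j_d})) per wavenumber j in J."""
    return [tuple(psi(van_der_corput(jl)) for jl in j) for j in J]


J = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0),
     (0, 0, 0, 1), (2, 0, 0, 0), (3, 0, 0, 0), (1, 1, 0, 0)]
even_points = design(J, psi_even)
arccos_points = design(J, psi_arccos)
```

Each wavenumber contributes exactly one design point, so the number of function evaluations equals the number of coefficients being estimated.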

18. Algorithm Using Function Values When Both $\gamma$ and $\|f\|_{\infty,\gamma}$ Are Inferred
Require: $\Gamma$ = vector of order sizes
$s$ = vector of smoothness degrees
$w^* = \max_k w_k$
$n_0$ = minimum number of wavenumbers in each coordinate
$C$ = inflation factor
$f$ = a black-box function value generator
$\varepsilon$ = positive absolute error tolerance
Ensure: $\|f - f_{\text{app}}\|_\infty \le \varepsilon$
1: Approximate $\hat{f}(j)$ for $j \in J := \{(0, \ldots, 0, j, 0, \ldots, 0) : j = 1, \ldots, n_0\}$ by interpolating the function data $\{(x, f(x)) : x = (\psi(t_{j_1}), \ldots, \psi(t_{j_d})),\ j \in J\}$
2: Define $w = \min \operatorname{argmin}_{w \le w^*} \big\| (\hat{f}(j))_{j \in J} \big\|_{\infty,\gamma}$
3: while $C\,\big\| (\hat{f}(j))_{j \in J} \big\|_{\infty,\gamma} \sum_{j \notin J} \gamma_j > \varepsilon$ do
4: Add $\operatorname{argmax}_{j \notin J} \gamma_j$ to $J$
5: Approximate $\hat{f}(j)$ for $j \in J$ by interpolating the function data $\{(x, f(x)) : x = (\psi(t_{j_1}), \ldots, \psi(t_{j_d})),\ j \in J\}$
6: end while
7: Compute $f_{\text{app}} = \sum_{j \in J} \hat{f}(j)\,\phi_j$
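
The interpolation in steps 1 and 5 can be sketched as a square linear solve. This is one possible reading, not necessarily the authors' method: it assumes a basis of products of unnormalised Chebyshev polynomials $T_{j_\ell}(x_\ell)$, the arc-cosine variant of $\psi$, a base-2 van der Corput sequence, and an index set $J$ that is downward closed so that the system is uniquely solvable. All function names and the test function are hypothetical.

```python
import math

import numpy as np


def van_der_corput(i, base=2):
    """Radical inverse of the non-negative integer i in the given base."""
    t, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        t += digit / denom
    return t


def psi(t):
    """Arc-cosine variant of the node map: -cos(pi ((t + 1/3) mod 1))."""
    return -math.cos(math.pi * ((t + 1.0 / 3.0) % 1.0))


def basis(j, x):
    """Product over coordinates of T_{j_l}(x_l) = cos(j_l arccos x_l)."""
    return math.prod(math.cos(jl * math.acos(xl)) for jl, xl in zip(j, x))


def estimate_coefficients(f, J):
    """Interpolate f at the design tied to J and return one coefficient per j."""
    X = [tuple(psi(van_der_corput(jl)) for jl in j) for j in J]
    Phi = np.array([[basis(j, x) for j in J] for x in X])  # rows: points, cols: basis
    y = np.array([f(x) for x in X])
    return np.linalg.solve(Phi, y)


J = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0),
     (0, 0, 0, 1), (2, 0, 0, 0), (3, 0, 0, 0), (1, 1, 0, 0)]


def f_test(x):
    """Made-up test function lying in the span of the basis indexed by J."""
    return 1 + 2 * x[0] - x[1] + 0.5 * (2 * x[0] ** 2 - 1) + 3 * x[0] * x[1]


coef = estimate_coefficients(f_test, J)
```

For a function outside the span, the solve still returns coefficients, but they are contaminated by the omitted terms; that aliasing is part of the theory/practice gap the slides point to.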

19. What Needs Attention
Bridging the theory/practice gap
Try some examples
Bookkeeping on next largest $\gamma_j$
If $\hat{f}(j)/\gamma_j$ is observed to be too large, may need to increase $w_k$ for some $k$
May want to infer $\Gamma$ or $s$

20. Thank you
These slides are under continuous development
and are available at
speakerdeck.com/fjhickernell/samsi-qmc-wg-5-3-research-problem-1

21. References
Novak, E. & Woźniakowski, H. Tractability of Multivariate Problems Volume I: Linear
Information. EMS Tracts in Mathematics 6 (European Mathematical Society, Zürich, 2008).
Kühn, T., Sickel, W. & Ullrich, T. Approximation numbers of Sobolev embeddings—Sharp
constants and tractability. J. Complexity 30, 95–116 (2014).
Wu, C. F. J. & Hamada, M. Experiments: Planning, Analysis, and Parameter Design
Optimization. (John Wiley & Sons, Inc., New York, 2000).