# Right Ingredients for Adaptive Function Approximation March 05, 2020

## Transcript

1. The Right Ingredients for Adaptive Function Approximation
Fred J. Hickernell
Department of Applied Mathematics
Center for Interdisciplinary Scientific Computation
Illinois Institute of Technology
[email protected] mypages.iit.edu/~hickernell
with Sou-Cheng Choi, Yuhan Ding, Mac Hyman, Xin Tong, and the GAIL team
partially supported by NSF-DMS-1522687 and NSF-DMS-1638521 (SAMSI)
Thanks to Guohui Song for the invitation and hospitality
Old Dominion University, March 5, 2020

2. Introduction Univariate, Low Accuracy Multivariate, Reproducing Kernel Hilbert Space Summary References
The Guaranteed Automatic Integration Library (GAIL) and QMCPy Teams
Sou-Cheng Choi (Chief Data Scientist, Kamakura)
Yuhan Ding (IIT PhD ’15, Lecturer, IIT)
Lan Jiang (IIT PhD ’16, Compass)
Lluís Antoni Jiménez Rugama (IIT PhD ’17, UBS)
Jagadeeswaran Rathinavel (IIT PhD ’19, Wi-Tronix)
Aleksei Sorokin (IIT BS + MAS ’21 exp.)
Tong Xin (IIT MS, UIC PhD ’20 exp.)
Kan Zhang (IIT PhD ’20 exp.)
Yizhi Zhang (IIT PhD ’18, Jamran Int’l)
Xuan Zhou (IIT PhD ’15, JP Morgan)
and others
Adaptive software libraries GAIL and QMCPy

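5. Problem
Given a black-box function routine $f : \mathcal{X} \subseteq \mathbb{R}^d \to \mathbb{R}$, e.g., the output of a computer simulation
Expensive cost of a function value, \$(f)
Want a fixed tolerance algorithm $\mathrm{ALG} : \mathcal{C} \times (0, \infty) \to L^\infty(\mathcal{X})$ such that
$$\lVert f - \mathrm{ALG}(f, \varepsilon) \rVert_\infty \le \varepsilon \qquad \forall f \in \mathcal{C} \text{ (candidate set)}$$
Cheap cost of an $\mathrm{ALG}(f, \varepsilon)$ value, e.g., a spline
Design or node array $X \in \mathcal{X}^n \subseteq \mathbb{R}^{n \times d}$, function data $y = f(X) \in \mathbb{R}^n$
$$x_{n+1} = \operatorname*{argmax}_{x \in \mathcal{X}} \mathrm{ACQ}(x, X, y) \qquad \text{acquisition function}$$
$$\lVert f - \mathrm{APP}(X, y) \rVert_\infty \le \mathrm{ERR}(X, y) \qquad \text{data-driven error bound } \forall n \in \mathbb{N},\ f \in \mathcal{C}$$
$$n^* = \min \{ n \in \mathbb{N} : \mathrm{ERR}(X, y) \le \varepsilon \} \qquad \text{stopping criterion}$$
$$\mathrm{ALG}(f, \varepsilon) = \mathrm{APP}(X, y) \qquad \text{fixed budget approximation for this } n^*$$
Adaptive sample size, design, and fixed budget approximation
Assumes that what you see is almost what you get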

7. Linear Splines
$f : [a, b] \to \mathbb{R}$
Data sites $a =: x_0 < x_1 < \cdots < x_n := b$, $X = (x_i)_{i=0}^n$
Function data $y = f(X)$
Linear spline
$$\mathrm{APP}(X, y) := \frac{x - x_i}{x_{i-1} - x_i}\, y_{i-1} + \frac{x - x_{i-1}}{x_i - x_{i-1}}\, y_i, \qquad x_{i-1} \le x \le x_i, \quad i \in 1{:}n$$
$$\lVert f - \mathrm{APP}(X, y) \rVert_{\infty, [x_{i-1}, x_i]} \le \frac{(x_i - x_{i-1})^2\, \lVert f'' \rVert_{\infty, [x_{i-1}, x_i]}}{8}, \qquad i \in 1{:}n, \quad f \in W^{2,\infty}$$
Numerical analysis often stops here, leaving unanswered questions:
How big should $n$ be to make $\lVert f - \mathrm{APP}(X, y) \rVert_\infty \le \varepsilon$?
How big is $\lVert f'' \rVert_{\infty, [x_{i-1}, x_i]}$?
How best to choose $X$?
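
The error bound and the first unanswered question can be checked numerically. The following is a minimal Python sketch, not taken from the talk: the test function $f = \sin$ on $[0, \pi]$, the helper name `linear_spline`, and the dense grid used to estimate the sup norm are all illustrative assumptions. It also shows that a sample size can only be computed in advance when $\lVert f'' \rVert_\infty$ is known.

```python
import math
import numpy as np

def linear_spline(X, y):
    """Return APP(X, y), the piecewise-linear interpolant of the data (X, y)."""
    return lambda x: np.interp(x, X, y)

a, b, n = 0.0, math.pi, 16
X = np.linspace(a, b, n + 1)          # data sites x_0 < ... < x_n
y = np.sin(X)                         # function data y = f(X)
app = linear_spline(X, y)

# Estimate the sup-norm error on a dense grid (an approximation of the true sup).
x_fine = np.linspace(a, b, 20001)
observed_error = float(np.max(np.abs(np.sin(x_fine) - app(x_fine))))

# Here |f''| = |sin| <= 1 on [0, pi], so the bound is width^2 / 8.
second_deriv_bound = 1.0
error_bound = (X[1] - X[0]) ** 2 * second_deriv_bound / 8.0

# If ||f''||_inf were known, equally spaced sites would need
# n >= (b - a) * sqrt(||f''||_inf / (8 * eps)) to reach tolerance eps.
eps = 1e-3
n_needed = math.ceil((b - a) * math.sqrt(second_deriv_bound / (8.0 * eps)))

print(observed_error, error_bound, n_needed)
```

For a black-box $f$ the value `second_deriv_bound` is not available, which is what motivates the data-based bounds on the following slides.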

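9. Linear Splines Error
$f : [a, b] \to \mathbb{R}$
Data sites $a =: x_0 < x_1 < \cdots < x_n := b$, $X = (x_i)_{i=0}^n$
Function data $y = f(X)$
Linear spline
$$\mathrm{APP}(X, y) := \frac{x - x_i}{x_{i-1} - x_i}\, y_{i-1} + \frac{x - x_{i-1}}{x_i - x_{i-1}}\, y_i, \qquad x_{i-1} \le x \le x_i, \quad i \in 1{:}n$$
$$\lVert f - \mathrm{APP}(X, y) \rVert_{\infty, [x_{i-1}, x_i]} \le \frac{1}{8} (x_i - x_{i-1})^2\, \lVert f'' \rVert_{\infty, [x_{i-1}, x_i]}, \qquad i \in 1{:}n, \quad f \in W^{2,\infty}$$
$$\lVert f'' \rVert_{-\infty, [x_{i-1}, x_{i+1}]} \le \underbrace{\left| \frac{\dfrac{y_{i+1} - y_i}{x_{i+1} - x_i} - \dfrac{y_i - y_{i-1}}{x_i - x_{i-1}}}{(x_{i+1} - x_{i-1})/2} \right|}_{D_i(X, y) \,=\, 2 \lvert f[x_{i-1}, x_i, x_{i+1}] \rvert} \le \lVert f'' \rVert_{\infty, [x_{i-1}, x_{i+1}]}$$
$D_i(X, y)$ is data based: the absolute 2nd derivative of the interpolating polynomial
Candidate set
$$\mathcal{C} := \bigl\{ f \in W^{2,\infty} : \lvert f''(x) \rvert \le \max\bigl( C(h_-) \lvert f''(x - h_-) \rvert,\ C(h_+) \lvert f''(x + h_+) \rvert \bigr),\ 0 < h_\pm < \mathfrak{h},\ a < x < b \bigr\}$$
Inflation factor $C(h) := \dfrac{C_0 \mathfrak{h}}{\mathfrak{h} - h}$
$\lvert f'' \rvert$ does not change abruptly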
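
The data-based quantity $D_i(X, y)$ can be computed directly from second divided differences. The Python sketch below is an illustration under stated assumptions: the function name `divided_difference_bound`, the unequally spaced sites, and the two test functions are hypothetical choices, not part of the talk.

```python
import numpy as np

def divided_difference_bound(X, y):
    """Return D_i(X, y) = 2 |f[x_{i-1}, x_i, x_{i+1}]| for i = 1, ..., n - 1.

    Entry k of the result corresponds to i = k + 1.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    slopes = np.diff(y) / np.diff(X)                   # first divided differences
    return np.abs(np.diff(slopes)) / ((X[2:] - X[:-2]) / 2.0)

# Unequally spaced data sites (hypothetical values).
X = np.array([0.0, 0.1, 0.4, 1.0, 1.5, 3.0])

D_quadratic = divided_difference_bound(X, X ** 2)      # f'' = 2 everywhere
D_sine = divided_difference_bound(X, np.sin(X))        # |f''| = |sin| <= 1

print(D_quadratic, D_sine)
```

For the quadratic, every $D_i$ equals $\lvert f'' \rvert = 2$; for the sine, every $D_i$ is no larger than $\lVert f'' \rVert_\infty = 1$, consistent with the two-sided inequality on the slide.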

10. Linear Splines Error
$f : [a, b] \to \mathbb{R}$
Data sites $a =: x_0 < x_1 < \cdots < x_n := b$, $X = (x_i)_{i=0}^n$
Function data $y = f(X)$
$$\lVert f - \mathrm{APP}(X, y) \rVert_{\infty, [x_{i-1}, x_i]} \le \frac{1}{8} (x_i - x_{i-1})^2\, \lVert f'' \rVert_{\infty, [x_{i-1}, x_i]} \le \max_{\pm} \mathrm{ERR}_{i,\pm}(X, y), \qquad i \in 1{:}n, \quad f \in \mathcal{C}$$
Candidate set
$$\mathcal{C} := \bigl\{ f \in W^{2,\infty} : \lvert f''(x) \rvert \le \max\bigl( C(h_-) \lvert f''(x - h_-) \rvert,\ C(h_+) \lvert f''(x + h_+) \rvert \bigr),\ 0 < h_\pm < \mathfrak{h},\ a < x < b \bigr\}$$
Inflation factor $C(h) := \dfrac{C_0 \mathfrak{h}}{\mathfrak{h} - h}$
$\lvert f'' \rvert$ does not change abruptly
$$\mathrm{ERR}_{i,-}(X, y) = \frac{1}{8} (x_i - x_{i-1})^2\, C(x_i - x_{i-3})\, D_{i-2}(X, y)$$
$$\mathrm{ERR}_{i,+}(X, y) = \frac{1}{8} (x_i - x_{i-1})^2\, C(x_{i+2} - x_{i-1})\, D_{i+1}(X, y)$$
$D_i(X, y) = 2 \lvert f[x_{i-1}, x_i, x_{i+1}] \rvert$, data based, the absolute 2nd derivative of the interpolating polynomial

14. Given $n_{\text{init}} \ge 4$, $C_0 \ge 1$:
$$\mathfrak{h} = \frac{3(b - a)}{n_{\text{init}} - 1}, \qquad C(h) = \frac{C_0 \mathfrak{h}}{\mathfrak{h} - h}, \qquad n = n_{\text{init}}, \qquad x_i = a + i(b - a)/n$$
Step 1. Compute the data-based $\mathrm{ERR}_{i,\pm}(X, y)$ for $i = 1, \ldots, n$.
Step 2. Construct $\mathcal{I}$, the index set of subintervals that might be split:
$$\mathcal{I} = \{ i \in 1{:}n : \mathrm{ERR}_{i \pm j, \mp}(X, y) > \varepsilon,\ j = 0, 1, 2 \}$$
Step 3. If $\mathcal{I} = \emptyset$, return $\mathrm{ALG}(f, \varepsilon) = \mathrm{APP}(X, y)$ as the approximation satisfying the error tolerance. Otherwise split those intervals in $\mathcal{I}$ with the largest width and go to Step 1 (acquisition function).
Choi, S.-C. T., Ding, Y., H., F. J. & Tong, X. Local Adaption for Approximation and Minimization of Univariate Functions. J. Complexity 40, 17–33 (2017).
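
Steps 1 to 3 can be sketched in Python as follows. This is a simplified illustration, not the GAIL implementation: the names `err_bounds` and `adaptive_linear_spline`, the defaults `n_init=20` and `c0=10.0`, and the iteration cap are hypothetical. Two further assumptions are made where the slide is silent: near the endpoints, a one-sided bound that would need data outside $[a, b]$ is ignored, and an interval is flagged when any of the bounds it contributes to exceeds $\varepsilon$.

```python
import numpy as np

def err_bounds(X, y, c0, h_frak):
    """Return (ERR_{i,-}, ERR_{i,+}) for i = 1..n; entry k holds interval i = k + 1.

    Bounds needing data outside [a, b] are set to -inf (a sketch-level choice).
    """
    n = len(X) - 1
    width = np.diff(X)                                        # x_i - x_{i-1}
    slopes = np.diff(y) / width
    D = np.abs(np.diff(slopes)) / ((X[2:] - X[:-2]) / 2.0)    # D[k] is D_{k+1}

    def inflation(h):
        return c0 * h_frak / (h_frak - h)                     # C(h)

    err_minus = np.full(n, -np.inf)
    err_plus = np.full(n, -np.inf)
    i = np.arange(3, n + 1)              # ERR_{i,-} uses D_{i-2}, so i >= 3
    err_minus[i - 1] = width[i - 1] ** 2 / 8.0 * inflation(X[i] - X[i - 3]) * D[i - 3]
    i = np.arange(1, n - 1)              # ERR_{i,+} uses D_{i+1}, so i <= n - 2
    err_plus[i - 1] = width[i - 1] ** 2 / 8.0 * inflation(X[i + 2] - X[i - 1]) * D[i]
    return err_minus, err_plus

def adaptive_linear_spline(f, a, b, eps, n_init=20, c0=10.0, max_iter=1000):
    """Return data sites X and data y whose linear spline meets the data-based bound."""
    h_frak = 3.0 * (b - a) / (n_init - 1)
    X = np.linspace(a, b, n_init + 1)
    y = f(X)
    for _ in range(max_iter):
        n = len(X) - 1
        err_minus, err_plus = err_bounds(X, y, c0, h_frak)    # Step 1
        bad_minus, bad_plus = err_minus > eps, err_plus > eps
        flagged = np.zeros(n, dtype=bool)                     # Step 2
        for j in range(3):
            flagged[: n - j] |= bad_minus[j:]    # interval i enters ERR_{i+j,-}
            flagged[j:] |= bad_plus[: n - j]     # interval i enters ERR_{i-j,+}
        if not flagged.any():                                 # Step 3
            return X, y
        width = np.diff(X)
        split = flagged & (width >= 0.99 * width[flagged].max())
        mid = 0.5 * (X[:-1][split] + X[1:][split])
        X_all = np.concatenate([X, mid])
        y_all = np.concatenate([y, f(mid)])      # f is evaluated only at new sites
        order = np.argsort(X_all)
        X, y = X_all[order], y_all[order]
    raise RuntimeError("tolerance not reached within max_iter iterations")

X, y = adaptive_linear_spline(np.exp, 0.0, 2.0, eps=1e-3)
x_fine = np.linspace(0.0, 2.0, 5001)
print(len(X), np.max(np.abs(np.exp(x_fine) - np.interp(x_fine, X, y))))
```

The usage lines apply the sketch to $f = \exp$ on $[0, 2]$, a function whose second derivative varies slowly enough to lie in the cone for these parameter choices.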

23. Highlights of Adaptive Linear Spline Algorithm
Defined for a cone candidate set, $\mathcal{C}$, whose definition does not depend on the algorithm
Guaranteed to succeed for all $f \in \mathcal{C}$
Candidate set $\mathcal{C}$ excludes spikes, i.e., two nearby inflection points
$\mathcal{C}$ formalizes what you see is almost what you get
Impossible to have an algorithm for all $f \in W^{2,\infty}$, since $W^{2,\infty}$ contains arbitrarily large functions that look like $0$
Adaptive algorithms do not help for ball candidate sets $\mathcal{C} = \{ f : \lVert f'' \rVert_\infty \le R \}$
$\mathrm{cost}(\mathrm{ALG}, f, \varepsilon, \mathcal{C}) \asymp \sqrt{\dfrac{C_0 \lVert f'' \rVert_{1/2}}{\varepsilon}} \asymp \mathrm{comp}(f, \varepsilon, \mathcal{C})$, optimal
Does not allow for smoothness to be inferred
Not multivariate
Choi, S.-C. T., Ding, Y., H., F. J. & Tong, X. Local Adaption for Approximation and Minimization of Univariate Functions. J. Complexity 40, 17–33 (2017).
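
The impossibility claim for all of $W^{2,\infty}$ can be illustrated with a fooling function. In the Python sketch below, the sites `X`, the name `fooling_function`, and the amplitudes are hypothetical. The idea is that whatever sites an algorithm uses when it sees only zeros, a smooth function of any size can vanish at exactly those sites.

```python
import numpy as np

n = 10
X = np.linspace(0.0, 1.0, n + 1)      # data sites an algorithm happened to use

def fooling_function(x, amplitude):
    """A smooth function in W^{2,inf} that vanishes at every data site in X."""
    return amplitude * np.sin(n * np.pi * x) ** 2

x_fine = np.linspace(0.0, 1.0, 10001)
for amplitude in (1.0, 1e3, 1e6):
    data = fooling_function(X, amplitude)                     # what the algorithm sees
    size = np.max(np.abs(fooling_function(x_fine, amplitude)))
    print(amplitude, np.max(np.abs(data)), size)
```

The data are zero up to round-off for every amplitude, so any algorithm must return the same approximation for all of them. Such functions have two nearby inflection points between neighboring sites, which is the kind of spike the cone $\mathcal{C}$ excludes.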

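27. Approximation via Reproducing Kernel Hilbert Spaces (RKHSs)
$\mathcal{F}$ is a Hilbert space with reproducing kernel $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$
$K(X, X)$ positive definite $\forall X$
$K(\cdot, x) \in \mathcal{F}$, $f(x) = \langle K(\cdot, x), f \rangle_{\mathcal{F}}$ $\forall x \in \mathcal{X}$
e.g., $K(t, x) = (1 + \lVert t - x \rVert_2) \exp(-\lVert t - x \rVert_2)$, Matérn
Optimal (minimum norm) interpolant is
$$\mathrm{APP}(X, y) = K(\cdot, X) \bigl( K(X, X) \bigr)^{-1} y, \qquad y = f(X)$$
$$\begin{aligned} \lVert f - \mathrm{APP}(X, y) \rVert_\infty^2 &\le \underbrace{\bigl\lVert K(\cdot, \cdot) - K(\cdot, X) \bigl( K(X, X) \bigr)^{-1} K(X, \cdot) \bigr\rVert_\infty}_{\text{known}}\, \lVert f - \mathrm{APP}(X, y) \rVert_{\mathcal{F}}^2 \\ &\le \bigl\lVert K(\cdot, \cdot) - K(\cdot, X) \bigl( K(X, X) \bigr)^{-1} K(X, \cdot) \bigr\rVert_\infty\, \frac{C^2(X)}{1 - C^2(X)}\, \lVert \mathrm{APP}(X, y) \rVert_{\mathcal{F}}^2 =: \mathrm{ERR}^2(X, y) \end{aligned}$$
Candidate set $\mathcal{C} = \bigl\{ f \in \mathcal{F} : \lVert f - \mathrm{APP}(X, y) \rVert_{\mathcal{F}} \le C(X) \lVert f \rVert_{\mathcal{F}} \bigr\}$
Fasshauer, G. E. Meshfree Approximation Methods with MATLAB. (World Scientific Publishing Co., Singapore, 2007).
Fasshauer, G. E. & McCourt, M. Kernel-based Approximation Methods using MATLAB. (World Scientific Publishing Co., Singapore, 2015).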
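
The minimum norm interpolant and the known kernel factor in the error bound can be computed with a few linear solves. The Python sketch below uses the Matérn kernel from the slide. The function names, the five data sites, and the test function are illustrative assumptions, and a direct solve is used for brevity even though it becomes ill-conditioned for many data sites.

```python
import numpy as np

def matern(T, X):
    """K(t, x) = (1 + ||t - x||_2) exp(-||t - x||_2) for rows t of T and x of X."""
    r = np.linalg.norm(T[:, None, :] - X[None, :, :], axis=-1)
    return (1.0 + r) * np.exp(-r)

def rkhs_interpolant(X, y):
    """Return APP(X, y) as a callable, and ||APP(X, y)||_F^2 = y^T K(X, X)^{-1} y."""
    coef = np.linalg.solve(matern(X, X), y)
    return (lambda T: matern(T, X) @ coef), float(y @ coef)

def power_function_sq(T, X):
    """K(t, t) - K(t, X) K(X, X)^{-1} K(X, t) at each row t of T."""
    KtX = matern(T, X)
    sol = np.linalg.solve(matern(X, X), KtX.T)        # shape (n, m)
    return 1.0 - np.sum(KtX * sol.T, axis=1)          # K(t, t) = 1 for this kernel

# Hypothetical data: five sites in the unit square.
f = lambda P: np.cos(P[:, 0] + P[:, 1]) * np.exp(P[:, 0] * P[:, 1])
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
y = f(X)
app, app_norm_sq = rkhs_interpolant(X, y)

T = np.array([[0.25, 0.25], [0.5, 0.0], [0.75, 0.9]])
print(app(T), f(T), power_function_sq(T, X), app_norm_sq)
```

The power function vanishes at the data sites and is largest far from them, which is why it reappears in the acquisition function.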

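29. Error and Acquisition for Optimal RKHS Approximation
$\mathcal{F}$ is a Hilbert space with reproducing kernel $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$
e.g., $K(t, x) = (1 + \lVert t - x \rVert_2) \exp(-\lVert t - x \rVert_2)$, Matérn
$$\mathrm{APP}(X, y) = K(\cdot, X) \bigl( K(X, X) \bigr)^{-1} y, \qquad y = f(X)$$
Candidate set $\mathcal{C} = \bigl\{ f \in \mathcal{F} : \lVert f - \mathrm{APP}(X, y) \rVert_{\mathcal{F}} \le C(X) \lVert f \rVert_{\mathcal{F}} \bigr\}$
$$\begin{aligned} \lVert f - \mathrm{APP}(X, y) \rVert_\infty^2 &\le \underbrace{\bigl\lVert K(\cdot, \cdot) - K(\cdot, X) \bigl( K(X, X) \bigr)^{-1} K(X, \cdot) \bigr\rVert_\infty}_{\text{known}}\, \lVert f - \mathrm{APP}(X, y) \rVert_{\mathcal{F}}^2 \\ &\le \bigl\lVert K(\cdot, \cdot) - K(\cdot, X) \bigl( K(X, X) \bigr)^{-1} K(X, \cdot) \bigr\rVert_\infty\, \frac{C^2(X)}{1 - C^2(X)}\, \lVert \mathrm{APP}(X, y) \rVert_{\mathcal{F}}^2 =: \mathrm{ERR}^2(X, y) \end{aligned}$$
$$\mathrm{ACQ}(x, X, y) := \Bigl[ K(x, x) - K(x, X) \bigl( K(X, X) \bigr)^{-1} K(X, x) \Bigr] \frac{C^2(X)}{1 - C^2(X)}\, \underbrace{\lVert \mathrm{APP}(X, y) \rVert_{\mathcal{F}}^2}_{y^T (K(X, X))^{-1} y}$$
$$x_{n+1} = \operatorname*{argmax}_{x \in \mathcal{X}} \mathrm{ACQ}(x, X, y) \qquad \text{acquisition function}$$
$$\mathrm{ALG}(f, \varepsilon) = \mathrm{APP}(X, y) \text{ for } n^* = \min \{ n \in \mathbb{N} : \mathrm{ERR}(X, y) \le \varepsilon \} \qquad \text{stopping criterion}$$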
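
The acquisition and stopping rules above can be combined into a greedy loop. The Python sketch below is an illustration under several assumptions that the slide does not specify: $C(X)$ is replaced by a fixed constant `c_const`, the maximization and the sup norm are taken over a finite candidate grid, and the function name, test function, and one-dimensional domain are hypothetical choices made to keep the example small.

```python
import numpy as np

def matern(T, X):
    """K(t, x) = (1 + ||t - x||_2) exp(-||t - x||_2) for rows t of T and x of X."""
    r = np.linalg.norm(T[:, None, :] - X[None, :, :], axis=-1)
    return (1.0 + r) * np.exp(-r)

def adaptive_rkhs(f, X_init, candidates, eps, c_const=0.5, max_n=100):
    """Add sites greedily by ACQ until ERR(X, y) <= eps on the candidate grid."""
    X = np.array(X_init, dtype=float)
    y = f(X)
    inflate = c_const ** 2 / (1.0 - c_const ** 2)       # C^2 / (1 - C^2), C assumed constant
    while True:
        K = matern(X, X)
        coef = np.linalg.solve(K, y)                    # K(X, X)^{-1} y
        app_norm_sq = float(y @ coef)                   # ||APP(X, y)||_F^2
        KcX = matern(candidates, X)
        power_sq = 1.0 - np.sum(KcX * np.linalg.solve(K, KcX.T).T, axis=1)
        acq = np.maximum(power_sq, 0.0) * inflate * app_norm_sq   # ACQ on the grid
        err = float(np.sqrt(acq.max()))                 # ERR(X, y), sup over the grid
        if err <= eps or len(X) >= max_n:
            return X, y, coef, err
        x_new = candidates[np.argmax(acq)][None, :]     # x_{n+1} = argmax ACQ
        X = np.vstack([X, x_new])
        y = np.append(y, f(x_new))

f = lambda P: np.cos(3.0 * P[:, 0]) * np.exp(P[:, 0])
candidates = np.linspace(0.0, 1.0, 201)[:, None]
X, y, coef, err = adaptive_rkhs(f, [[0.0], [0.5], [1.0]], candidates, eps=0.05)
print(len(X), err)
```

With a constant $C$, the acquisition function is proportional to the power function, so new sites go where the kernel says the data are least informative. The returned `err` bounds the true error only if $f$ lies in the cone for the chosen constant.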

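30. Must Infer Kernel from $y = f(X)$
$\mathcal{F}_\theta$ is a Hilbert space with reproducing kernel $K_\theta$
$$\mathcal{C} = \Bigl\{ f \in \bigcup_\theta \mathcal{F}_\theta : \lVert f - \mathrm{APP}(X, y) \rVert_{\mathcal{F}_{\theta^*}} \le C(X) \lVert f \rVert_{\mathcal{F}_{\theta^*}} \ \forall X,\ y = f(X),\ \theta^*(X, y) \text{ given below} \Bigr\}$$
e.g., $K_\theta(t, x) = (1 + \lVert \theta (t - x) \rVert_2) \exp(-\lVert \theta (t - x) \rVert_2)$
Choose the $\theta$ (inspired by empirical Bayes) by minimizing the volume of the ellipsoid in $\mathbb{R}^n$ of function data yielding interpolants with no greater norm than that observed:
$$\theta^* = \operatorname*{argmin}_\theta \left[ \frac{1}{n} \log \det(K_\theta) + \log\bigl( y^T K_\theta^{-1} y \bigr) \right]$$
$$\lVert f - \mathrm{APP}(X, y) \rVert_\infty^2 \le \bigl\lVert K(\cdot, \cdot) - K(\cdot, X) \bigl( K(X, X) \bigr)^{-1} K(X, \cdot) \bigr\rVert_\infty\, \frac{C^2(X)}{1 - C^2(X)}\, y^T \bigl( K(X, X) \bigr)^{-1} y =: \mathrm{ERR}^2(X, y)$$
$$\mathrm{ACQ}(x, X, y) := \Bigl[ K(x, x) - K(x, X) \bigl( K(X, X) \bigr)^{-1} K(X, x) \Bigr] \frac{C^2(X)}{1 - C^2(X)}\, y^T \bigl( K(X, X) \bigr)^{-1} y$$
$$x_{n+1} = \operatorname*{argmax}_{x \in \mathcal{X}} \mathrm{ACQ}(x, X, y) \qquad \text{acquisition function}$$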
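
The criterion for $\theta^*$ can be evaluated with a log-determinant and one linear solve. The Python sketch below assumes a scalar $\theta$ and replaces the minimization by a search over a hypothetical grid. The function names, data sites, and test function are illustrative, not from the talk.

```python
import numpy as np

def matern_theta(T, X, theta):
    """K_theta(t, x) = (1 + theta ||t - x||_2) exp(-theta ||t - x||_2), scalar theta assumed."""
    r = theta * np.linalg.norm(T[:, None, :] - X[None, :, :], axis=-1)
    return (1.0 + r) * np.exp(-r)

def objective(theta, X, y):
    """(1/n) log det(K_theta) + log(y^T K_theta^{-1} y), with K_theta = K_theta(X, X)."""
    K = matern_theta(X, X, theta)
    _, logdet = np.linalg.slogdet(K)         # K is positive definite, so the sign is +1
    quad = float(y @ np.linalg.solve(K, y))
    return logdet / len(y) + np.log(quad)

def choose_theta(X, y, grid):
    """Return the grid value minimizing the objective, and all objective values."""
    values = np.array([objective(t, X, y) for t in grid])
    return grid[np.argmin(values)], values

# Hypothetical data: ten equally spaced sites on [0, 1].
X = np.linspace(0.0, 1.0, 10)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0])
grid = np.logspace(-0.5, 1.5, 21)
theta_star, values = choose_theta(X, y, grid)
print(theta_star, values.min())
```

Rescaling $y$ by a constant shifts the objective by a constant, so the selected $\theta$ does not depend on the scale of the data.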

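34. Must Infer Kernel from $y = f(X)$
$\mathcal{F}_\theta$ is a Hilbert space with reproducing kernel $K_\theta$
$$\mathcal{C} = \Bigl\{ f \in \bigcup_\theta \mathcal{F}_\theta : \lVert f - \mathrm{APP}(X, y) \rVert_{\mathcal{F}_{\theta^*}} \le C(X) \lVert f \rVert_{\mathcal{F}_{\theta^*}} \ \forall X,\ y = f(X),\ \theta^*(X, y) \text{ given below} \Bigr\}$$
e.g., $K_\theta(t, x) = \exp\bigl( b^T (t + x) \bigr) \times (1 + \lVert a (t - x) \rVert_2) \exp(-\lVert a (t - x) \rVert_2)$, $\theta = (a, b)$
Choose the $\theta$ (inspired by empirical Bayes) by minimizing the volume of the ellipsoid in $\mathbb{R}^n$ of function data yielding interpolants with no greater norm than that observed:
$$\theta^* = \operatorname*{argmin}_\theta \left[ \frac{1}{n} \log \det(K_\theta) + \log\bigl( y^T K_\theta^{-1} y \bigr) \right]$$
$$\lVert f - \mathrm{APP}(X, y) \rVert_\infty^2 \le \bigl\lVert K(\cdot, \cdot) - K(\cdot, X) \bigl( K(X, X) \bigr)^{-1} K(X, \cdot) \bigr\rVert_\infty\, \frac{C^2(X)}{1 - C^2(X)}\, y^T \bigl( K(X, X) \bigr)^{-1} y =: \mathrm{ERR}^2(X, y)$$
$$\mathrm{ACQ}(x, X, y) := \Bigl[ K(x, x) - K(x, X) \bigl( K(X, X) \bigr)^{-1} K(X, x) \Bigr] \frac{C^2(X)}{1 - C^2(X)}\, y^T \bigl( K(X, X) \bigr)^{-1} y$$
$$x_{n+1} = \operatorname*{argmax}_{x \in \mathcal{X}} \mathrm{ACQ}(x, X, y) \qquad \text{acquisition function}$$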
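
The modified Matérn kernel with $\theta = (a, b)$ is the plain Matérn kernel multiplied by $\exp(b^T t) \exp(b^T x)$, so it stays symmetric and positive definite. The Python sketch below checks this numerically, assuming a scalar $a$ and a vector $b$. The function name and all parameter values are hypothetical.

```python
import numpy as np

def modified_matern(T, X, a, b):
    """K_theta(t, x) = exp(b^T (t + x)) (1 + a ||t - x||_2) exp(-a ||t - x||_2)."""
    r = a * np.linalg.norm(T[:, None, :] - X[None, :, :], axis=-1)
    tilt = np.exp(T @ b)[:, None] * np.exp(X @ b)[None, :]   # exp(b^T t) exp(b^T x)
    return tilt * (1.0 + r) * np.exp(-r)

# Hypothetical sites and parameters.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
a, b = 1.0, np.array([0.3, -0.2])

K = modified_matern(X, X, a, b)
K_plain = modified_matern(X, X, a, np.zeros(2))    # b = 0 gives back the Matérn kernel
smallest_eigenvalue = float(np.linalg.eigvalsh(K).min())
print(smallest_eigenvalue, np.diag(K_plain))
```

The extra factor lets the kernel, and hence the interpolant, grow or decay in the direction $b$, which a stationary Matérn kernel cannot express.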

36. Cheng and Sandu Function
$f(x) = \cos(x_1 + x_2) \exp(x_1 x_2)$, $\varepsilon = 0.05$
[Figure panels: error w/ Matérn & $\theta = 1$; error w/ mod. Matérn + opt. $\theta$]
Bingham, D. & Surjanovic, S. Virtual Library of Simulation Experiments. 2013. https://www.sfu.ca/~ssurjano/
Cheng, H. & Sandu, A. Collocation least-squares polynomial chaos method. In Proceedings of the 2010 Spring Simulation Multiconference, Society for Computer Simulation International. (2010).

38. What Are the Right Ingredients for Adaptive Function Approximation?
A fixed budget homogeneous approximation, $\mathrm{APP} : \mathcal{X}^n \times \mathbb{R}^n \to L^\infty(\mathcal{X})$, with an error bound, e.g., linear splines, RKHS approximation
An unbounded, non-convex candidate set, $\mathcal{C}$, for which the error bound can be bounded in a data-driven way; what you see is almost what you get
Necessary conditions for $f$ to lie in $\mathcal{C}$; will not have sufficient conditions
A rich enough candidate set from which the right approximation can be inferred; attention to underfitting and overfitting
More work is needed on
What makes a good initial sample
Balancing the richness of the candidate set with overfitting
Numerical instability and computational effort challenges for larger numbers of data sites

39. Thank you
These slides are available at
speakerdeck.com/fjhickernell/

40. References
Choi, S.-C. T., Ding, Y., H., F. J. & Tong, X. Local Adaption for Approximation and Minimization of Univariate Functions. J. Complexity 40, 17–33 (2017).
Fasshauer, G. E. Meshfree Approximation Methods with MATLAB. (World Scientific Publishing Co., Singapore, 2007).
Fasshauer, G. E. & McCourt, M. Kernel-based Approximation Methods using MATLAB. (World Scientific Publishing Co., Singapore, 2015).
Bingham, D. & Surjanovic, S. Virtual Library of Simulation Experiments. 2013. https://www.sfu.ca/~ssurjano/.
Cheng, H. & Sandu, A. Collocation least-squares polynomial chaos method. In Proceedings of the 2010 Spring Simulation Multiconference, Society for Computer Simulation International. (2010).