Slide 1

Slide 1 text

The Right Ingredients for Adaptive Function Approximation Algorithms
Fred J. Hickernell
Department of Applied Mathematics, Center for Interdisciplinary Scientific Computation, Illinois Institute of Technology
mypages.iit.edu/~hickernell
with Sou-Cheng Choi, Yuhan Ding, Mac Hyman, Xin Tong, and the GAIL team
Partially supported by NSF-DMS-1522687 and NSF-DMS-1638521 (SAMSI)
Thanks to Guohui Song for the invitation and hospitality
Old Dominion University, March 5, 2020

Slide 2

Slide 2 text

Outline: Introduction; Univariate, Low Accuracy; Multivariate, Reproducing Kernel Hilbert Space; Summary; References
The Guaranteed Automatic Integration Library (GAIL) and QMCPy Teams
Sou-Cheng Choi (Chief Data Scientist, Kamakura)
Yuhan Ding (IIT PhD ’15, Lecturer, IIT)
Lan Jiang (IIT PhD ’16, Compass)
Lluís Antoni Jiménez Rugama (IIT PhD ’17, UBS)
Jagadeeswaran Rathinavel (IIT PhD ’19, Wi-Tronix)
Aleksei Sorokin (IIT BS + MAS ’21 exp.)
Tong Xin (IIT MS, UIC PhD ’20 exp.)
Kan Zhang (IIT PhD ’20 exp.)
Yizhi Zhang (IIT PhD ’18, Jamran Int’l)
Xuan Zhou (IIT PhD ’15, JP Morgan)
and others
Adaptive software libraries GAIL and QMCPy

Slide 3

Slide 3 text

Problem
Given: a black-box function routine $f : \mathcal{X} \subseteq \mathbb{R}^d \to \mathbb{R}$, e.g., the output of a computer simulation
Expensive: the cost of a function value, $\$(f)$
Want: a fixed tolerance algorithm $\mathrm{ALG} : \mathcal{C} \times (0, \infty) \to L^\infty(\mathcal{X})$ such that $\lVert f - \mathrm{ALG}(f, \varepsilon) \rVert_\infty \le \varepsilon$ for all $f \in \mathcal{C}$, the candidate set
Cheap: the cost of an $\mathrm{ALG}(f, \varepsilon)$ value, e.g., a spline

Slide 5

Slide 5 text

Problem
Given: a black-box function routine $f : \mathcal{X} \subseteq \mathbb{R}^d \to \mathbb{R}$, e.g., the output of a computer simulation
Expensive: the cost of a function value, $\$(f)$
Want: a fixed tolerance algorithm $\mathrm{ALG} : \mathcal{C} \times (0, \infty) \to L^\infty(\mathcal{X})$ such that $\lVert f - \mathrm{ALG}(f, \varepsilon) \rVert_\infty \le \varepsilon$ for all $f \in \mathcal{C}$, the candidate set
Cheap: the cost of an $\mathrm{ALG}(f, \varepsilon)$ value
Design or node array: $X \in \mathcal{X}^n \subseteq \mathbb{R}^{n \times d}$; function data: $y = f(X) \in \mathbb{R}^n$
Acquisition function: $x_{n+1} = \operatorname{argmax}_{x \in \mathcal{X}} \mathrm{ACQ}(x, X, y)$
Data-driven error bound: $\lVert f - \mathrm{APP}(X, y) \rVert_\infty \le \mathrm{ERR}(X, y)$ for all $n \in \mathbb{N}$, $f \in \mathcal{C}$
Stopping criterion: $n^* = \min\{n \in \mathbb{N} : \mathrm{ERR}(X, y) \le \varepsilon\}$
Fixed budget approximation: $\mathrm{ALG}(f, \varepsilon) = \mathrm{APP}(X, y)$ for this $n^*$
Adaptive sample size, design, and fixed budget approximation
Assumes that what you see is almost what you get

Slide 7

Slide 7 text

Linear Splines
$f : [a, b] \to \mathbb{R}$
Data sites: $a =: x_0 < x_1 < \cdots < x_n := b$, $X = (x_i)_{i=0}^n$; function data: $y = f(X)$
Linear spline: $\mathrm{APP}(X, y) := \frac{x - x_i}{x_{i-1} - x_i} y_{i-1} + \frac{x - x_{i-1}}{x_i - x_{i-1}} y_i$, $x_{i-1} \le x \le x_i$, $i \in 1{:}n$
$\lVert f - \mathrm{APP}(X, y) \rVert_{\infty,[x_{i-1},x_i]} \le \frac{(x_i - x_{i-1})^2 \lVert f'' \rVert_{\infty,[x_{i-1},x_i]}}{8}$, $i \in 1{:}n$, $f \in W^{2,\infty}$
Numerical analysis often stops here, leaving unanswered questions:
How big should $n$ be to make $\lVert f - \mathrm{APP}(X, y) \rVert_\infty \le \varepsilon$?
How big is $\lVert f'' \rVert_{\infty,[x_{i-1},x_i]}$?
How best to choose $X$?
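As a small illustration of the linear spline and its a priori error bound, the following sketch interpolates $\sin$ on $[0, \pi]$ at equally spaced data sites and compares the observed error with $(x_i - x_{i-1})^2 \lVert f'' \rVert_\infty / 8$. The test function, the number of sites, and the helper name `linear_spline` are illustrative assumptions, not taken from the slides.

```python
import bisect
import math

def linear_spline(x, y, t):
    """Evaluate the piecewise linear interpolant of the data (x, y) at t (x sorted)."""
    # Locate i with x[i-1] <= t <= x[i], clamping to the first/last subinterval.
    i = min(max(bisect.bisect_right(x, t), 1), len(x) - 1)
    w = (t - x[i - 1]) / (x[i] - x[i - 1])
    return (1 - w) * y[i - 1] + w * y[i]

# Illustrative setup: f = sin on [0, pi], so |f''| = |sin| <= 1.
n = 10
x = [math.pi * i / n for i in range(n + 1)]
y = [math.sin(t) for t in x]

# Observed error on a fine grid versus the a priori bound h^2 * ||f''|| / 8.
grid = [math.pi * k / 2000 for k in range(2001)]
max_err = max(abs(math.sin(t) - linear_spline(x, y, t)) for t in grid)
bound = (math.pi / n) ** 2 * 1.0 / 8
print(f"observed error {max_err:.5f} <= bound {bound:.5f}")
```

The bound is only usable here because $\lVert f'' \rVert_\infty$ is known for $\sin$; for a black-box $f$ it is not, which is the gap the questions above point to.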

Slide 9

Slide 9 text

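Linear Splines Error
$f : [a, b] \to \mathbb{R}$
Data sites: $a =: x_0 < x_1 < \cdots < x_n := b$, $X = (x_i)_{i=0}^n$; function data: $y = f(X)$
Linear spline: $\mathrm{APP}(X, y) := \frac{x - x_i}{x_{i-1} - x_i} y_{i-1} + \frac{x - x_{i-1}}{x_i - x_{i-1}} y_i$, $x_{i-1} \le x \le x_i$, $i \in 1{:}n$
$\lVert f - \mathrm{APP}(X, y) \rVert_{\infty,[x_{i-1},x_i]} \le \frac{1}{8} (x_i - x_{i-1})^2 \lVert f'' \rVert_{\infty,[x_{i-1},x_i]}$, $i \in 1{:}n$, $f \in W^{2,\infty}$
$\lVert f'' \rVert_{-\infty,[x_{i-1},x_{i+1}]} \le D_i(X, y) \le \lVert f'' \rVert_{\infty,[x_{i-1},x_{i+1}]}$, where $\lVert \cdot \rVert_{-\infty}$ denotes the infimum of the absolute value and
$D_i(X, y) := \left| \frac{\frac{y_{i+1} - y_i}{x_{i+1} - x_i} - \frac{y_i - y_{i-1}}{x_i - x_{i-1}}}{(x_{i+1} - x_{i-1})/2} \right| = 2 \lvert f[x_{i-1}, x_i, x_{i+1}] \rvert$ is the data-based absolute second derivative of the interpolating polynomial
Candidate set: $\mathcal{C} := \{ f \in W^{2,\infty} : \lvert f''(x) \rvert \le \max( C(h_-) \lvert f''(x - h_-) \rvert, C(h_+) \lvert f''(x + h_+) \rvert ),\ 0 < h_\pm < \mathfrak{h},\ a < x < b \}$
Inflation factor: $C(h) := \frac{C_0 \mathfrak{h}}{\mathfrak{h} - h}$
$\lvert f'' \rvert$ does not change abruptly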

Slide 10

Slide 10 text

Linear Splines Error
$f : [a, b] \to \mathbb{R}$
Data sites: $a =: x_0 < x_1 < \cdots < x_n := b$, $X = (x_i)_{i=0}^n$; function data: $y = f(X)$
$\lVert f - \mathrm{APP}(X, y) \rVert_{\infty,[x_{i-1},x_i]} \le \frac{1}{8} (x_i - x_{i-1})^2 \lVert f'' \rVert_{\infty,[x_{i-1},x_i]} \le \max_\pm \mathrm{ERR}_{i,\pm}(X, y)$, $i \in 1{:}n$, $f \in \mathcal{C}$
Candidate set: $\mathcal{C} := \{ f \in W^{2,\infty} : \lvert f''(x) \rvert \le \max( C(h_-) \lvert f''(x - h_-) \rvert, C(h_+) \lvert f''(x + h_+) \rvert ),\ 0 < h_\pm < \mathfrak{h},\ a < x < b \}$
Inflation factor: $C(h) := \frac{C_0 \mathfrak{h}}{\mathfrak{h} - h}$
$\lvert f'' \rvert$ does not change abruptly
$\mathrm{ERR}_{i,-}(X, y) = \frac{1}{8} (x_i - x_{i-1})^2 \, C(x_i - x_{i-3}) \, D_{i-2}(X, y)$
$\mathrm{ERR}_{i,+}(X, y) = \frac{1}{8} (x_i - x_{i-1})^2 \, C(x_{i+2} - x_{i-1}) \, D_{i+1}(X, y)$
$D_i(X, y) = 2 \lvert f[x_{i-1}, x_i, x_{i+1}] \rvert$: data-based, absolute second derivative of the interpolating polynomial
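The quantity $D_i(X, y)$ needs only function values. A minimal sketch of its computation follows, using $f = \exp$ on unequally spaced sites so that $f'' = \exp$ is known and the sandwich $\min \lvert f'' \rvert \le D_i \le \max \lvert f'' \rvert$ on $[x_{i-1}, x_{i+1}]$ can be inspected. The function name `second_derivative_estimates` and the sample sites are hypothetical choices for illustration.

```python
import math

def second_derivative_estimates(x, y):
    """Return {i: D_i} with D_i = 2|f[x_{i-1}, x_i, x_{i+1}]| for interior indices i."""
    d = {}
    for i in range(1, len(x) - 1):
        left_slope = (y[i] - y[i - 1]) / (x[i] - x[i - 1])
        right_slope = (y[i + 1] - y[i]) / (x[i + 1] - x[i])
        # Difference of slopes divided by half the combined width.
        d[i] = abs(right_slope - left_slope) / ((x[i + 1] - x[i - 1]) / 2)
    return d

# Illustrative data: unequally spaced sites, f = exp so that f'' = exp is increasing.
x = [0.0, 0.1, 0.25, 0.45, 0.7, 1.0]
y = [math.exp(t) for t in x]
D = second_derivative_estimates(x, y)

for i, d_i in D.items():
    lo, hi = math.exp(x[i - 1]), math.exp(x[i + 1])  # inf and sup of |f''| on [x_{i-1}, x_{i+1}]
    print(f"i={i}: {lo:.4f} <= D_i={d_i:.4f} <= {hi:.4f}")
```

Because $D_i$ only bounds $\lvert f'' \rvert$ from below somewhere nearby, the inflation factor $C(h)$ is what turns it into an upper bound for functions in the candidate set.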

Slide 13

Slide 13 text

Adaptive Linear Spline Algorithm
Given $n_{\text{init}} \ge 4$, $C_0 \ge 1$: set $\mathfrak{h} = \frac{3(b - a)}{n_{\text{init}} - 1}$, $C(h) = \frac{C_0 \mathfrak{h}}{\mathfrak{h} - h}$, $n = n_{\text{init}}$, $x_i = a + i(b - a)/n$
Step 1. Compute the data-based $\mathrm{ERR}_{i,\pm}(X, y)$ for $i = 1, \ldots, n$.
Step 2. Construct $\mathcal{I}$, the index set of subintervals that might be split: $\mathcal{I} = \{ i \in 1{:}n : \mathrm{ERR}_{i \pm j, \mp}(X, y) > \varepsilon,\ j = 0, 1, 2 \}$
Step 3. If $\mathcal{I} = \emptyset$, return $\mathrm{ALG}(f, \varepsilon) = \mathrm{APP}(X, y)$ as the approximation satisfying the error tolerance. Otherwise split those intervals in $\mathcal{I}$ with the largest width and go to Step 1 (acquisition function).
Choi, S.-C. T., Ding, Y., H., F. J. & Tong, X. Local Adaption for Approximation and Minimization of Univariate Functions. J. Complexity 40, 17–33 (2017).
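The three steps can be sketched in a few dozen lines of Python. This is a simplified reading of the slide, not the GAIL implementation: it assumes that an index $i$ enters $\mathcal{I}$ when any of $\mathrm{ERR}_{i+j,-}$ or $\mathrm{ERR}_{i-j,+}$, $j = 0, 1, 2$, exceeds $\varepsilon$, that bounds whose nodes fall outside $[a, b]$ are simply dropped, and that intervals are split at their midpoints. The names `adaptive_linear_spline`, `err_bounds`, and the default values of `n_init`, `c0`, and `max_rounds` are hypothetical.

```python
import bisect
import math

def d2(x, y, i):
    """D_i = 2|f[x_{i-1}, x_i, x_{i+1}]| for 1 <= i <= n-1."""
    left = (y[i] - y[i - 1]) / (x[i] - x[i - 1])
    right = (y[i + 1] - y[i]) / (x[i + 1] - x[i])
    return abs(right - left) / ((x[i + 1] - x[i - 1]) / 2)

def err_bounds(x, y, c0, hfrak):
    """Step 1: ERR_{i,-} and ERR_{i,+}, kept only where the needed nodes exist."""
    n = len(x) - 1
    infl = lambda h: c0 * hfrak / (hfrak - h)  # inflation factor C(h)
    minus, plus = {}, {}
    for i in range(1, n + 1):
        w2 = (x[i] - x[i - 1]) ** 2 / 8
        if i >= 3:
            minus[i] = w2 * infl(x[i] - x[i - 3]) * d2(x, y, i - 2)
        if i <= n - 2:
            plus[i] = w2 * infl(x[i + 2] - x[i - 1]) * d2(x, y, i + 1)
    return minus, plus

def adaptive_linear_spline(f, a, b, eps, n_init=20, c0=2.0, max_rounds=60):
    hfrak = 3 * (b - a) / (n_init - 1)
    x = [a + i * (b - a) / n_init for i in range(n_init + 1)]
    y = [f(t) for t in x]
    for _ in range(max_rounds):
        n = len(x) - 1
        minus, plus = err_bounds(x, y, c0, hfrak)
        # Step 2 (assumed reading): flag i if any bound that involves interval i exceeds eps.
        flagged = [i for i in range(1, n + 1)
                   if any(minus.get(i + j, 0.0) > eps or plus.get(i - j, 0.0) > eps
                          for j in (0, 1, 2))]
        if not flagged:  # Step 3: tolerance met on every subinterval
            return x, y
        widest = max(x[i] - x[i - 1] for i in flagged)
        split = {i for i in flagged if x[i] - x[i - 1] >= widest * (1 - 1e-12)}
        new_x, new_y = [x[0]], [y[0]]
        for i in range(1, n + 1):
            if i in split:  # bisect the widest flagged intervals
                mid = 0.5 * (x[i - 1] + x[i])
                new_x.append(mid)
                new_y.append(f(mid))
            new_x.append(x[i])
            new_y.append(y[i])
        x, y = new_x, new_y
    raise RuntimeError("tolerance not met within max_rounds")

def spline_eval(x, y, t):
    """Evaluate the linear spline through (x, y) at t."""
    i = min(max(bisect.bisect_right(x, t), 1), len(x) - 1)
    w = (t - x[i - 1]) / (x[i] - x[i - 1])
    return (1 - w) * y[i - 1] + w * y[i]

# Usage: approximate exp on [0, 1] to tolerance 1e-3 and check the error on a fine grid.
nodes, values = adaptive_linear_spline(math.exp, 0.0, 1.0, 1e-3)
worst = max(abs(math.exp(k / 5000) - spline_eval(nodes, values, k / 5000))
            for k in range(5001))
print(len(nodes), worst)
```

The usage example picks $\exp$ because its second derivative varies slowly, so it plausibly lies in the candidate set for the chosen `c0`; for functions outside the candidate set the returned approximation carries no guarantee.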

Slide 23

Slide 23 text

Highlights of Adaptive Linear Spline Algorithm
Defined for a cone candidate set, $\mathcal{C}$, whose definition does not depend on the algorithm
Guaranteed to succeed for all $f \in \mathcal{C}$
Candidate set $\mathcal{C}$ excludes spikes, i.e., two nearby inflection points
$\mathcal{C}$ formalizes "what you see is almost what you get"
Impossible to have an algorithm for all $f \in W^{2,\infty}$, since $W^{2,\infty}$ contains arbitrarily large functions that look like 0
Adaptive algorithms do not help for ball candidate sets $\mathcal{C} = \{ f : \lVert f'' \rVert_\infty \le R \}$
$\mathrm{cost}(\mathrm{ALG}, f, \varepsilon, \mathcal{C}) \asymp \sqrt{\frac{C_0 \lVert f'' \rVert_{1/2}}{\varepsilon}} \asymp \mathrm{comp}(f, \varepsilon, \mathcal{C})$, i.e., optimal
Does not allow for smoothness to be inferred
Not multivariate
Choi, S.-C. T., Ding, Y., H., F. J. & Tong, X. Local Adaption for Approximation and Minimization of Univariate Functions. J. Complexity 40, 17–33 (2017).

Slide 27

Slide 27 text

Approximation via Reproducing Kernel Hilbert Spaces (RKHSs)
$\mathcal{F}$ is a Hilbert space with reproducing kernel $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$
$K(X, X)$ is positive definite for all $X$; $K(\cdot, x) \in \mathcal{F}$ and $f(x) = \langle K(\cdot, x), f \rangle_{\mathcal{F}}$ for all $x \in \mathcal{X}$
e.g., the Matérn kernel $K(t, x) = (1 + \lVert t - x \rVert_2) \exp(-\lVert t - x \rVert_2)$
Optimal (minimum norm) interpolant: $\mathrm{APP}(X, y) = K(\cdot, X) (K(X, X))^{-1} y$, $y = f(X)$
Candidate set: $\mathcal{C} = \{ f \in \mathcal{F} : \lVert f - \mathrm{APP}(X, y) \rVert_{\mathcal{F}} \le C(X) \lVert f \rVert_{\mathcal{F}} \}$
$\lVert f - \mathrm{APP}(X, y) \rVert_\infty^2 \le \underbrace{\lVert K(\cdot, \cdot) - K(\cdot, X) (K(X, X))^{-1} K(X, \cdot) \rVert_\infty}_{\text{known}} \, \lVert f - \mathrm{APP}(X, y) \rVert_{\mathcal{F}}^2$
$\le \lVert K(\cdot, \cdot) - K(\cdot, X) (K(X, X))^{-1} K(X, \cdot) \rVert_\infty \, \frac{C^2(X)}{1 - C^2(X)} \, \lVert \mathrm{APP}(X, y) \rVert_{\mathcal{F}}^2 =: \mathrm{ERR}^2(X, y)$
Fasshauer, G. E. Meshfree Approximation Methods with MATLAB. (World Scientific Publishing Co., Singapore, 2007); Fasshauer, G. E. & McCourt, M. Kernel-based Approximation Methods using MATLAB. (World Scientific Publishing Co., Singapore, 2015).
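The factor $C^2(X) / (1 - C^2(X))$ is stated without derivation on the slide. A short reconstruction, assuming $C(X) < 1$ and using the standard fact that $f - \mathrm{APP}(X, y)$ is orthogonal in $\mathcal{F}$ to the minimum norm interpolant, is sketched below; the step labels are added here for illustration.

```latex
\begin{align*}
\lVert f \rVert_{\mathcal{F}}^2
  &= \lVert \mathrm{APP}(X,y) \rVert_{\mathcal{F}}^2
   + \lVert f - \mathrm{APP}(X,y) \rVert_{\mathcal{F}}^2
  && \text{(orthogonality)} \\
\lVert f - \mathrm{APP}(X,y) \rVert_{\mathcal{F}}^2
  &\le C^2(X) \bigl( \lVert \mathrm{APP}(X,y) \rVert_{\mathcal{F}}^2
   + \lVert f - \mathrm{APP}(X,y) \rVert_{\mathcal{F}}^2 \bigr)
  && \text{(definition of } \mathcal{C}\text{)} \\
\lVert f - \mathrm{APP}(X,y) \rVert_{\mathcal{F}}^2
  &\le \frac{C^2(X)}{1 - C^2(X)} \, \lVert \mathrm{APP}(X,y) \rVert_{\mathcal{F}}^2,
  \qquad
  \lVert \mathrm{APP}(X,y) \rVert_{\mathcal{F}}^2 = y^T (K(X,X))^{-1} y .
\end{align*}
```

The last identity is what makes the bound data-driven: the unknown norm of $f$ is replaced by a quantity computed from $X$ and $y$ alone.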

Slide 29

Slide 29 text

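Error and Acquisition for Optimal RKHS Approximation
$\mathcal{F}$ is a Hilbert space with reproducing kernel $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, e.g., the Matérn kernel $K(t, x) = (1 + \lVert t - x \rVert_2) \exp(-\lVert t - x \rVert_2)$
$\mathrm{APP}(X, y) = K(\cdot, X) (K(X, X))^{-1} y$, $y = f(X)$
Candidate set: $\mathcal{C} = \{ f \in \mathcal{F} : \lVert f - \mathrm{APP}(X, y) \rVert_{\mathcal{F}} \le C(X) \lVert f \rVert_{\mathcal{F}} \}$
$\lVert f - \mathrm{APP}(X, y) \rVert_\infty^2 \le \underbrace{\lVert K(\cdot, \cdot) - K(\cdot, X) (K(X, X))^{-1} K(X, \cdot) \rVert_\infty}_{\text{known}} \, \lVert f - \mathrm{APP}(X, y) \rVert_{\mathcal{F}}^2$
$\le \lVert K(\cdot, \cdot) - K(\cdot, X) (K(X, X))^{-1} K(X, \cdot) \rVert_\infty \, \frac{C^2(X)}{1 - C^2(X)} \, \lVert \mathrm{APP}(X, y) \rVert_{\mathcal{F}}^2 =: \mathrm{ERR}^2(X, y)$
$\mathrm{ACQ}(x, X, y) := \bigl[ K(x, x) - K(x, X) (K(X, X))^{-1} K(X, x) \bigr] \, \frac{C^2(X)}{1 - C^2(X)} \, \lVert \mathrm{APP}(X, y) \rVert_{\mathcal{F}}^2$, with $\lVert \mathrm{APP}(X, y) \rVert_{\mathcal{F}}^2 = y^T (K(X, X))^{-1} y$
Acquisition function: $x_{n+1} = \operatorname{argmax}_{x \in \mathcal{X}} \mathrm{ACQ}(x, X, y)$
Stopping criterion: $\mathrm{ALG}(f, \varepsilon) = \mathrm{APP}(X, y)$ for $n^* = \min\{ n \in \mathbb{N} : \mathrm{ERR}(X, y) \le \varepsilon \}$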
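A minimal NumPy sketch of this loop follows. It makes several simplifying assumptions that are not in the slides: the argmax over $\mathcal{X}$ is replaced by a search over a finite candidate grid, $C(X)$ is treated as a fixed constant `c_x`, and the loop stops at a hard cap `n_max` if the tolerance is not reached. All function names (`matern`, `fit`, `power2`, `adaptive_rkhs`) are hypothetical.

```python
import numpy as np

def matern(t, x):
    """K(t, x) = (1 + ||t - x||_2) exp(-||t - x||_2) for row-stacked points t, x."""
    r = np.linalg.norm(t[:, None, :] - x[None, :, :], axis=2)
    return (1.0 + r) * np.exp(-r)

def fit(X, y):
    """Cholesky factor of K(X, X) and the coefficients K(X, X)^{-1} y."""
    L = np.linalg.cholesky(matern(X, X))
    coef = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return L, coef

def power2(L, X, cand):
    """K(x, x) - K(x, X) K(X, X)^{-1} K(X, x) at each candidate x."""
    V = np.linalg.solve(L, matern(X, cand))  # L^{-1} K(X, cand)
    return np.maximum(1.0 - np.sum(V * V, axis=0), 0.0)  # K(x, x) = 1 for this kernel

def adaptive_rkhs(f, cand, eps, c_x=0.5, n_init=5, n_max=100):
    """Greedy design: add the candidate maximizing ACQ until ERR <= eps or n = n_max."""
    idx = list(np.linspace(0, len(cand) - 1, n_init).astype(int))
    while True:
        X = cand[idx]
        y = f(X)
        L, coef = fit(X, y)
        # C^2 / (1 - C^2) * ||APP||_F^2, with ||APP||_F^2 = y^T K(X, X)^{-1} y
        scale = c_x**2 / (1.0 - c_x**2) * float(y @ coef)
        acq = power2(L, X, cand) * scale
        j = int(np.argmax(acq))
        err = float(np.sqrt(acq[j]))  # ERR(X, y), restricted to the candidate grid
        if err <= eps or len(idx) >= n_max:
            return X, coef, err
        idx.append(j)

# Usage on [0, 1]^2 with an illustrative smooth test function.
g = np.linspace(0.0, 1.0, 15)
cand = np.array([[u, v] for u in g for v in g])
f = lambda P: np.cos(P[:, 0] + P[:, 1]) * np.exp(P[:, 0] * P[:, 1])
X_final, coef, err_bound = adaptive_rkhs(f, cand, eps=0.05)
approx = matern(cand, X_final) @ coef  # APP(X, y) evaluated on the grid
print(len(X_final), err_bound, np.max(np.abs(approx - f(cand))))
```

With a fixed kernel the power function does not depend on $y$, so the design produced this way is driven by geometry alone; only the stopping time reacts to the data through $y^T (K(X, X))^{-1} y$.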

Slide 30

Slide 30 text

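Must Infer Kernel from $y = f(X)$
$\mathcal{F}_\theta$ is a Hilbert space with reproducing kernel $K_\theta$
$\mathcal{C} = \{ f \in \cup_\theta \mathcal{F}_\theta : \lVert f - \mathrm{APP}(X, y) \rVert_{\mathcal{F}_{\theta^*}} \le C(X) \lVert f \rVert_{\mathcal{F}_{\theta^*}} \ \forall X,\ y = f(X),\ \theta^*(X, y) \text{ given below} \}$
e.g., $K_\theta(t, x) = (1 + \lVert \theta (t - x) \rVert_2) \exp(-\lVert \theta (t - x) \rVert_2)$
Choose $\theta$ (inspired by empirical Bayes) by minimizing the volume of the ellipsoid in $\mathbb{R}^n$ of function data yielding interpolants with no greater norm than that observed:
$\theta^* = \operatorname{argmin}_\theta \left[ \frac{1}{n} \log \det(K_\theta) + \log( y^T K_\theta^{-1} y ) \right]$
$\lVert f - \mathrm{APP}(X, y) \rVert_\infty^2 \le \lVert K(\cdot, \cdot) - K(\cdot, X) (K(X, X))^{-1} K(X, \cdot) \rVert_\infty \, \frac{C^2(X)}{1 - C^2(X)} \, y^T (K(X, X))^{-1} y =: \mathrm{ERR}^2(X, y)$
$\mathrm{ACQ}(x, X, y) := \bigl[ K(x, x) - K(x, X) (K(X, X))^{-1} K(X, x) \bigr] \, \frac{C^2(X)}{1 - C^2(X)} \, y^T (K(X, X))^{-1} y$
Acquisition function: $x_{n+1} = \operatorname{argmax}_{x \in \mathcal{X}} \mathrm{ACQ}(x, X, y)$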
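The selection rule for $\theta^*$ can be sketched as a search over a small grid of values. The sketch assumes a scalar $\theta$ in the Matérn example, uses a plain grid search in place of a numerical optimizer, and picks an illustrative test function and design; `matern_theta`, `objective`, and `choose_theta` are hypothetical names.

```python
import numpy as np

def matern_theta(X, theta):
    """Gram matrix of K_theta(t, x) = (1 + ||theta (t - x)||) exp(-||theta (t - x)||), scalar theta."""
    r = theta * np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    return (1.0 + r) * np.exp(-r)

def objective(theta, X, y):
    """(1/n) log det(K_theta) + log(y^T K_theta^{-1} y)."""
    K = matern_theta(X, theta)
    _, logdet = np.linalg.slogdet(K)  # stable log-determinant
    return logdet / len(y) + np.log(y @ np.linalg.solve(K, y))

def choose_theta(X, y, grid):
    """Return the grid value minimizing the objective (a stand-in for argmin over theta)."""
    vals = [objective(t, X, y) for t in grid]
    return grid[int(np.argmin(vals))]

# Illustrative data: a 5 x 5 design on [0, 1]^2 and a smooth test function.
g = np.linspace(0.0, 1.0, 5)
X = np.array([[u, v] for u in g for v in g])
y = np.cos(X[:, 0] + X[:, 1]) * np.exp(X[:, 0] * X[:, 1])
grid = [0.5, 1.0, 2.0, 4.0, 8.0]
theta_star = choose_theta(X, y, grid)
print(theta_star)
```

Rescaling $y$ by a constant shifts the objective by the same amount for every $\theta$, so the selected $\theta^*$ is unchanged; this keeps the resulting approximation homogeneous in the data.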

Slide 32

Slide 32 text

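Must Infer Kernel from $y = f(X)$
$\mathcal{F}_\theta$ is a Hilbert space with reproducing kernel $K_\theta$
$\mathcal{C} = \{ f \in \cup_\theta \mathcal{F}_\theta : \lVert f - \mathrm{APP}(X, y) \rVert_{\mathcal{F}_{\theta^*}} \le C(X) \lVert f \rVert_{\mathcal{F}_{\theta^*}} \ \forall X,\ y = f(X),\ \theta^*(X, y) \text{ given below} \}$
e.g., $K_\theta(t, x) = \exp(b^T (t + x)) \times (1 + \lVert a (t - x) \rVert_2) \exp(-\lVert a (t - x) \rVert_2)$, $\theta = (a, b)$
Choose $\theta$ (inspired by empirical Bayes) by minimizing the volume of the ellipsoid in $\mathbb{R}^n$ of function data yielding interpolants with no greater norm than that observed:
$\theta^* = \operatorname{argmin}_\theta \left[ \frac{1}{n} \log \det(K_\theta) + \log( y^T K_\theta^{-1} y ) \right]$
$\lVert f - \mathrm{APP}(X, y) \rVert_\infty^2 \le \lVert K(\cdot, \cdot) - K(\cdot, X) (K(X, X))^{-1} K(X, \cdot) \rVert_\infty \, \frac{C^2(X)}{1 - C^2(X)} \, y^T (K(X, X))^{-1} y =: \mathrm{ERR}^2(X, y)$
$\mathrm{ACQ}(x, X, y) := \bigl[ K(x, x) - K(x, X) (K(X, X))^{-1} K(X, x) \bigr] \, \frac{C^2(X)}{1 - C^2(X)} \, y^T (K(X, X))^{-1} y$
Acquisition function: $x_{n+1} = \operatorname{argmax}_{x \in \mathcal{X}} \mathrm{ACQ}(x, X, y)$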

Slide 35

Slide 35 text

Cheng and Sandu Function
$f(x) = \cos(x_1 + x_2) \exp(x_1 x_2)$
[Figure panels: error with the Matérn kernel and $\theta = 1$; error with the modified Matérn kernel and optimal $\theta$; $\varepsilon = 0.05$]
Bingham, D. & Surjanovic, S. Virtual Library of Simulation Experiments. 2013. https://www.sfu.ca/~ssurjano/; Cheng, H. & Sandu, A. Collocation least-squares polynomial chaos method. In Proceedings of the 2010 Spring Simulation Multiconference, Society for Computer Simulation International (2010).

Slide 38

Slide 38 text

What Are the Right Ingredients for Adaptive Function Approximation?
A fixed budget homogeneous approximation, $\mathrm{APP} : \mathcal{X}^n \times \mathbb{R}^n \to L^\infty(\mathcal{X})$, with an error bound, e.g., linear splines, RKHS approximation
An unbounded, non-convex candidate set, $\mathcal{C}$, for which the error bound can be bounded in a data-driven way; what you see is almost what you get
Necessary conditions for $f$ to lie in $\mathcal{C}$; will not have sufficient conditions
A rich enough candidate set from which the right approximation can be inferred; attention to underfitting and overfitting
More work is needed on:
What makes a good initial sample
Balancing the richness of the candidate set with overfitting
Numerical instability and computational effort challenges for larger numbers of data sites

Slide 39

Slide 39 text

Thank you
These slides are available at speakerdeck.com/fjhickernell/right-ingredients-for-adaptive-function-approximation

Slide 40

Slide 40 text

References
Choi, S.-C. T., Ding, Y., H., F. J. & Tong, X. Local Adaption for Approximation and Minimization of Univariate Functions. J. Complexity 40, 17–33 (2017).
Fasshauer, G. E. Meshfree Approximation Methods with MATLAB. (World Scientific Publishing Co., Singapore, 2007).
Fasshauer, G. E. & McCourt, M. Kernel-based Approximation Methods using MATLAB. (World Scientific Publishing Co., Singapore, 2015).
Bingham, D. & Surjanovic, S. Virtual Library of Simulation Experiments. 2013. https://www.sfu.ca/~ssurjano/.
Cheng, H. & Sandu, A. Collocation least-squares polynomial chaos method. In Proceedings of the 2010 Spring Simulation Multiconference, Society for Computer Simulation International (2010).