Right Ingredients for Adaptive Function Approximation

Fred J. Hickernell

March 05, 2020

Transcript

  1. The Right Ingredients for
    Adaptive Function Approximation Algorithms
    Fred J. Hickernell
    Department of Applied Mathematics
    Center for Interdisciplinary Scientific Computation
    Illinois Institute of Technology
    [email protected] mypages.iit.edu/~hickernell
    with Sou-Cheng Choi, Yuhan Ding, Mac Hyman, Xin Tong, and the GAIL team
    partially supported by NSF-DMS-1522687 and NSF-DMS-1638521 (SAMSI)
    Thanks to Guohui Song for the invitation and hospitality
    Old Dominion University, March 5, 2020

  2. The Guaranteed Automatic Integration Library (GAIL) and QMCPy Teams
    Sou-Cheng Choi (Chief Data Scientist, Kamakura)
    Yuhan Ding (IIT PhD ’15, Lecturer, IIT)
    Lan Jiang (IIT PhD ’16, Compass)
    Lluís Antoni Jiménez Rugama (IIT PhD ’17, UBS)
    Jagadeeswaran Rathinavel (IIT PhD ’19, Wi-Tronix)
    Aleksei Sorokin (IIT BS + MAS ’21 exp.)
    Tong Xin (IIT MS, UIC PhD ’20 exp.)
    Kan Zhang (IIT PhD ’20 exp.)
    Yizhi Zhang (IIT PhD ’18, Jamran Int’l)
    Xuan Zhou (IIT PhD ’15, JP Morgan)
    and others
    Adaptive software libraries GAIL and QMCPy

  3. Problem
    Given a black-box function routine $f : \mathcal{X} \subseteq \mathbb{R}^d \to \mathbb{R}$, e.g., the output of a computer simulation
    Expensive cost of a function value, $\$(f)$
    Want a fixed-tolerance algorithm $\mathrm{ALG} : \mathcal{C} \times (0, \infty) \to L^\infty(\mathcal{X})$ such that
    $\|f - \mathrm{ALG}(f, \varepsilon)\|_\infty \le \varepsilon \quad \forall f \in \mathcal{C}$ (the candidate set)
    Cheap cost of an $\mathrm{ALG}(f, \varepsilon)$ value, e.g., a spline
    Design or node array $X \in \mathcal{X}^n \subseteq \mathbb{R}^{n \times d}$, function data $y = f(X) \in \mathbb{R}^n$
    $x_{n+1} = \operatorname{argmax}_{x \in \mathcal{X}} \mathrm{ACQ}(x, X, y)$ (acquisition function)
    $\|f - \mathrm{APP}(X, y)\|_\infty \le \mathrm{ERR}(X, y) \quad \forall n \in \mathbb{N},\ f \in \mathcal{C}$ (data-driven error bound)
    $n^* = \min\{n \in \mathbb{N} : \mathrm{ERR}(X, y) \le \varepsilon\}$ (stopping criterion)
    $\mathrm{ALG}(f, \varepsilon) = \mathrm{APP}(X, y)$ for this $n^*$ (fixed-budget approximation)
    Adaptive sample size, design, and fixed-budget approximation
    Assumes that what you see is almost what you get
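
How these pieces fit together is easiest to see in code. Here is a minimal Python sketch of the generic adaptive loop, assuming `app`, `err`, and `acq` (stand-ins for APP, ERR, and ACQ) and a finite candidate grid are supplied by a particular approximation scheme; none of these names come from GAIL or QMCPy.

```python
# A minimal sketch of the generic adaptive loop; app, err, acq, and the
# candidate grid are assumptions supplied by a concrete scheme.
import numpy as np

def alg(f, eps, X0, app, err, acq, candidates, n_max=10_000):
    """Fixed-tolerance ALG(f, eps) assembled from APP, ERR, and ACQ."""
    X = np.atleast_2d(X0).astype(float)      # initial design, shape (n, d)
    y = np.array([f(x) for x in X])          # function data y = f(X)
    while err(X, y) > eps:                   # stopping criterion ERR(X, y) <= eps
        if len(X) >= n_max:
            raise RuntimeError("budget exhausted before ERR(X, y) <= eps")
        scores = [acq(x, X, y) for x in candidates]   # acquisition function
        x_next = candidates[int(np.argmax(scores))]   # x_{n+1}
        X = np.vstack([X, x_next])
        y = np.append(y, f(x_next))
    return app(X, y)                         # fixed-budget approximation at n*
```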

  6. Linear Splines
    $f : [a, b] \to \mathbb{R}$, data sites $a =: x_0 < x_1 < \cdots < x_n := b$, $X = (x_i)_{i=0}^n$, function data $y = f(X)$
    linear spline $\mathrm{APP}(X, y)(x) := \dfrac{x - x_i}{x_{i-1} - x_i}\, y_{i-1} + \dfrac{x - x_{i-1}}{x_i - x_{i-1}}\, y_i$ for $x_{i-1} \le x \le x_i$, $i \in 1{:}n$
    $\|f - \mathrm{APP}(X, y)\|_{\infty, [x_{i-1}, x_i]} \le \dfrac{(x_i - x_{i-1})^2\, \|f''\|_{\infty, [x_{i-1}, x_i]}}{8}$, $i \in 1{:}n$, $f \in W^{2,\infty}$
    Numerical analysis often stops here, leaving unanswered questions:
    How big should $n$ be to make $\|f - \mathrm{APP}(X, y)\|_\infty \le \varepsilon$?
    How big is $\|f''\|_{\infty, [x_{i-1}, x_i]}$?
    How best to choose $X$?
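
A short sketch of this APP and its per-interval error bound, assuming an array `f2max` of bounds on $\|f''\|_{\infty,[x_{i-1},x_i]}$ is available (which, as the questions above note, it usually is not):

```python
# Sketch: the linear spline APP(X, y) and its interval-by-interval bound,
# assuming f2max[i-1] >= ||f''||_{inf,[x_{i-1},x_i]} is somehow known.
import numpy as np

def linear_spline(X, y):
    """APP(X, y) as a callable; X must be increasing."""
    return lambda x: np.interp(x, X, y)

def spline_error_bounds(X, f2max):
    """(x_i - x_{i-1})^2 ||f''||_{inf,[x_{i-1},x_i]} / 8 for i in 1:n."""
    return np.diff(X) ** 2 * np.asarray(f2max) / 8
```

The following slides replace the unknown $\|f''\|$ by data-based estimates.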

  8. Linear Splines Error
    $f : [a, b] \to \mathbb{R}$, data sites $a =: x_0 < x_1 < \cdots < x_n := b$, $X = (x_i)_{i=0}^n$, function data $y = f(X)$
    $\|f - \mathrm{APP}(X, y)\|_{\infty, [x_{i-1}, x_i]} \le \tfrac{1}{8}(x_i - x_{i-1})^2\, \|f''\|_{\infty, [x_{i-1}, x_i]} \le \max_\pm \mathrm{ERR}_{i,\pm}(X, y)$, $i \in 1{:}n$, $f \in \mathcal{C}$
    The data sandwich the second derivative (here $\|f''\|_{-\infty, I}$ denotes the minimum of $|f''|$ on $I$):
    $\|f''\|_{-\infty, [x_{i-1}, x_{i+1}]} \le \underbrace{\left|\dfrac{\frac{y_{i+1} - y_i}{x_{i+1} - x_i} - \frac{y_i - y_{i-1}}{x_i - x_{i-1}}}{(x_{i+1} - x_{i-1})/2}\right|}_{D_i(X, y) = 2|f[x_{i-1}, x_i, x_{i+1}]|} \le \|f''\|_{\infty, [x_{i-1}, x_{i+1}]}$
    $D_i(X, y) = 2\,|f[x_{i-1}, x_i, x_{i+1}]|$ is data based: the absolute second derivative of the interpolating polynomial
    candidate set $\mathcal{C} := \{f \in W^{2,\infty} : |f''(x)| \le \max(C(h_-)\,|f''(x - h_-)|,\ C(h_+)\,|f''(x + h_+)|),\ 0 < h_\pm < \bar h,\ a < x < b\}$
    inflation factor $C(h) := \dfrac{C_0 \bar h}{\bar h - h}$: $|f''|$ does not change abruptly
    $\mathrm{ERR}_{i,-}(X, y) = \tfrac{1}{8}(x_i - x_{i-1})^2\, C(x_i - x_{i-3})\, D_{i-2}(X, y)$
    $\mathrm{ERR}_{i,+}(X, y) = \tfrac{1}{8}(x_i - x_{i-1})^2\, C(x_{i+2} - x_{i-1})\, D_{i+1}(X, y)$
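
A Python sketch of these data-based quantities; the clipping of boundary indices to the nearest interior ones is a simplification made here, not something specified on the slide:

```python
# Sketch of D_i(X, y) and max(ERR_{i,-}, ERR_{i,+}); boundary indices are
# clipped inward -- an assumption of this sketch.
import numpy as np

def D(X, y):
    """D_i(X, y) = 2|f[x_{i-1}, x_i, x_{i+1}]| for i = 1, ..., n-1."""
    slopes = np.diff(y) / np.diff(X)                 # f[x_{i-1}, x_i]
    return 2 * np.abs(np.diff(slopes) / (X[2:] - X[:-2]))

def err_bounds(X, y, C0, hbar):
    """max over +/- of ERR_{i,+/-}(X, y) for i = 1, ..., n."""
    n = len(X) - 1
    d = D(X, y)                                      # d[j] = D_{j+1}
    inflate = lambda h: C0 * hbar / (hbar - h) if h < hbar else np.inf
    err = np.empty(n)
    for i in range(1, n + 1):
        w = (X[i] - X[i - 1]) ** 2 / 8
        dm = d[np.clip(i - 2, 1, n - 1) - 1]         # D_{i-2}, clipped
        dp = d[np.clip(i + 1, 1, n - 1) - 1]         # D_{i+1}, clipped
        cm = inflate(X[i] - X[max(i - 3, 0)])        # C(x_i - x_{i-3})
        cp = inflate(X[min(i + 2, n)] - X[i - 1])    # C(x_{i+2} - x_{i-1})
        err[i - 1] = w * max(cm * dm, cp * dp)
    return err
```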

  11. Adaptive Linear Spline Algorithm
    Given $n_{\mathrm{init}} \ge 4$ and $C_0 \ge 1$, set $\bar h = \dfrac{3(b - a)}{n_{\mathrm{init}} - 1}$, $C(h) = \dfrac{C_0 \bar h}{\bar h - h}$, $n = n_{\mathrm{init}}$, $x_i = a + i(b - a)/n$.
    Step 1. Compute the data-based $\mathrm{ERR}_{i,\pm}(X, y)$ for $i = 1, \ldots, n$.
    Step 2. Construct $\mathcal{I}$, the index set of subintervals that might be split:
    $\mathcal{I} = \{i \in 1{:}n : \mathrm{ERR}_{i \pm j, \mp}(X, y) > \varepsilon,\ j = 0, 1, 2\}$
    Step 3. If $\mathcal{I} = \emptyset$, return $\mathrm{ALG}(f, \varepsilon) = \mathrm{APP}(X, y)$ as the approximation satisfying the error tolerance. Otherwise split those intervals in $\mathcal{I}$ with largest width and go to Step 1 (acquisition function).
    Choi, S.-C. T., Ding, Y., Hickernell, F. J. & Tong, X. Local Adaption for Approximation and Minimization of Univariate Functions. J. Complexity 40, 17–33 (2017).
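
A sketch of the driver loop, reusing `err_bounds` from the previous sketch. Two simplifications relative to the slide: only $\mathrm{ERR}_{i,\pm}$ itself is checked (not the $j = 1, 2$ neighbors), and $f$ is assumed vectorized and re-evaluated on the whole design each pass:

```python
# Simplified sketch of the adaptive linear spline driver above.
import numpy as np

def adaptive_linear_spline(f, a, b, eps, C0=10.0, n_init=20, n_max=10_000):
    X = np.linspace(a, b, n_init + 1)                # x_i = a + i(b - a)/n
    hbar = 3 * (b - a) / (n_init - 1)
    while True:
        y = f(X)
        err = err_bounds(X, y, C0, hbar)             # Step 1
        flagged = np.nonzero(err > eps)[0]           # Step 2 (j = 0 only)
        if flagged.size == 0:                        # Step 3: tolerance met
            return X, y
        if len(X) > n_max:
            raise RuntimeError("budget exhausted")
        widths = np.diff(X)[flagged]                 # split widest flagged
        split = flagged[np.isclose(widths, widths.max())]
        X = np.sort(np.append(X, (X[split] + X[split + 1]) / 2))
```

For example, `X, y = adaptive_linear_spline(np.cos, 0.0, 10.0, 1e-3)` refines the design until the data-driven bound meets the tolerance.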

  15. Highlights of Adaptive Linear Spline Algorithm
    Defined for a cone candidate set, $\mathcal{C}$, whose definition does not depend on the algorithm
    Guaranteed to succeed for all $f \in \mathcal{C}$
    Candidate set $\mathcal{C}$ excludes spikes, i.e., two nearby inflection points
    $\mathcal{C}$ formalizes what you see is almost what you get
    Impossible to have an algorithm for all $f \in W^{2,\infty}$, since $W^{2,\infty}$ contains arbitrarily large functions that look like 0
    Adaptive algorithms do not help for ball candidate sets $\mathcal{C} = \{f : \|f''\|_\infty \le R\}$
    $\mathrm{cost}(\mathrm{ALG}, f, \varepsilon, \mathcal{C}) \le C_0 \sqrt{\|f''\|_{1/2}/\varepsilon} \asymp \mathrm{comp}(f, \varepsilon, \mathcal{C})$, i.e., the algorithm is optimal
    Does not allow for smoothness to be inferred
    Not multivariate
    Choi, S.-C. T., Ding, Y., Hickernell, F. J. & Tong, X. Local Adaption for Approximation and Minimization of Univariate Functions. J. Complexity 40, 17–33 (2017).

  24. Approximation via Reproducing Kernel Hilbert Spaces (RKHSs)
    $\mathcal{F}$ is a Hilbert space with reproducing kernel $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$:
    $K(X, X)$ is positive definite for all $X$, and $K(\cdot, x) \in \mathcal{F}$ with $f(x) = \langle K(\cdot, x), f \rangle_{\mathcal{F}}$ for all $x \in \mathcal{X}$,
    e.g., the Matérn kernel $K(t, x) = (1 + \|t - x\|_2) \exp(-\|t - x\|_2)$
    Optimal (minimum norm) interpolant: $\mathrm{APP}(X, y) = K(\cdot, X)\,(K(X, X))^{-1}\, y$, $y = f(X)$
    $\|f - \mathrm{APP}(X, y)\|_\infty^2 \le \underbrace{\|K(\cdot, \cdot) - K(\cdot, X)(K(X, X))^{-1} K(X, \cdot)\|_\infty}_{\text{known}}\, \|f - \mathrm{APP}(X, y)\|_{\mathcal{F}}^2$
    candidate set $\mathcal{C} = \{f \in \mathcal{F} : \|f - \mathrm{APP}(X, y)\|_{\mathcal{F}} \le C(X)\, \|f\|_{\mathcal{F}}\}$, so that
    $\mathrm{ERR}^2(X, y) := \|K(\cdot, \cdot) - K(\cdot, X)(K(X, X))^{-1} K(X, \cdot)\|_\infty\, \dfrac{C^2(X)}{1 - C^2(X)}\, \|\mathrm{APP}(X, y)\|_{\mathcal{F}}^2$
    Fasshauer, G. E. Meshfree Approximation Methods with MATLAB. (World Scientific Publishing Co., Singapore, 2007); Fasshauer, G. E. & McCourt, M. Kernel-based Approximation Methods using MATLAB. (World Scientific Publishing Co., Singapore, 2015).
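
A sketch of the minimum-norm interpolant and the resulting error bound, with two stated assumptions: the supremum over the domain is replaced by a maximum over a finite grid, and the inflation factor $C(X)$ is passed in as a constant `C_X`:

```python
# Sketch: kernel interpolant APP(X, y) and ERR(X, y) for the Matern kernel.
import numpy as np

def matern(T, X):
    """K(t, x) = (1 + ||t - x||_2) exp(-||t - x||_2), rows of T vs rows of X."""
    r = np.linalg.norm(T[:, None, :] - X[None, :, :], axis=2)
    return (1 + r) * np.exp(-r)

def rkhs_app(X, y, kernel=matern):
    """APP(X, y) = K(., X) (K(X, X))^{-1} y as a callable."""
    coef = np.linalg.solve(kernel(X, X), y)
    return lambda T: kernel(np.atleast_2d(T), X) @ coef

def rkhs_err(X, y, grid, C_X, kernel=matern):
    """ERR(X, y): max power function times the data-driven norm estimate."""
    Kxx = kernel(X, X)
    Kgx = kernel(grid, X)
    # K(x, x) = 1 for this Matern kernel, so the power function is:
    power = 1.0 - np.einsum('ij,ij->i', Kgx, np.linalg.solve(Kxx, Kgx.T).T)
    norm2 = y @ np.linalg.solve(Kxx, y)   # ||APP(X, y)||_F^2 = y^T K^{-1} y
    return np.sqrt(power.max() * C_X**2 / (1 - C_X**2) * norm2)
```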

  28. Error and Acquisition for Optimal RKHS Approximation
    With $\mathrm{APP}$, $\mathcal{C}$, and $\mathrm{ERR}$ as above, and noting that $\|\mathrm{APP}(X, y)\|_{\mathcal{F}}^2 = y^T (K(X, X))^{-1} y$:
    $\mathrm{ACQ}(x, X, y) := \left(K(x, x) - K(x, X)(K(X, X))^{-1} K(X, x)\right) \dfrac{C^2(X)}{1 - C^2(X)}\, y^T (K(X, X))^{-1} y$
    $x_{n+1} = \operatorname{argmax}_{x \in \mathcal{X}} \mathrm{ACQ}(x, X, y)$ (acquisition function)
    $\mathrm{ALG}(f, \varepsilon) = \mathrm{APP}(X, y)$ for $n^* = \min\{n \in \mathbb{N} : \mathrm{ERR}(X, y) \le \varepsilon\}$ (stopping criterion)
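
Since the factor $\frac{C^2(X)}{1 - C^2(X)}\, y^T(K(X, X))^{-1} y$ does not depend on $x$, maximizing ACQ over candidate sites reduces to maximizing the pointwise power function. A sketch reusing `matern` from the previous sketch:

```python
# Sketch of the acquisition step: the x-dependent part of ACQ is the
# power function, so the argmax over a finite candidate grid suffices.
import numpy as np

def next_site(X, y, candidates, kernel=matern):
    Kxx = kernel(X, X)
    Kcx = kernel(candidates, X)
    power = 1.0 - np.einsum('ij,ij->i', Kcx, np.linalg.solve(Kxx, Kcx.T).T)
    return candidates[int(np.argmax(power))]     # x_{n+1}
```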

  30. Must Infer the Kernel from y = f(X)
    $\mathcal{F}_\theta$ is a Hilbert space with reproducing kernel $K_\theta$
    $\mathcal{C} = \{f \in \cup_\theta \mathcal{F}_\theta : \|f - \mathrm{APP}(X, y)\|_{\mathcal{F}_{\theta^*}} \le C(X)\, \|f\|_{\mathcal{F}_{\theta^*}}\ \forall X,\ y = f(X)\}$, with $\theta^*(X, y)$ given below
    e.g., the Matérn family $K_\theta(t, x) = (1 + \theta\|t - x\|_2)\exp(-\theta\|t - x\|_2)$, or the modified Matérn family $K_\theta(t, x) = \exp(b^T(t + x))\,(1 + a\|t - x\|_2)\exp(-a\|t - x\|_2)$, $\theta = (a, b)$
    Choose $\theta$ (inspired by empirical Bayes) by minimizing the volume of the ellipsoid in $\mathbb{R}^n$ of function data yielding interpolants with no greater norm than that observed:
    $\theta^* = \operatorname{argmin}_\theta \left[\frac{1}{n}\log\det(K_\theta) + \log\left(y^T K_\theta^{-1} y\right)\right]$
    The error bound and acquisition function take the same form as above with $K = K_{\theta^*}$:
    $\mathrm{ERR}^2(X, y) = \|K(\cdot, \cdot) - K(\cdot, X)(K(X, X))^{-1} K(X, \cdot)\|_\infty\, \dfrac{C^2(X)}{1 - C^2(X)}\, y^T (K(X, X))^{-1} y$, and $x_{n+1} = \operatorname{argmax}_{x \in \mathcal{X}} \mathrm{ACQ}(x, X, y)$
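
A sketch of the parameter fit for the one-parameter Matérn family; the search interval and the diagonal jitter added for numerical stability are assumptions of this sketch, not from the slide:

```python
# Sketch: empirical-Bayes-style selection of theta for K_theta.
import numpy as np
from scipy.optimize import minimize_scalar

def matern_theta(T, X, theta):
    r = theta * np.linalg.norm(T[:, None, :] - X[None, :, :], axis=2)
    return (1 + r) * np.exp(-r)

def fit_theta(X, y, bounds=(1e-2, 1e2)):
    """theta* = argmin (1/n) log det K_theta + log(y^T K_theta^{-1} y)."""
    n = len(y)
    def objective(theta):
        K = matern_theta(X, X, theta) + 1e-10 * np.eye(n)   # jitter
        _, logdet = np.linalg.slogdet(K)
        return logdet / n + np.log(y @ np.linalg.solve(K, y))
    return minimize_scalar(objective, bounds=bounds, method='bounded').x
```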

  35. Cheng and Sandu Function
    $f(x) = \cos(x_1 + x_2)\exp(x_1 x_2)$, $\varepsilon = 0.05$
    [Figure: approximation error with the Matérn kernel and $\theta = 1$ vs. with the modified Matérn kernel and optimized $\theta$]
    Bingham, D. & Surjano, S. Virtual Library of Simulation Experiments. 2013. https://www.sfu.ca/~ssurjano/; Cheng, H. & Sandu, A. Collocation least-squares polynomial chaos method. in Proceedings of the 2010 Spring Simulation Multiconference, Society for Computer Simulation International (2010).
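
For completeness, the test function itself; the domain $[0, 1]^2$ follows the Virtual Library of Simulation Experiments entry and is not stated on the slide:

```python
# The bivariate Cheng and Sandu test function; domain [0, 1]^2 is an
# assumption taken from the Virtual Library of Simulation Experiments.
import numpy as np

def cheng_sandu(x):
    """f(x) = cos(x_1 + x_2) exp(x_1 x_2) for points in the rows of x."""
    x = np.atleast_2d(x)
    return np.cos(x[:, 0] + x[:, 1]) * np.exp(x[:, 0] * x[:, 1])
```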

  37. What Are the Right Ingredients for Adaptive Function Approximation?
    A fixed-budget homogeneous approximation, $\mathrm{APP} : \mathcal{X}^n \times \mathbb{R}^n \to L^\infty(\mathcal{X})$, with an error bound, e.g., linear splines or RKHS approximation
    An unbounded, non-convex candidate set, $\mathcal{C}$, for which the error bound can be bounded in a data-driven way; what you see is almost what you get
    Necessary conditions for $f$ to lie in $\mathcal{C}$; we will not have sufficient conditions
    A rich enough candidate set from which the right approximation can be inferred, with attention to underfitting and overfitting
    More work is needed on:
    What makes a good initial sample
    Balancing the richness of the candidate set with overfitting
    Numerical instability and computational effort challenges for larger numbers of data sites

  39. Thank you
    These slides are available at
    speakerdeck.com/fjhickernell/
    right-ingredients-for-adaptive-function-approximation

  40. References
    Choi, S.-C. T., Ding, Y., Hickernell, F. J. & Tong, X. Local Adaption for Approximation and Minimization of Univariate Functions. J. Complexity 40, 17–33 (2017).
    Fasshauer, G. E. Meshfree Approximation Methods with MATLAB. (World Scientific Publishing Co., Singapore, 2007).
    Fasshauer, G. E. & McCourt, M. Kernel-based Approximation Methods using MATLAB. (World Scientific Publishing Co., Singapore, 2015).
    Bingham, D. & Surjano, S. Virtual Library of Simulation Experiments. 2013. https://www.sfu.ca/~ssurjano/.
    Cheng, H. & Sandu, A. Collocation least-squares polynomial chaos method. in Proceedings of the 2010 Spring Simulation Multiconference, Society for Computer Simulation International (2010).
