… a Cone
Fred J. Hickernell
Department of Applied Mathematics, Center for Interdisciplinary Scientific Computation, Illinois Institute of Technology
hickernell@iit.edu · mypages.iit.edu/~hickernell
Joint work with Yuhan Ding, Peter Kritzer, and Simon Mak
This work was partially supported by NSF-DMS-1522687 and NSF-DMS-1638521 (SAMSI).
RICAM Workshop on Multivariate Algorithms and Information-Based Complexity, November 9, 2018
Context
Tidbits from talks this week, and my responses:
- Henryk, Klaus, Greg, Stefan, Erich, …: many results for f in a ball. Response: let's obtain analogous results for f in a cone?
- Houman: let's learn the appropriate kernel from the function data. Response: this will only work for nice functions in a cone.
- Klaus: "This adaptive algorithm has no theory." Response: we want to construct adaptive algorithms with theory.
- Henryk: tractability. Response: yes!
- Greg: POD weights. Response: yes!
- Mac: function values are expensive. Response: yes!
Multivariate Linear Problems
Given f ∈ F, find S(f) ∈ G, where S : F → G is linear, e.g.,
  S(f) = ∫_{ℝ^d} f(x) ϱ(x) dx (integration)
  S(f) = f (function recovery)
  S(f) = ∂f/∂x₁ (differentiation)
  −∇²S(f) = f, S(f) = 0 on the boundary (Poisson problem)
Successful algorithms:
  A(C, Λ) := {A : C × (0, ∞) → G such that ‖S(f) − A(f, ε)‖_G ≤ ε ∀f ∈ C ⊆ F, ε > 0},
where A(f, ε) depends on function values (Λ^std), Fourier coefficients (Λ^ser), or arbitrary linear functionals (Λ^all), e.g.,
  S_app(f, n) = ∑_{i=1}^n f(x_i) g_i, g_i ∈ G (Λ^std)
  S_app(f, n) = ∑_{i=1}^n f̂_i g_i, g_i ∈ G (Λ^ser)
  S_app(f, n) = ∑_{i=1}^n L_i(f) g_i, g_i ∈ G (Λ^all)
  A(f, ε) = S_app(f, n) + stopping criterion
C is a cone
Issues
Given f ∈ F, find S(f) ∈ G, where S : F → G is linear.
Successful algorithms:
  A(C, Λ) := {A : C × (0, ∞) → G such that ‖S(f) − A(f, ε)‖_G ≤ ε ∀f ∈ C ⊆ F, ε > 0},
where A(f, ε) depends on function values (Λ^std), Fourier coefficients (Λ^ser), or arbitrary linear functionals (Λ^all).
Solvability: A(C, Λ) ≠ ∅
Construction: find A ∈ A(C, Λ)
Cost: cost(A, f, ε) = # of function data; cost(A, C, ε, ρ) = max_{f ∈ C ∩ B_ρ} cost(A, f, ε), where B_ρ := {f ∈ F : ‖f‖_F ≤ ρ}
Complexity: comp(A(C, Λ), ε, ρ) = min_{A ∈ A(C, Λ)} cost(A, C, ε, ρ)
Optimality: cost(A, C, ε, ρ) ≤ comp(A(C, Λ), ωε, ρ)
Tractability: comp(A(C, Λ), ε, ρ) ≤ C ρ^p ε^{−p} d^q
Implementation in open-source software
Kunsch, R. J., Novak, E. & Rudolf, D. Solvable Integration Problems and Optimal Sample Size Selection. J. Complexity, to appear (2018).
Traub, J. F., Wasilkowski, G. W. & Woźniakowski, H. Information-Based Complexity (Academic Press, Boston, 1988).
Novak, E. & Woźniakowski, H. Tractability of Multivariate Problems Volume I: Linear Information. EMS Tracts in Mathematics 6 (European Mathematical Society, Zürich, 2008).
Cones vs. Balls
Ball: B_ρ := {f ∈ F : ‖f‖_F ≤ ρ}
(Non-convex) cone: C
Assume the set of inputs, C ⊆ F, is a cone, not a ball.
Cone means f ∈ C ⟹ af ∈ C ∀a ∈ ℝ.
Cones are unbounded.
If we can bound the error ‖S(f) − S_app(f, n)‖_G for f in the cone, then we can typically also bound the error for af.
Philosophy: what we cannot observe about f is not much worse than what we can observe about f.
"But I Like Balls!"
How might you construct an adaptive algorithm if you insist on using balls?
Step 1: Pick a ball with a default radius ρ, and assume the input f ∈ B_ρ.
Step 2: Choose n large enough that ‖S(f) − S_app(f, n)‖_G ≤ ‖S − S_app(·, n)‖_{F→G} ρ ≤ ε, where S_app(f, n) = ∑_{i=1}^n L_i(f) g_i.
Step 3: Let f_min ∈ F be the minimum-norm interpolant of the data L₁(f), …, L_n(f).
Step 4: If C‖f_min‖_F ≤ ρ for some preset inflation factor C, then return A(f, ε) = S_app(f, n); otherwise set ρ = 2C‖f_min‖_F and go to Step 2.
This succeeds for the cone consisting of those functions in F whose norms are not much larger than those of their minimum-norm interpolants.
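The four steps above can be sketched in code. The following is a minimal sketch, not the actual algorithm from the talk: it instantiates the loop for a toy problem, midpoint-rule integration on [0, 1] of Lipschitz functions, where the worst-case error over B_ρ is ρ/(4n) and the minimum-norm interpolant of point data is the minimal-Lipschitz interpolant, whose norm is the largest observed difference quotient.

```python
import math

def adaptive_ball(f, eps, rho=1.0, C=2.0):
    """Ball-inflation loop (Steps 1-4) for the toy problem S(f) = int_0^1 f,
    with F = Lipschitz functions on [0, 1], ||f||_F = Lipschitz constant,
    and worst-case midpoint-rule error rho / (4 n) over the ball B_rho."""
    while True:
        # Step 2: choose n so that the worst-case error bound rho/(4n) <= eps
        n = max(2, math.ceil(rho / (4 * eps)))
        xs = [(i + 0.5) / n for i in range(n)]
        ys = [f(x) for x in xs]
        approx = sum(ys) / n                   # S_app(f, n): midpoint rule
        # Step 3: norm of the minimum-norm (minimal-Lipschitz) interpolant
        # = largest difference quotient of the observed data
        h = 1.0 / n
        r = max(abs(ys[i + 1] - ys[i]) / h for i in range(n - 1))
        # Step 4: accept if the inflated norm estimate fits inside the ball
        if C * r <= rho:
            return approx
        rho = 2 * C * r                        # inflate the ball and retry
```

For instance, `adaptive_ball(math.sin, 1e-3)` starts with ρ = 1, finds the inflated norm estimate too large, doubles the ball, and then returns a midpoint-rule answer near 1 − cos 1.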
When Is (S : C ⊆ F → G, Λ) Solvable?
A(C, Λ) := {A : C × (0, ∞) → G such that ‖S(f) − A(f, ε)‖_G ≤ ε ∀f ∈ C ⊆ F, ε > 0}, where A(f, ε) depends on {L_i(f)}_{i=1}^n ∈ Λ^n ⊆ F*^n
Definition: (S : C ⊆ F → G, Λ) is solvable ⟺ A(C, Λ) ≠ ∅.
Lemma: f₁, f₂ ∈ C and A(f₁, ε) = A(f₂, ε) ⟹ ‖S(f₁ − f₂)‖_G ≤ 2ε.
Corollary: (S : C ⊆ F → G, Λ) solvable, f ∈ C, and A(f, ε) = A(0, ε) for some ε > 0 ⟹ S(f) = 0.
Theorem: for C a vector space, (S : C ⊆ F → G, Λ) is solvable ⟺ ∃ n ∈ ℕ, L ∈ Λ^n, g ∈ G^n such that S(f) = ∑_{i=1}^n L_i(f) g_i exactly ∀f ∈ C. (Proof: final slide.)
E.g., ∫_{[0,1]^d} ·(x) dx : C^{1,…,1}[0, 1]^d → ℝ is unsolvable for Λ^std and solvable for Λ^all.
Inputs and Outputs
Input: f = ∑_{j∈N} f̂(j) u_j.  Output: g = ∑_{j∈N} ĝ(j) v_j, where S(u_j) = v_j.
E.g., integration: S(u_j) = ∫_{[0,1]^d} u_j(x) dx = v_j; function recovery: S(u_j) = u_j = v_j.
Fixed-sample-size algorithm: S_app(f, n) = ∑_{i=1}^n L_i(f) g_i;  A(f, ε) = S_app(f, n) + stopping criterion.
Three scenarios presented here:
- Integration: uses Λ^std, but cost, complexity, etc. are lacking
- General linear problems: use Λ^ser; have cost, complexity, and optimality
- Function approximation: uses Λ^ser; learns coordinate and smoothness weights
Integration Using Function Values, Λ^std
S(f) = ∫_{[0,1]^d} f(x) dx,  f = ∑_{j∈N} f̂(j) u_j, where the u_j are cosine/sine or Walsh functions with S(u_j) = δ_{j,0}
S_app(f, n) = (1/n) ∑_{i=1}^n f(x_i)
|S(f) − S_app(f, n)| ≤ ∑_{0≠j∈dual set} |f̂(j)|
Dick, J., Kuo, F. & Sloan, I. H. High-dimensional integration: the quasi-Monte Carlo way. Acta Numer. 22, 133–288 (2013).
Cones of Integrands Whose Fourier Series Coefficients Decay Steadily
n₀ = 0, n_k = 2^{k−1}, k ∈ ℕ; the wavenumbers j_i are ordered; f̂(j) denotes a true coefficient and f̂_disc(j) a discrete coefficient.
σ_k(f) = ∑_{i=n_{k−1}+1}^{n_k} |f̂(j_i)|,  σ̂_{k,ℓ}(f) = ∑_{i=n_{k−1}+1}^{n_k} ∑_{m=1}^∞ |f̂(j_{i+m n_ℓ})|,  σ_disc,k(f) = ∑_{i=n_{k−1}+1}^{n_k} |f̂_disc(j_i)|
C := {f ∈ F : σ_ℓ(f) ≤ a₁ b₁^{ℓ−k} σ_k(f), σ̂_{k,ℓ}(f) ≤ a₂ b₂^{ℓ−k} σ_ℓ(f) ∀k ≤ ℓ},  a₁, a₂ > 1 > b₁, b₂
S_app(f, n_k) = (1/n_k) ∑_{i=1}^{n_k} f(x_i),  |S(f) − S_app(f, n_k)| ≤ ∑_{0≠j∈dual set} |f̂(j)| ≤ C(k) σ_disc,k(f)
A(f, ε) = S_app(f, n_k) for the smallest k satisfying C(k) σ_disc,k(f) ≤ ε
Have cost(A, f, ε); no cost(A, C, ε, ρ), comp(A(C, Λ^std), ε, ρ), or tractability results yet.
H., F. J. & Jiménez Rugama, Ll. A. Reliable Adaptive Cubature Using Digital Sequences. In Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Leuven, Belgium, April 2014 (eds Cools, R. & Nuyens, D.), 367–383 (Springer-Verlag, Berlin, 2016). arXiv:1410.8615 [math.NA].
Jiménez Rugama, Ll. A. & H., F. J. Adaptive Multidimensional Integration Based on Rank-1 Lattices. In Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Leuven, Belgium, April 2014 (eds Cools, R. & Nuyens, D.), 407–422 (Springer-Verlag, Berlin, 2016). arXiv:1411.1966.
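As a toy illustration of this stopping rule (not the rank-1 lattice or digital-sequence algorithms of the cited papers), the following sketch integrates a 1-d periodic function: the sample mean over n equispaced points plays the role of S_app(f, n_k), discrete Fourier coefficients are computed by a direct DFT, and a user-supplied inflation factor `c_infl` stands in for C(k).

```python
import cmath

def adaptive_periodic_integrate(f, eps, c_infl=5.0, max_k=8):
    # Cone-based stopping: double n until c_infl times the sum of the
    # most recent block of discrete Fourier coefficient magnitudes,
    # sigma_disc,k(f), falls below the tolerance eps.
    for k in range(2, max_k + 1):
        n = 2 ** k
        xs = [i / n for i in range(n)]
        ys = [f(x) for x in xs]
        mean = sum(ys) / n                       # S_app(f, n)
        # discrete Fourier coefficient magnitudes |f_disc(j)|, j = 1..n/2
        coefs = [abs(sum(y * cmath.exp(-2j * cmath.pi * j * x)
                         for x, y in zip(xs, ys)) / n)
                 for j in range(1, n // 2 + 1)]
        sigma_disc = sum(coefs[n // 4:])         # block n/4 < j <= n/2
        if c_infl * sigma_disc <= eps:
            return mean
    return mean                                  # sampling budget exhausted
```

For a smooth periodic integrand such as exp(sin 2πx), whose Fourier coefficients decay rapidly, the rule stops after only a few doublings.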
Solving General Linear Problems Using Series Coefficients, Λ^ser
F := {f = ∑_{j∈N} f̂(j) u_j : ‖f‖_F := ‖(f̂(j)/λ_j)_{j∈N}‖₂ < ∞},  λ affects convergence rate & tractability
G := {g = ∑_{j∈N} ĝ(j) v_j : ‖g‖_G := ‖ĝ‖₂},  v_j = S(u_j)
λ_{j₁} ≥ λ_{j₂} ≥ ···,  n₀ < n₁ < n₂ < ···,  σ_k(f) = (∑_{i=n_{k−1}+1}^{n_k} |f̂(j_i)|²)^{1/2}, k ∈ ℕ
C := {f ∈ F : σ_ℓ(f) ≤ a b^{ℓ−k} σ_k(f) ∀k ≤ ℓ},  a > 1 > b  (series coefficients decay steadily)
S_app(f, n_k) = ∑_{i=1}^{n_k} f̂(j_i) v_{j_i} is optimal for fixed n_k;  ‖S(f) − S_app(f, n_k)‖_G ≤ a b σ_k(f)/√(1 − b²)
A(f, ε) = S_app(f, n_k) for the smallest k satisfying σ_k(f) ≤ ε√(1 − b²)/(ab)
Ding, Y., H., F. J. & Jiménez Rugama, Ll. A. An Adaptive Algorithm Employing Continuous Linear Functionals. 2018+.
Solving General Linear Problems Using Series Coefficients, Λ^ser (continued)
cost(A, C, ε, ρ) = n_{ℓ†},  ℓ† := min{ℓ ∈ ℕ : ρ² ≤ ε²(1 − b²)/(a²b²) · [∑_{k=1}^{ℓ−1} b^{2(k−ℓ)}/(a²λ²_{n_{k−1}+1}) + 1/λ²_{n_{ℓ−1}+1}]^{−1}}
cost(A, C, ε, ρ) is essentially no worse than comp(A(C, Λ^all), ε, ρ).
No tractability results yet.
Ding, Y., H., F. J. & Jiménez Rugama, Ll. A. An Adaptive Algorithm Employing Continuous Linear Functionals. 2018+.
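The stopping rule σ_k(f) ≤ ε√(1 − b²)/(ab) can be sketched directly. A minimal sketch under the stated cone assumption: a hypothetical callable `coef` stands in for the series data f̂(j_i), and the identity problem v_j = u_j is assumed, so returning the first n_k coefficients represents S_app(f, n_k).

```python
import math

def adaptive_series(coef, eps, a=2.0, b=0.5, n=(0, 4, 16, 64, 256)):
    # sigma_k(f): l2 norm of the coefficient block n_{k-1} < i <= n_k.
    # Stop at the first k with sigma_k <= eps * sqrt(1 - b^2) / (a b),
    # which bounds ||S(f) - S_app(f, n_k)||_G <= eps on the cone.
    threshold = eps * math.sqrt(1 - b * b) / (a * b)
    for k in range(1, len(n)):
        sigma_k = math.sqrt(sum(coef(i) ** 2
                                for i in range(n[k - 1] + 1, n[k] + 1)))
        if sigma_k <= threshold:
            return [coef(i) for i in range(1, n[k] + 1)]  # S_app(f, n_k)
    raise ValueError("partition exhausted before the criterion was met")
```

With geometrically decaying coefficients, e.g. f̂(j_i) = 2^{−i}, the rule needs a few blocks before the block norm drops below the threshold, and the resulting ℓ²-tail error is then far below ε.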
Function Approximation when Function Values Are Expensive
F := {f = ∑_{j∈ℕ₀^d} f̂(j) u_j : ‖f‖_F := ‖(f̂(j)/λ_j)_j‖_∞ < ∞},  λ affects convergence rate & tractability
G := {g = ∑_{j∈ℕ₀^d} ĝ(j) v_j : ‖g‖_G := ‖ĝ‖₁},  v_j = S(u_j) = u_j
λ_j = Γ_{‖j‖₀} ∏_{ℓ=1, j_ℓ>0}^d γ_ℓ^{s_{j_ℓ}},  γ_ℓ = coordinate importance, Γ_r = order-r size, s_j = smoothness degree
POSD weights reflect effect sparsity, effect hierarchy, effect heredity, and effect smoothness.
λ_{j₁} ≥ λ_{j₂} ≥ ···,  S_app(f, n) = ∑_{i=1}^n f̂(j_i) u_{j_i},  ‖S(f) − S_app(f, n)‖_G ≤ ‖f‖_F ∑_{i=n+1}^∞ λ_{j_i}
C := those functions f ∈ F for which ‖f‖_F can be inferred from a pilot sample
Wu, C. F. J. & Hamada, M. Experiments: Planning, Analysis, and Parameter Design Optimization (John Wiley & Sons, Inc., New York, 2000).
Kuo, F. Y., Schwab, C. & Sloan, I. H. Quasi-Monte Carlo Finite Element Methods for a Class of Elliptic Partial Differential Equations with Random Coefficients. SIAM J. Numer. Anal. 50, 3351–3374 (2012).
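The weights λ_j can be computed directly from their definition. A minimal sketch with hypothetical argument conventions: `Gamma` is indexed by the order r = ‖j‖₀, `gamma` by coordinate, and `s` by polynomial degree.

```python
def posd_weight(j, Gamma, gamma, s):
    # lambda_j = Gamma_{||j||_0} * prod over active coordinates l
    # (those with j_l > 0) of gamma_l ** s_{j_l}
    active = [(l, jl) for l, jl in enumerate(j) if jl > 0]
    w = Gamma[len(active)]
    for l, jl in active:
        w *= gamma[l] ** s[jl]
    return w
```

With Γ_r ≡ 1 and s_j = j this reduces to plain product weights raised to the degree; choosing decaying γ_ℓ and increasing s_j encodes effect sparsity and effect smoothness, respectively.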
Cheng and Sandu Function
Chebyshev polynomials; order weights Γ_k = 1; coordinate weights γ_ℓ inferred; smoothness weights s_j inferred; Λ^std
Bingham, D. The Virtual Library of Simulation Experiments: Test Functions and Data Sets. 2017. https://www.sfu.ca/~ssurjano/index.html
Take-Home Messages
Assuming that the input functions lie in cones allows us to construct adaptive algorithms.
Cone definitions reflect prior beliefs and/or practical considerations.
Demonstration of concept:
- Integration using Λ^std: constructed A ∈ A(C, Λ^std); no cost, complexity, or tractability results yet
- General linear problems: constructed A ∈ A(C, Λ^ser); upper bound on cost(A, C, ε, ρ); lower bound on comp(A(C, Λ^all), ε, ρ); optimality; no tractability yet
- Function approximation (recovery): constructed A ∈ A(C, Λ^ser); the algorithm learns weights and works in practice for Λ^std; remaining theory in progress
Much to be done.
References
Kunsch, R. J., Novak, E. & Rudolf, D. Solvable Integration Problems and Optimal Sample Size Selection. J. Complexity, to appear (2018).
Traub, J. F., Wasilkowski, G. W. & Woźniakowski, H. Information-Based Complexity (Academic Press, Boston, 1988).
Novak, E. & Woźniakowski, H. Tractability of Multivariate Problems Volume I: Linear Information. EMS Tracts in Mathematics 6 (European Mathematical Society, Zürich, 2008).
Dick, J., Kuo, F. & Sloan, I. H. High-dimensional integration: the quasi-Monte Carlo way. Acta Numer. 22, 133–288 (2013).
H., F. J. & Jiménez Rugama, Ll. A. Reliable Adaptive Cubature Using Digital Sequences. In Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Leuven, Belgium, April 2014 (eds Cools, R. & Nuyens, D.), 367–383 (Springer-Verlag, Berlin, 2016). arXiv:1410.8615 [math.NA].
Jiménez Rugama, Ll. A. & H., F. J. Adaptive Multidimensional Integration Based on Rank-1 Lattices. In Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Leuven, Belgium, April 2014 (eds Cools, R. & Nuyens, D.), 407–422 (Springer-Verlag, Berlin, 2016). arXiv:1411.1966.
Choi, S.-C. T. et al. GAIL: Guaranteed Automatic Integration Library (Versions 1.0–2.2). MATLAB software, 2013–2017. http://gailgithub.github.io/GAIL_Dev/
Ding, Y., H., F. J. & Jiménez Rugama, Ll. A. An Adaptive Algorithm Employing Continuous Linear Functionals. 2018+.
Wu, C. F. J. & Hamada, M. Experiments: Planning, Analysis, and Parameter Design Optimization (John Wiley & Sons, Inc., New York, 2000).
Kuo, F. Y., Schwab, C. & Sloan, I. H. Quasi-Monte Carlo Finite Element Methods for a Class of Elliptic Partial Differential Equations with Random Coefficients. SIAM J. Numer. Anal. 50, 3351–3374 (2012).
Bingham, D. The Virtual Library of Simulation Experiments: Test Functions and Data Sets. 2017. https://www.sfu.ca/~ssurjano/index.html
Cools, R. & Nuyens, D. (eds). Monte Carlo and Quasi-Monte Carlo Methods: MCQMC, Leuven, Belgium, April 2014. Springer Proceedings in Mathematics & Statistics 163 (Springer-Verlag, Berlin, 2016).
Proof of Theorem for Solvability on a Vector Space
Let the cone C be a vector space, and let
- A be a successful algorithm,
- ε > 0 be any positive tolerance,
- {L₁, …, L_M} ⊂ Λ be the linear functionals used by A(0, ε), with {L₁, …, L_m} a basis for span({L₁, …, L_M}),
- n = min(m, dim(C)),
- {f₁, …, f_n} ⊂ C satisfy L_i(f_j) = δ_{i,j}.
For any f ∈ C, let f̃ = f − ∑_{i=1}^n L_i(f) f_i, and note that L_j(f̃) = 0 for j = 1, …, M. Thus A(f̃, ε) = A(0, ε), and so by the Corollary, 0 = S(f̃) = S(f) − ∑_{i=1}^n L_i(f) S(f_i), which implies S(f) = ∑_{i=1}^n L_i(f) S(f_i). ∎