
Ouafae Karmouda

(SIGMA team at CRIStAL laboratory, Lille, France)

https://s3-seminar.github.io/seminars/ouafae-karmouda/

Title — Speeding up of kernel-based learning for high-order tensor
Abstract — Supervised learning is a major task for classifying datasets. In our context, we are interested in classification of datasets of high-order tensors. The "curse of dimensionality" refers to the fact that the storage and computational complexities grow exponentially with the tensor order. As a consequence, the state-of-the-art method based on the Higher-Order SVD (HOSVD) works well but suffers from severe limitations in terms of complexity. In this work, we propose a fast Grassmannian kernel-based method for high-order tensor learning based on the equivalence between the Tucker and tensor-train decompositions. Our solution is linked to tensor networks, where the aim is to break the initial high-order tensor into a collection of low-order tensors (of order at most 3). We show on several real datasets that the proposed method reaches a classification accuracy similar to that of the Grassmannian kernel-based method based on the HOSVD, at a much lower complexity.

Biography — Ouafae Karmouda received a Master's degree in Applied Mathematics from the Faculty of Science and Technologies of Fes, Morocco, in 2018, and a Master's degree in Data Science from Aix-Marseille University in 2019. She is currently a second-year PhD student under the supervision of Rémy Boyer and Jérémie Boulanger in the SIGMA team at the CRIStAL laboratory, Lille. Her research interests focus on developing and improving machine learning algorithms for multidimensional data (tensors). She is particularly interested in kernel methods and deep learning techniques for high-dimensional data. The challenge when dealing with tensors lies in the "curse of dimensionality": the computational complexity grows exponentially with the order of the tensors. To mitigate this issue, she is interested in tensor network models, and particularly the Tensor Train decomposition.


S³ Seminar

March 12, 2021

Transcript

  1. SPEEDING UP OF KERNEL-BASED LEARNING FOR HIGH-ORDER TENSOR 12/03/2020 Presented

    by: Ouafae Karmouda (ouafae.karmouda@univ-lille.fr). Supervised by: Rémy Boyer and Jérémie Boulanger. Ouafae Karmouda 12/03/2020 1 / 24
  2. 1 Introduction 2 Background in Tensor Algebra 3 Method of

    the state of the art: A Kernel-based Framework to Tensorial Data Analysis 4 Fast Kernel Subspace Estimation based on Tensor Train Decomposition 5 Numerical Experiments 6 Conclusion
  3. Introduction Applications of Support Vector Machines (SVMs)
  4. Introduction Principle of SVMs Figure: SVMs look for the optimal

    hyperplane to separate the classes. SVMs assume the data are linearly separable. The parameters of the hyperplane can be computed by solving a quadratic optimization problem in which the inner product between data samples is needed.
  5. Introduction SVMs and Kernel functions Figure: Projection of data in

    a feature space. Kernel trick: k(·, ·) = ⟨φ(·), φ(·)⟩. Examples of kernel functions for vectors: Radial Basis Function (RBF) kernel: k(x, y) = exp(−γ‖x − y‖²). Polynomial kernel: k(x, y) = (x^T y + c)^d.
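The two vector kernels on this slide can be sketched in a few lines of NumPy; `rbf_kernel` and `polynomial_kernel` are hypothetical helper names, with `gamma`, `c` and `d` following the slide's notation.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def polynomial_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel: k(x, y) = (x^T y + c)^d."""
    return (x @ y + c) ** d

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])
print(rbf_kernel(x, y, gamma=0.5))   # exp(-1.0) ~= 0.3679
print(polynomial_kernel(x, y))       # (0 + 1)^2 = 1.0
```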
  6. Introduction How to define kernel functions for tensors?
  7. Background in Tensor Algebra What is a tensor? Algebraic

    view: A tensor is a multidimensional array. The order of a tensor is the number of its dimensions, also known as modes or ways. Figure: Different orders of a tensor.
  8. Background in Tensor Algebra Fibers and Slices of a 3rd-order

    tensor Figure: Top: fibers of a 3rd-order tensor. Bottom: slices of a 3rd-order tensor [G. Kolda and W. Bader, SIAM 2009].
  9. Background in Tensor Algebra Matrix representation of a higher-order tensor

    Figure: Mode-1, mode-2 and mode-3 matricization (unfolding) of a 3rd-order tensor. [Y. Chen, R. Xu, MIPPR 2009]
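A mode-n unfolding can be sketched with one NumPy reshape; `unfold` is a hypothetical helper name. Note that the column ordering here follows NumPy's row-major reshape rather than the Fortran-style ordering of [Kolda & Bader, 2009], which does not change the row space or rank used in the rest of the talk.

```python
import numpy as np

def unfold(X, mode):
    """Mode-n matricization: the mode-n fibers of X become the columns."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

X = np.arange(24).reshape(2, 3, 4)  # a 3rd-order tensor
for mode in range(3):
    print(unfold(X, mode).shape)    # (2, 12), (3, 8), (4, 6)
```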
  10. Background in Tensor Algebra Tensor n-Mode Multiplication Let X ∈

    R^(I1×I2×···×IN) and U ∈ R^(J×In). The n-mode product is denoted by Y = X ×n U; Y is of size I1 × ··· × In−1 × J × In+1 × ··· × IN, and its elements are given by Y(i1, ..., in−1, j, in+1, ..., iN) = Σ_{in=1}^{In} X(i1, ..., iN) U(j, in). In terms of unfolded tensors: Y = X ×n U ⇔ Y(n) = U X(n). Example: X ∈ R^(I1×I2×I3), B ∈ R^(M×I2); Y = X ×2 B ⇒ Y(2) = B X(2).
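The identity Y(n) = U X(n) gives a direct implementation of the n-mode product: unfold, multiply, fold back. `unfold`, `fold` and `mode_n_product` are hypothetical helper names for this sketch.

```python
import numpy as np

def unfold(X, mode):
    # mode-n matricization (mode-n fibers as columns)
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def fold(M, mode, shape):
    # inverse of unfold for a target tensor of the given shape
    full = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full), 0, mode)

def mode_n_product(X, U, mode):
    """Y = X x_n U, computed through the identity Y(n) = U X(n)."""
    shape = list(X.shape)
    shape[mode] = U.shape[0]
    return fold(U @ unfold(X, mode), mode, shape)

# Example from the slide: X in R^(I1 x I2 x I3), B in R^(M x I2)
X = np.arange(24.0).reshape(2, 3, 4)
B = np.ones((5, 3))
Y = mode_n_product(X, B, mode=1)
print(Y.shape)  # (2, 5, 4)
```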
  11. Background in Tensor Algebra Higher-Order SVD (HOSVD) Theorem Every complex

    (I1 × ··· × IQ)-tensor can be approximated by [L. De Lathauwer, B. De Moor et al., SIAM 2000]: X ≈ G ×1 U1 ×2 ... ×Q UQ, (1) where Uq is an Iq × Tq orthonormal matrix, G is a (T1 × ··· × TQ) all-orthogonal tensor, and the Tq are the multilinear ranks of X. The q-mode singular matrix Uq is the matrix of left singular vectors of the q-mode matrix unfolding. The complexity of the HOSVD is O(Q T I^Q). The core is obtained as G ≈ X ×1 U1^T ×2 ... ×Q UQ^T. (2)
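The truncated HOSVD can be sketched directly from its definition: each factor Uq collects the Tq leading left singular vectors of the mode-q unfolding, and the core follows from Eq. (2). `hosvd` and `unfold` are hypothetical helper names, not the authors' implementation.

```python
import numpy as np

def unfold(X, mode):
    # mode-q matricization
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def hosvd(X, ranks):
    """Truncated HOSVD: Uq = leading left singular vectors of X(q);
    core G = X x1 U1^T x2 ... xQ UQ^T (Eq. (2))."""
    U = [np.linalg.svd(unfold(X, q), full_matrices=False)[0][:, :r]
         for q, r in enumerate(ranks)]
    G = X
    for q, Uq in enumerate(U):
        G = np.moveaxis(np.tensordot(Uq.T, G, axes=(1, q)), 0, q)
    return G, U

# Sanity check: with full ranks, X = G x1 U1 x2 U2 x3 U3 exactly
X = np.random.default_rng(0).standard_normal((3, 4, 5))
G, U = hosvd(X, (3, 4, 5))
Xr = G
for q, Uq in enumerate(U):
    Xr = np.moveaxis(np.tensordot(Uq, Xr, axes=(1, q)), 0, q)
print(np.allclose(X, Xr))  # True
```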
  12. Background in Tensor Algebra Visual illustration of the HOSVD of a

    3rd-order tensor Figure: Visualisation of the HOSVD of a multilinear rank-(R1,R2,R3) tensor and the different spaces. [Multilinear singular value decomposition and low multilinear rank approximation, Tensorlab]
  13. Background in Tensor Algebra Question: How to define a similarity

    measure based on the multidimensional structure of input tensors? Possible answer: regard the input tensor as the collection of linear subspaces coming from each matricization, and define a kernel between subspaces on a Grassmann manifold.
  14. Method of the state of the art: A Kernel-based Framework

    to Tensorial Data Analysis Grassmann Manifold For integers n ≥ k > 0, the Grassmann manifold is defined by: G(n, k) = {span(M) : M ∈ R^(n×k), M^T M = Ik}. Figure: Example of a Grassmann manifold. X, Y, Z: points on the Grassmann manifold, i.e. subspaces. [Grassmannian Learning, 2018]
  15. Method of the state of the art: A Kernel-based Framework

    to Tensorial Data Analysis Kernel on a Grassmann manifold Consider the HOSVD of X, Y ∈ R^(I1×···×IQ): X = G ×1 U1 ×2 ... ×Q UQ (3) and Y = H ×1 V1 ×2 ... ×Q VQ (4). The kernel-based part of the method proposed in [M. Signoretto, L. De Lathauwer, J. Suykens, 2011] is: k(X, Y) = Π_{q=1}^{Q} k̃(span(Uq), span(Vq)), (5) where span(Uq), span(Vq) ∈ G(Iq, Tq) and k̃(span(Uq), span(Vq)) = exp(−2γ ‖Uq Uq^T − Vq Vq^T‖²_F).
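The subspace kernel k̃ of Eq. (5) compares the orthogonal projectors of the two subspaces; a minimal sketch, with `grassmann_kernel` as a hypothetical helper name:

```python
import numpy as np

def grassmann_kernel(U, V, gamma=1.0):
    """Projection kernel of Eq. (5): exp(-2*gamma*||U U^T - V V^T||_F^2),
    where U and V are orthonormal bases of subspaces in G(n, k)."""
    D = U @ U.T - V @ V.T
    return np.exp(-2.0 * gamma * np.linalg.norm(D, 'fro') ** 2)

# Identical subspaces give k = 1; orthogonal ones decay toward 0.
U = np.eye(4)[:, :2]
V = np.eye(4)[:, 2:]
print(grassmann_kernel(U, U))  # 1.0
print(grassmann_kernel(U, V))  # exp(-8) ~= 3.35e-4
```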
  16. Method of the state of the art: A Kernel-based Framework

    to Tensorial Data Analysis Limitation of the method of the state of the art The complexity of the HOSVD is O(Q T I^Q). The limitation becomes severe for higher-order tensors. Objective: reduce the complexity of the HOSVD. Means: use an algebraic equivalence between the HOSVD and the structured Tensor Train Decomposition (TTD). What is the TTD?
  17. Fast Kernel Subspace Estimation based on Tensor Train Decomposition Tensor-Train

    Decomposition (TTD) X(i1, . . . , iQ) = Σ_{r1=1}^{R1} ··· Σ_{rQ−1=1}^{RQ−1} G1(i1, r1) G2(r1, i2, r2) · · · GQ−1(rQ−2, iQ−1, rQ−1) GQ(rQ−1, iQ), where G1 ∈ R^(I1×R1), Gq ∈ R^(Rq−1×Iq×Rq) for q ∈ {2, . . . , Q − 1}, and GQ ∈ R^(RQ−1×IQ). Figure: TT decomposition of a Q-order tensor. [I. V. Oseledets, SIAM 2011]
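The TTD can be computed by the sequential-SVD algorithm of [I. V. Oseledets, SIAM 2011]; the sketch below, with `tt_svd` as a hypothetical helper name, produces cores with the shapes given above and is exact when the ranks are not truncated.

```python
import numpy as np

def tt_svd(X, ranks):
    """TT-SVD sketch: sequential truncated SVDs produce the cores
    G1 (I1 x R1), Gq (R_{q-1} x Iq x Rq), GQ (R_{Q-1} x IQ)."""
    dims = X.shape
    cores, C, r_prev = [], X.reshape(dims[0], -1), 1
    for q in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(ranks[q], len(s))
        cores.append(U[:, :r] if q == 0 else U[:, :r].reshape(r_prev, dims[q], r))
        C = (s[:r, None] * Vt[:r]).reshape(r * dims[q + 1], -1)
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1]))
    return cores

# With full ranks the decomposition is exact:
X = np.random.default_rng(0).standard_normal((2, 3, 4))
G1, G2, G3 = tt_svd(X, ranks=(2, 6))
Xr = np.einsum('ia,ajb,bk->ijk', G1, G2, G3)
print(np.allclose(X, Xr))  # True
```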
  18. Fast Kernel Subspace Estimation based on Tensor Train Decomposition Key

    property HOSVD: X = G ×1 U1 ×2 ... ×Q UQ. TTD: X = G1 ×^1_2 G2 · · · ×^1_{Q−1} GQ−1 ×^1_Q GQ. Interesting property: span(U1) = span(G1), span(UQ) = span(GQ^T), and span(Uq) = span(Fq), where Fq is the matrix of left singular vectors of the SVD of (Gq)(2). Figure: the case Tq = 2 with Uq = [U1q, U2q].
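The span equality can be checked numerically. The sketch below (all helper names hypothetical) builds a tensor of low multilinear rank, computes the last HOSVD factor UQ from the mode-Q unfolding and the last TT core GQ from a TT-SVD, and compares the orthogonal projectors onto span(UQ) and span(GQ^T).

```python
import numpy as np

def tt_svd(X, ranks):
    # TT-SVD via sequential truncated SVDs (sketch)
    dims = X.shape
    cores, C, r_prev = [], X.reshape(dims[0], -1), 1
    for q in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(ranks[q], len(s))
        cores.append(U[:, :r] if q == 0 else U[:, :r].reshape(r_prev, dims[q], r))
        C = (s[:r, None] * Vt[:r]).reshape(r * dims[q + 1], -1)
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1]))
    return cores

# A 3rd-order tensor with multilinear ranks (2, 3, 2)
rng = np.random.default_rng(0)
G = rng.standard_normal((2, 3, 2))
A = rng.standard_normal((3, 2))
B = rng.standard_normal((4, 3))
C = rng.standard_normal((5, 2))
X = np.einsum('abc,ia,jb,kc->ijk', G, A, B, C)

# HOSVD factor U3: 2 leading left singular vectors of the mode-3 unfolding
U3 = np.linalg.svd(np.moveaxis(X, 2, 0).reshape(5, -1),
                   full_matrices=False)[0][:, :2]

# Last TT core G3 (R2 x I3): span(G3^T) should equal span(U3)
G3 = tt_svd(X, ranks=(2, 2))[-1]
F3 = np.linalg.svd(G3.T, full_matrices=False)[0][:, :2]
print(np.allclose(U3 @ U3.T, F3 @ F3.T))  # True
```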
  19. Fast Kernel Subspace Estimation based on Tensor Train Decomposition Equivalence

    between the TTD and the HOSVD Assume that the tensor X follows a Q-order HOSVD of multilinear rank (T1, · · · , TQ). A TTD of X is given by [Zniyed, Boyer et al., LAA]: G1 = U1; Gq = Tq ×2 Uq for 1 < q < q̄, with Tq = reshape(I_{Rq}; T1 · · · Tq−1, Tq, T1 · · · Tq); G_{q̄} = T_{q̄} ×2 U_{q̄}, with T_{q̄} = reshape(G; R_{q̄−1}, T_{q̄}, R_{q̄}); Gq = Tq ×2 Uq for q̄ < q < Q, with Tq = reshape(I_{Rq−1}; Tq · · · TQ, Tq, Tq+1 · · · TQ); GQ = UQ^T. Here q̄ is the smallest q that verifies Π_{i=1}^{q} Ti ≥ Π_{i=q+1}^{Q} Ti.
  20. Fast Kernel Subspace Estimation based on Tensor Train Decomposition FAKSETT:

    Fast Kernel Subspace Estimation based on Tensor Train decomposition Consider the TTD of X, Y ∈ R^(I1×···×IQ): X = G1 ×^1_2 G2 · · · ×^1_Q GQ (6) and Y = G′1 ×^1_2 G′2 · · · ×^1_Q G′Q (7). The kernel-based part of the proposed method is [Karmouda, Boulanger, Boyer, ICASSP 2021]: k(X, Y) = Π_{q=1}^{Q} k̃(span(Fq), span(F′q)), (8) where Fq is the matrix of left singular vectors of (Gq)(2), F′q is the matrix of left singular vectors of (G′q)(2), and k̃(span(Fq), span(F′q)) = exp(−2γ ‖Fq Fq^T − F′q F′q^T‖²_F).
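The structure of Eq. (8) can be sketched as follows: extract an orthonormal basis Fq from each TT core and multiply the per-mode Grassmann kernels. This is an illustrative sketch under the conventions of the previous slides, not the authors' implementation; `core_subspace` and `faksett_kernel` are hypothetical names, and both tensors are assumed to share mode dimensions.

```python
import numpy as np

def core_subspace(G):
    """Fq: left singular vectors of the mode-2 unfolding of a 3rd-order
    TT core; boundary cores G1 (I1 x R1) and GQ^T are used directly."""
    M = G if G.ndim == 2 else np.moveaxis(G, 1, 0).reshape(G.shape[1], -1)
    return np.linalg.svd(M, full_matrices=False)[0]

def faksett_kernel(cores_x, cores_y, gamma=1.0):
    """k(X, Y) = prod_q exp(-2*gamma*||Fq Fq^T - F'q F'q^T||_F^2),
    with Fq, F'q extracted from the TT cores of X and Y (Eq. (8))."""
    cx = cores_x[:-1] + [cores_x[-1].T]  # span(UQ) = span(GQ^T)
    cy = cores_y[:-1] + [cores_y[-1].T]
    k = 1.0
    for GX, GY in zip(cx, cy):
        F, Fp = core_subspace(GX), core_subspace(GY)
        P, Pp = F @ F.T, Fp @ Fp.T
        k *= np.exp(-2.0 * gamma * np.linalg.norm(P - Pp, 'fro') ** 2)
    return k

# Identical TT cores give k = 1:
rng = np.random.default_rng(0)
cores = [rng.standard_normal((4, 2)),
         rng.standard_normal((2, 3, 2)),
         rng.standard_normal((2, 5))]
print(faksett_kernel(cores, cores))  # 1.0
```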
  21. Numerical Experiments Classification performance for the UCF11 dataset UCF11:

    composed of videos containing human actions, each of size 240 (frames) × 240 × 320 × 3. Figure: Two of the human actions considered.
    s%   m-ranks     FAKSETT       native method
    50%  [2,2,2,2]   0.72 (10−2)   0.73 (10−2)
    60%  [3,3,3,3]   0.70 (10−2)   0.70 (10−2)
    80%  [3,3,3,3]   0.76 (10−2)   0.77 (10−2)
    Table: Mean accuracy (standard deviation) on test data for the UCF11 database.
  22. Numerical Experiments Classification performance for the Extended Yale dataset Extended Yale

    dataset: contains images of 28 human subjects, of size 9 (poses) × 480 × 640 (image dimensions) × 16 (illuminations). Figure: 3 classes of the Extended Yale dataset.
    s%   m-ranks     FAKSETT       native method
    50%  [1,3,2,1]   0.98 (10−2)   0.99 (10−2)
    60%  [1,2,2,1]   0.99 (10−2)   0.99 (10−2)
    Table: Mean accuracy (standard deviation) on test data for the Extended Yale database.
  23. Numerical Experiments Computational time

    Database        m-ranks     FAKSETT       native method
    UCF11           [2,2,2,2]   14 (0.42)     69 (3)
    UCF11           [3,3,3,3]   15 (0.63)     104 (5)
    Extended Yale   [1,2,2,1]   2.56 (0.09)   9.47 (0.1)
    Table: Mean time (standard deviation), in seconds, to compute the HOSVD for the different databases w.r.t. different values of the multilinear ranks.
  24. Conclusion Despite good classification performance, the method of

    the state of the art suffers from a high complexity cost. We exploit an algebraic link between the TTD and the HOSVD to speed up the native method. We have proposed the FAKSETT method. FAKSETT reaches similar classification scores while considerably reducing the computational time.