Slide 1

SPEEDING UP OF KERNEL-BASED LEARNING FOR HIGH-ORDER TENSORS

Presented by: Ouafae Karmouda ([email protected])
Supervised by: Rémy Boyer and Jérémie Boulanger
12/03/2020

Slide 2

1 Introduction
2 Background in Tensor Algebra
3 Method of the state of the art: A Kernel-based Framework to Tensorial Data Analysis
4 Fast Kernel Subspace Estimation based on Tensor Train Decomposition
5 Numerical Experiments
6 Conclusion

Slide 3

Introduction

Applications of Support Vector Machines (SVMs)

Slide 4

Introduction
Principle of SVMs

Figure: SVMs look for the optimal hyperplane that separates the classes.

SVMs assume the data are linearly separable. The parameters of the hyperplane are computed by solving a quadratic optimization problem in which only inner products between data samples are needed.
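
To make the role of the inner product explicit, here is the standard soft-margin SVM dual (a textbook formulation, not specific to this talk):

```latex
\max_{\alpha}\ \sum_{i=1}^{N} \alpha_i
 - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}
   \alpha_i \alpha_j\, y_i y_j\, \langle x_i, x_j\rangle
\quad \text{s.t.}\quad 0 \le \alpha_i \le C,\quad \sum_{i=1}^{N} \alpha_i y_i = 0.
```

The data enter only through the inner products ⟨xi, xj⟩, so replacing them by a kernel k(xi, xj) is all that is needed: this is the kernel trick of the next slide.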

Slide 5

Introduction
SVMs and Kernel functions

Figure: Projection of the data into a feature space.

Kernel trick: k(·, ·) = ⟨φ(·), φ(·)⟩

Examples of kernel functions for vectors:
- Radial Basis Function (RBF) kernel: k(x, y) = exp(−γ ||x − y||^2).
- Polynomial kernel: k(x, y) = (x^T y + c)^d.
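
As a concrete illustration, here is a minimal NumPy sketch of these two kernels (my own, not from the talk):

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.linalg.norm(x - y) ** 2)

def polynomial_kernel(x, y, c=1.0, d=3):
    """Polynomial kernel: k(x, y) = (x^T y + c)^d."""
    return (x @ y + c) ** d
```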

Slide 6

Introduction

How to define kernel functions for tensors?

Slide 7

Background in Tensor Algebra
What is a tensor?

Algebraic view: A tensor is a multidimensional array. The order of a tensor is the number of its dimensions, also known as modes or ways.

Figure: Different orders of a tensor

Slide 8

Background in Tensor Algebra
Fibers and Slices of a 3rd-order tensor

Figure: Top: fibers of a 3rd-order tensor. Bottom: slices of a 3rd-order tensor [T. G. Kolda and B. W. Bader, SIAM 2009].

Slide 9

Background in Tensor Algebra
Matrix representation of a higher-order tensor

Figure: Mode-1, mode-2 and mode-3 matricization (unfolding) of a 3rd-order tensor [Y. Chen, R. Xu, MIPPR 2009].

Slide 10

Background in Tensor Algebra
Tensor n-Mode Multiplication

Let X ∈ R^{I1×I2×···×IQ} and U ∈ R^{J×In}. The n-mode product, denoted Y = X ×n U, yields a tensor Y of size I1 × ··· × In−1 × J × In+1 × ··· × IQ whose elements are given by:

y_{i1,...,in−1, j, in+1,...,iQ} = ∑_{in=1}^{In} x_{i1...iQ} u_{j,in}.

In terms of unfolded tensors: Y = X ×n U ⇒ Y(n) = U X(n).

Example: X ∈ R^{I1×I2×I3}, B ∈ R^{M×I2}. Y = X ×2 B ⇒ Y(2) = B X(2).
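
A minimal NumPy sketch of the unfolding and the n-mode product (the helper names unfold, fold and mode_n_product are mine, not from the talk):

```python
import numpy as np

def unfold(X, n):
    """Mode-n unfolding X_(n): move mode n to the front, then flatten."""
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def fold(M, n, shape):
    """Inverse of unfold: rebuild the tensor of the given shape."""
    rest = [s for i, s in enumerate(shape) if i != n]
    return np.moveaxis(M.reshape([shape[n]] + rest), 0, n)

def mode_n_product(X, U, n):
    """Y = X x_n U, computed via the identity Y_(n) = U X_(n)."""
    shape = list(X.shape)
    shape[n] = U.shape[0]
    return fold(U @ unfold(X, n), n, shape)

# Example from the slide: X in R^{I1 x I2 x I3}, B in R^{M x I2}
X = np.random.randn(4, 5, 6)
B = np.random.randn(3, 5)
Y = mode_n_product(X, B, 1)   # Y has shape (4, 3, 6)
```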

Slide 11

Background in Tensor Algebra
Higher-Order SVD (HOSVD)

Theorem. Every complex (I1 × ··· × IQ)-tensor can be approximated by [L. De Lathauwer, B. De Moor et al., SIAM 2000]:

X ≈ G ×1 U1 ×2 ··· ×Q UQ,   (1)

- Uq is an Iq × Tq orthonormal matrix.
- G is a (T1 × ··· × TQ) all-orthogonal tensor.
- The Tq are the multilinear ranks of X.
- The q-mode singular matrix Uq is the left singular matrix of the q-mode matrix unfolding.
- The complexity of HOSVD is O(Q T I^Q).

The core is obtained as:

G ≈ X ×1 U1^T ×2 ··· ×Q UQ^T.   (2)
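
A truncated-HOSVD sketch building on the helpers above (an assumed implementation of Eqs. (1)-(2), not the authors' code):

```python
import numpy as np

def hosvd(X, ranks):
    """Truncated HOSVD: Uq = Tq leading left singular vectors of X_(q),
    and the core G = X x_1 U1^T x_2 ... x_Q UQ^T."""
    factors = []
    for q, Tq in enumerate(ranks):
        Uq, _, _ = np.linalg.svd(unfold(X, q), full_matrices=False)
        factors.append(Uq[:, :Tq])
    G = X
    for q, Uq in enumerate(factors):
        G = mode_n_product(G, Uq.T, q)   # project mode q onto span(Uq)
    return G, factors
```

Each mode requires the SVD of an Iq × ∏(other dims) unfolding, which is the source of the O(Q T I^Q) cost quoted above.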

Slide 12

Background in Tensor Algebra
Visual illustration of the HOSVD of a 3rd-order tensor

Figure: Visualisation of the HOSVD of a multilinear rank-(R1, R2, R3) tensor and the different spaces. [Multilinear singular value decomposition and low multilinear rank approximation, Tensorlab]

Slide 13

Background in Tensor Algebra

Question: How to define a similarity measure based on the multidimensional structure of input tensors?

Possible answer:
- Regard the input tensor as a collection of linear subspaces, one coming from each matricization.
- Define a kernel between subspaces on a Grassmann manifold.

Slide 14

Method of the state of the art: A Kernel-based Framework to Tensorial Data Analysis
Grassmann Manifold

For integers n ≥ k > 0, the Grassmann manifold is defined by:

G(n, k) = {span(M) : M ∈ R^{n×k}, M^T M = Ik}.

Figure: Example of a Grassmann manifold. X, Y, Z: points on the Grassmann manifold, i.e. subspaces. [Grassmannian Learning, 2018]
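
In computations, a point on G(n, k) is represented by an orthonormal basis matrix; a small sketch (mine, for illustration) draws a random such point via a QR factorization:

```python
import numpy as np

def random_grassmann_point(n, k, rng=None):
    """Orthonormal n x k basis M (M^T M = I_k); span(M) lies on G(n, k)."""
    rng = np.random.default_rng() if rng is None else rng
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    return Q
```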

Slide 15

Method of the state of the art: A Kernel-based Framework to Tensorial Data Analysis
Kernel on a Grassmann manifold

Consider the HOSVD of X, Y ∈ R^{I1×···×IQ}:

X = G ×1 U1 ×2 ··· ×Q UQ   (3)
Y = H ×1 V1 ×2 ··· ×Q VQ   (4)

The kernel-based part of the method proposed in [M. Signoretto, L. De Lathauwer, J. Suykens, 2011] is:

k(X, Y) = ∏_{q=1}^{Q} k̃(span(Uq), span(Vq)),   (5)

where span(Uq), span(Vq) ∈ G(Iq, Tq) and

k̃(span(Uq), span(Vq)) = exp(−2γ ||Uq Uq^T − Vq Vq^T||_F^2).
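
A sketch of the kernel of Eq. (5) (an assumed implementation; Uq, Vq are the HOSVD factors, e.g. from the hosvd sketch above):

```python
import numpy as np

def grassmann_kernel(U, V, gamma):
    """k~(span(U), span(V)) = exp(-2*gamma*||U U^T - V V^T||_F^2)."""
    D = U @ U.T - V @ V.T
    return np.exp(-2.0 * gamma * np.linalg.norm(D, 'fro') ** 2)

def product_kernel(Us, Vs, gamma):
    """Product over all Q modes, as in Eq. (5)."""
    k = 1.0
    for U, V in zip(Us, Vs):
        k *= grassmann_kernel(U, V, gamma)
    return k
```

Note that ||U U^T − V V^T||_F depends only on the spans of U and V, not on the particular orthonormal bases chosen, so the kernel is well defined on the Grassmann manifold.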

Slide 16

Method of the state of the art: A Kernel-based Framework to Tensorial Data Analysis
Limitation of the method of the state of the art

- The complexity of HOSVD is O(Q T I^Q).
- The limitation becomes severe for higher-order tensors.

Objective: reduce the complexity of the HOSVD.
Means: use an algebraic equivalence between the HOSVD and the structured Tensor Train Decomposition (TTD).

What is the TTD?

Slide 17

Fast Kernel Subspace Estimation based on Tensor Train Decomposition
Tensor-Train Decomposition (TTD)

X(i1, ..., iQ) = ∑_{r1=1}^{R1} ··· ∑_{rQ−1=1}^{RQ−1} G1(i1, r1) G2(r1, i2, r2) ··· GQ−1(rQ−2, iQ−1, rQ−1) GQ(rQ−1, iQ),

where:
- G1 ∈ R^{I1×R1},
- Gq ∈ R^{Rq−1×Iq×Rq} for q ∈ {2, ..., Q−1},
- GQ ∈ R^{RQ−1×IQ}.

Figure: TT decomposition of a Q-order tensor. [I. V. Oseledets, SIAM 2011]
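
A minimal TT-SVD sketch following [Oseledets, SIAM 2011] (an assumed implementation with prescribed TT-ranks, not the authors' code):

```python
import numpy as np

def tt_svd(X, ranks):
    """TTD of X with TT-ranks (R1, ..., R_{Q-1}) via sequential SVDs."""
    shape, Q = X.shape, X.ndim
    cores, r_prev = [], 1
    C = X
    for q in range(Q - 1):
        C = C.reshape(r_prev * shape[q], -1)
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(ranks[q], len(s))
        cores.append(U[:, :r].reshape(r_prev, shape[q], r))
        C = np.diag(s[:r]) @ Vt[:r, :]   # carry the remainder to the next core
        r_prev = r
    cores.append(C)   # last core GQ has shape R_{Q-1} x IQ
    return cores      # first core is stored as (1, I1, R1); squeeze it for G1
```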

Slide 18

Fast Kernel Subspace Estimation based on Tensor Train Decomposition
Key property

HOSVD: X = G ×1 U1 ×2 ··· ×Q UQ
TTD: X = G1 ×2^1 G2 ··· ×{Q−1}^1 GQ−1 ×Q^1 GQ

Interesting property:
span(U1) = span(G1), span(UQ) = span(GQ^T),
span(Uq) = span(Fq), where Fq = matrix of the left singular vectors of the SVD of (Gq)(2).

Figure: In the case of Tq = 2, with Uq = [Uq^1, Uq^2]

Slide 19

Fast Kernel Subspace Estimation based on Tensor Train Decomposition
Equivalence between the TTD and the HOSVD

Assume that the tensor X follows a Q-order HOSVD of multilinear rank-(T1, ..., TQ). A TTD of X is given by [Zniyed, Boyer et al., LAA]:

- G1 = U1
- Gq = Tq ×2 Uq for 1 < q < q̄, with Tq = reshape(I_{Rq}; T1···Tq−1, Tq, T1···Tq)
- Gq̄ = Ḡ ×2 Uq̄, with Ḡ = reshape(G; Rq̄−1, Tq̄, Rq̄)
- Gq = Tq ×2 Uq for q̄ < q < Q, with Tq = reshape(I_{Rq−1}; Tq···TQ, Tq, Tq+1···TQ)
- GQ = UQ^T

q̄ is the smallest q that verifies ∏_{i=1}^{q} Ti ≥ ∏_{i=q+1}^{Q} Ti.

Slide 20

Fast Kernel Subspace Estimation based on Tensor Train Decomposition
FAKSETT: Fast Kernel Subspace Estimation based on Tensor Train decomposition

Consider the TTD of X, Y ∈ R^{I1×···×IQ}:

X = G1 ×2^1 G2 ··· ×{Q−1}^1 GQ−1 ×Q^1 GQ   (6)
Y = G′1 ×2^1 G′2 ··· ×{Q−1}^1 G′Q−1 ×Q^1 G′Q   (7)

The kernel-based part of the proposed method is [Karmouda, Boulanger, Boyer, ICASSP 2021]:

k(X, Y) = ∏_{q=1}^{Q} k̃(span(Fq), span(F′q)),   (8)

where Fq = matrix of the left singular vectors of (Gq)(2), F′q = matrix of the left singular vectors of (G′q)(2), and

k̃(span(Fq), span(F′q)) = exp(−2γ ||Fq Fq^T − F′q (F′q)^T||_F^2).
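
A sketch of the FAKSETT kernel of Eq. (8), reusing the tt_svd, unfold and product_kernel sketches above (the variable names are mine, not the authors' code):

```python
import numpy as np

def faksett_bases(cores, mranks):
    """Fq = mranks[q] leading left singular vectors of (Gq)_(2);
    for the border cores, span(G1) and span(GQ^T) are used directly."""
    Q, Fs = len(cores), []
    for q, G in enumerate(cores):
        if q == 0:
            M = G.reshape(G.shape[-2], -1)   # G1: I1 x R1
        elif q == Q - 1:
            M = G.T                          # GQ^T: IQ x R_{Q-1}
        else:
            M = unfold(G, 1)                 # (Gq)_(2): Iq x (R_{q-1} Rq)
        U, _, _ = np.linalg.svd(M, full_matrices=False)
        Fs.append(U[:, :mranks[q]])
    return Fs

def faksett_kernel(X, Y, tt_ranks, mranks, gamma):
    """k(X, Y) of Eq. (8): product of Grassmann kernels over the modes."""
    Fx = faksett_bases(tt_svd(X, tt_ranks), mranks)
    Fy = faksett_bases(tt_svd(Y, tt_ranks), mranks)
    return product_kernel(Fx, Fy, gamma)
```

The speed-up comes from replacing the SVDs of the large Iq × ∏(other dims) unfoldings by SVDs of the much smaller TT-cores.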

Slide 21

Numerical Experiments
Classification performance on the UCF11 dataset

UCF11: composed of videos of human actions, each of size 240 (nb. frames) × 240 × 320 × 3 (frame dimensions).

Figure: Two of the human actions considered.

s%    m-ranks     FAKSETT        native method
50%   [2,2,2,2]   0.72 (10^−2)   0.73 (10^−2)
60%   [3,3,3,3]   0.70 (10^−2)   0.70 (10^−2)
80%   [3,3,3,3]   0.76 (10^−2)   0.77 (10^−2)

Table: Mean accuracy (standard deviation) on test data for the UCF11 database

Slide 22

Numerical Experiments
Classification performance on the Extended Yale dataset

Extended Yale dataset: contains images of 28 human subjects, of size 9 (nb. poses) × 480 × 640 (image dimensions) × 16 (nb. illuminations).

Figure: 3 classes of the Extended Yale dataset.

s%    m-ranks     FAKSETT        native method
50%   [1,3,2,1]   0.98 (10^−2)   0.99 (10^−2)
60%   [1,2,2,1]   0.99 (10^−2)   0.99 (10^−2)

Table: Mean accuracy (standard deviation) on test data for the Extended Yale database

Slide 23

Numerical Experiments
Computational time

Database        m-ranks     FAKSETT       native method
UCF11           [2,2,2,2]   14 (0.42)     69 (3)
UCF11           [3,3,3,3]   15 (0.63)     104 (5)
Extended Yale   [1,2,2,1]   2.56 (0.09)   9.47 (0.1)

Table: Mean time (standard deviation), in seconds, to compute the subspace estimates for the different databases w.r.t. different values of the multilinear ranks.

Slide 24

Conclusion

- Despite its good classification performance, the method of the state of the art suffers from a high computational cost.
- We exploit an algebraic link between the TTD and the HOSVD to speed up the native method.
- We have proposed the FAKSETT method.
- FAKSETT reaches similar scores while considerably reducing the computational time.