FG 2021

Olivier Lézoray

December 05, 2021

Transcript

  1. Lightweight Deep Symmetric Positive Definite Manifold Network for Real-Time 3D Hand Gesture Recognition

    Mostefa Ben naceur, Luc Brun, Olivier Lezoray. Normandie Univ, ENSICAEN, CNRS, UNICAEN, GREYC, Caen, France
  2. Context

    (a) 22 hand joints (graph).
    (b) Transforming hand joints into structured data (2D grid), where each joint has 3 dimensions (x, y, z).
    (c) The 2D grid with 3 dimensions is the input to our proposed model, where each node of the 2D grid has at most 9 neighbors including itself.

    Input:
    \[
    X^t = \begin{pmatrix}
    20 & 16 & 12 & 8 & 4 \\
    19 & 15 & 11 & 7 & 3 \\
    18 & 14 & 10 & 6 & 2 \\
    17 & 13 & 9 & 5 & 1
    \end{pmatrix}
    \]
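
The grid layout on this slide can be sketched in a few lines of Python. This is only an illustration: it assumes the 20 joint indices shown in the input are arranged in 4 rows and 5 columns, and that "at most 9 neighbors including itself" means a 3x3 window clipped at the grid border. The names `GRID` and `neighbors` are hypothetical and do not come from the slides.

```python
# Joint indices laid out as in the slide's input grid: 4 rows x 5 columns.
GRID = [
    [20, 16, 12, 8, 4],
    [19, 15, 11, 7, 3],
    [18, 14, 10, 6, 2],
    [17, 13, 9, 5, 1],
]


def neighbors(row, col, grid=GRID):
    """Return the joints in the 3x3 window around (row, col), clipped at the border."""
    n_rows, n_cols = len(grid), len(grid[0])
    return [
        grid[r][c]
        for r in range(max(0, row - 1), min(n_rows, row + 2))
        for c in range(max(0, col - 1), min(n_cols, col + 2))
    ]
```

Under these assumptions an interior node such as `neighbors(1, 1)` has 9 neighbors, while a corner node such as `neighbors(0, 0)` has only 4.
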
  3. Our proposed model: Deep SPDNet for offline 3D hand gesture recognition (1/2)

    2DConv:
    \[ X^{t}_{k} = \sum_{j \in \mathcal{N}_i} W_{k,i} \, X^{t}_{k-1,j} \]
    Gmap:
    \[ X^{sb}_{s,j} = \begin{bmatrix} \Sigma^{sb}_{s,j} + \mu^{sb}_{s,j} (\mu^{sb}_{s,j})^{T} & \mu^{sb}_{s,j} \\ (\mu^{sb}_{s,j})^{T} & 1 \end{bmatrix} \tag{1} \]
    Reig:
    \[ X^{1,sb}_{s,j,k} = U_{k-1} \max(\epsilon I, V_{k-1}) \, U_{k-1}^{T} \]
    LogEig:
    \[ X^{2,sb}_{s,j} = U_{k-1} \log(V_{k-1}) \, U_{k-1}^{T} \tag{2} \]
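
The Gmap, Reig, and LogEig layers on this slide can be sketched with NumPy. This is a minimal illustration, not the authors' implementation: the function names, the default value of `eps`, and the use of a biased covariance estimate are all assumptions made for the example.

```python
import numpy as np


def gauss_embed(points):
    """Gmap sketch: embed the Gaussian (mu, Sigma) of `points` as in Eq. (1).

    `points` has shape (n_samples, d); the result has shape (d + 1, d + 1).
    """
    mu = points.mean(axis=0)
    centered = points - mu
    sigma = centered.T @ centered / len(points)  # assumption: biased estimate
    d = mu.shape[0]
    out = np.empty((d + 1, d + 1))
    out[:d, :d] = sigma + np.outer(mu, mu)  # top-left block: Sigma + mu mu^T
    out[:d, d] = mu                         # top-right block: mu
    out[d, :d] = mu                         # bottom-left block: mu^T
    out[d, d] = 1.0
    return out


def re_eig(sym, eps=1e-4):
    """Reig sketch: U max(eps * I, V) U^T, clamping eigenvalues from below."""
    vals, vecs = np.linalg.eigh(sym)
    return (vecs * np.maximum(vals, eps)) @ vecs.T


def log_eig(spd):
    """LogEig sketch: U log(V) U^T, the matrix logarithm of an SPD matrix."""
    vals, vecs = np.linalg.eigh(spd)
    return (vecs * np.log(vals)) @ vecs.T
```

Clamping in `re_eig` matters when the covariance is rank-deficient (for example, fewer samples than dimensions): without it, zero eigenvalues would make the logarithm in `log_eig` diverge.
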
  4. Our proposed model: Deep SPDNet for offline 3D hand gesture recognition (2/2)

    VecMat:
    \[ x^{sb}_{s,j} = f_v(X^{2,sb}_{s,j}) = \left[ X^{2,sb}_{s,j}(1,1), \sqrt{2}\, X^{2,sb}_{s,j}(1,2), \ldots, \sqrt{2}\, X^{2,sb}_{s,j}(1, d^{c}_{out}), X^{2,sb}_{s,j}(2,2), \sqrt{2}\, X^{2,sb}_{s,j}(2,3), \ldots, X^{2,sb}_{s,j}(d^{c}_{out}, d^{c}_{out}) \right]^{T} \tag{3} \]
    BiMap:
    \[ X = f_b(X_1, \ldots, X_N; W_1, \ldots, W_N) = \sum_{i=1}^{N} W_i X_i W_i^{T} \tag{4} \]
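
Equations (3) and (4) translate almost directly into NumPy. The sketch below is illustrative and assumes each input is a square symmetric array; the function names `vec_mat` and `bi_map` are hypothetical, and how the weights are trained or constrained is not shown.

```python
import numpy as np


def vec_mat(sym):
    """VecMat, Eq. (3): upper triangle row by row, off-diagonals scaled by sqrt(2)."""
    rows, cols = np.triu_indices(sym.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return scale * sym[rows, cols]


def bi_map(mats, weights):
    """BiMap, Eq. (4): sum over i of W_i X_i W_i^T."""
    return sum(w @ x @ w.T for x, w in zip(mats, weights))
```

The sqrt(2) factor makes the Euclidean norm of the vector equal to the Frobenius norm of the symmetric matrix, since each off-diagonal entry appears twice in the matrix but only once in the vector.
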
  5. Our proposed model: Deep SPDNet and TDN pipeline for online 3D hand gesture recognition
  6. Experiments: Performance of Deep SPDNet and TDN pipeline to detect and recognize the sequences of clips

    Method              Color  Depth  Pose  Accuracy (%)
    HON4D [1]           ✗      ✓      ✗     70.61
    Lie Group [2]       ✗      ✗      ✓     82.69
    HBRNN [3]           ✗      ✗      ✓     77.40
    JOULE-all [4]       ✓      ✓      ✓     78.78
    Two stream-all [5]  ✓      ✗      ✗     75.30
    Novel View [6]      ✗      ✓      ✗     69.21
    LSTM [7]            ✗      ✗      ✓     80.14
    Gram Matrix [8]     ✗      ✗      ✓     85.39
    T Forests [9]       ✗      ✗      ✓     80.69
    M Learning [10]     ✗      ✗      ✓     84.35
    G Manifolds [11]    ✗      ✗      ✓     77.57
    ST-TS-HGR-NET [12]  ✗      ✗      ✓     93.22
    Two-stream NN [13]  ✗      ✗      ✓     90.26
    Deep SPDNet         ✗      ✗      ✓     90.96

    10 C/S         20 C/S         40 C/S         80 C/S         160 C/S
    91.71(±0.65)   90.44(±1.4)    89.78(±1.5)    89.34(±1.56)   89.09(±1.51)
    88.99(±0.92)   87.61(±1.57)   86.36(±2.23)   85.37(±2.59)   84.72(±2.66)
    85.27(±0.88)   83.24(±2.15)   81.43(±3.14)   79.76(±4.05)   78.14(±4.90)
  7. Conclusion

    We propose a novel Deep SPD model for skeleton-based offline 3D hand gesture recognition.
    The model integrates symmetric positive definite (SPD) matrices to learn statistical and discriminative hand gesture representations.
    A stream of 10-frame clips is sufficient for online 3D hand gesture recognition with our proposed pipeline of Deep SPD and TDN networks.
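
One way to picture the "stream of 10-frame clips" is a small generator that buffers incoming frames. This is a sketch under stated assumptions, not the pipeline from the slides: it assumes consecutive, non-overlapping clips and drops a trailing partial clip, neither of which the slides specify. The name `clips_from_stream` is hypothetical.

```python
def clips_from_stream(frames, clip_len=10):
    """Yield consecutive, non-overlapping clips of `clip_len` frames from an iterable."""
    clip = []
    for frame in frames:
        clip.append(frame)
        if len(clip) == clip_len:
            yield clip
            clip = []
    # Assumption: a trailing partial clip is dropped rather than padded.
```

With these assumptions, a stream of 25 frames yields two clips (frames 0 to 9 and frames 10 to 19), and the last 5 frames are discarded.
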
  8. Bibliography
  9. [1] Oreifej, O., & Liu, Z. (2013). HON4D: Histogram of oriented 4D normals for activity recognition from depth sequences. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 716-723).

    [2] Vemulapalli, R., Arrate, F., & Chellappa, R. (2014). Human action recognition by representing 3D skeletons as points in a Lie group. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 588-595).
    [3] Du, Y., Wang, W., & Wang, L. (2015). Hierarchical recurrent neural network for skeleton based action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1110-1118).
    [4] Hu, J. F., Zheng, W. S., Lai, J., & Zhang, J. (2015). Jointly learning heterogeneous features for RGB-D activity recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5344-5352).
    [5] Feichtenhofer, C., Pinz, A., & Zisserman, A. (2016). Convolutional two-stream network fusion for video action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1933-1941).
  10. [6] Rahmani, H., & Mian, A. (2016). 3D action recognition from novel viewpoints. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1506-1515).

    [7] Zhu, W., Lan, C., Xing, J., Zeng, W., Li, Y., Shen, L., & Xie, X. (2016, March). Co-occurrence feature learning for skeleton based action recognition using regularized deep LSTM networks. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 30, No. 1).
    [8] Zhang, X., Wang, Y., Gou, M., Sznaier, M., & Camps, O. (2016). Efficient temporal sequence comparison and classification using Gram matrix embeddings on a Riemannian manifold. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4498-4507).
    [9] Garcia-Hernando, G., & Kim, T. K. (2017). Transition forests: Learning discriminative temporal transitions for action recognition and detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 432-440).
    [10] Huang, Z., & Van Gool, L. (2017, February). A Riemannian network for SPD matrix learning. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 31, No. 1).
  11. [11] Huang, Z., Wu, J., & Van Gool, L. (2018, April). Building deep networks on Grassmann manifolds. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 32, No. 1).

    [12] Nguyen, X. S., Brun, L., Lézoray, O., & Bougleux, S. (2019). A neural network based on SPD manifold learning for skeleton-based hand gesture recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 12036-12045).
    [13] Li, C., Li, S., Gao, Y., Zhang, X., & Li, W. (2021). A Two-stream Neural Network for Pose-based Hand Gesture Recognition. arXiv preprint arXiv:2101.08926.