FG 2021

Lightweight Deep Symmetric Positive Definite Manifold Network for Real-Time 3D Hand Gesture Recognition
Mostefa Ben naceur, Luc Brun, Olivier Lezoray
Normandie Univ, ENSICAEN, CNRS, UNICAEN, GREYC, Caen, France

Context
(a) 22 hand joints (graph)
(b) Transforming hand joints into structured data (2D grid), where each joint has 3 dimensions (x, y, z)
(c) The 2D grid with 3 dimensions is the input to our proposed model; each node of the 2D grid has at most 9 neighbors, including itself

\[ \text{Input } X_t = \begin{bmatrix} 20 & 16 & 12 & 8 & 4 \\ 19 & 15 & 11 & 7 & 3 \\ 18 & 14 & 10 & 6 & 2 \\ 17 & 13 & 9 & 5 & 1 \end{bmatrix} \]
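As a rough illustration of panel (c), the sketch below lays the 20 joint indices out on a grid and lists the neighbors of one node. The 4 x 5 layout (read off the order of the indices on the slide), the helper name `neighbors`, and the use of NumPy are assumptions made for illustration; the slide only states that each node has at most 9 neighbors including itself.

```python
import numpy as np

# Assumed layout: 4 rows x 5 columns, joint indices in the order printed on the slide.
GRID = np.array([
    [20, 16, 12, 8, 4],
    [19, 15, 11, 7, 3],
    [18, 14, 10, 6, 2],
    [17, 13, 9, 5, 1],
])


def neighbors(grid, row, col):
    """Return the joint indices in the 3x3 window around (row, col).

    The window is clipped at the grid border and includes the node itself,
    so it holds at most 9 entries.
    """
    r0, r1 = max(row - 1, 0), min(row + 2, grid.shape[0])
    c0, c1 = max(col - 1, 0), min(col + 2, grid.shape[1])
    return grid[r0:r1, c0:c1].ravel().tolist()


# One frame of input: every grid node carries an (x, y, z) coordinate.
frame = np.zeros(GRID.shape + (3,))
```

An interior node such as `neighbors(GRID, 1, 1)` gets the full 9-node window, while a corner node such as `neighbors(GRID, 0, 0)` only gets 4.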

Our proposed model: Deep SPDNet for offline 3D hand gesture recognition (1/2)

2DConv:
\[ X^{t}_{k} = \sum_{j \in N_i} W_{k,i} \, X^{t}_{k-1,j} \]

Gmap:
\[ X^{sb}_{s,j} = \begin{bmatrix} \Sigma^{sb}_{s,j} + \mu^{sb}_{s,j} (\mu^{sb}_{s,j})^{T} & \mu^{sb}_{s,j} \\ (\mu^{sb}_{s,j})^{T} & 1 \end{bmatrix} \tag{1} \]

ReEig:
\[ X^{1,sb}_{s,j,k} = U_{k-1} \max(\epsilon I, V_{k-1}) \, U_{k-1}^{T} \]

LogEig:
\[ X^{2,sb}_{s,j} = U_{k-1} \log(V_{k-1}) \, U_{k-1}^{T} \tag{2} \]
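The Gaussian embedding, eigenvalue rectification, and matrix logarithm on this slide can be sketched in NumPy as follows. This is a minimal sketch, not the authors' implementation: the function names, the default `eps`, and the biased (1/n) covariance estimate are assumptions, and the batching over the slide's indices s, j, k is left out.

```python
import numpy as np


def gauss_embed(features):
    """Gmap-style embedding of a set of feature vectors.

    features: array of shape (n_samples, d).
    Returns the (d+1) x (d+1) block matrix [[Sigma + mu mu^T, mu], [mu^T, 1]].
    Assumption: Sigma is the biased (1/n) sample covariance.
    """
    mu = features.mean(axis=0)
    centered = features - mu
    sigma = centered.T @ centered / features.shape[0]
    d = mu.shape[0]
    out = np.empty((d + 1, d + 1))
    out[:d, :d] = sigma + np.outer(mu, mu)
    out[:d, d] = mu
    out[d, :d] = mu
    out[d, d] = 1.0
    return out


def re_eig(sym, eps=1e-4):
    """Rectification: clamp eigenvalues from below at eps (U max(eps I, V) U^T)."""
    vals, vecs = np.linalg.eigh(sym)
    return (vecs * np.maximum(vals, eps)) @ vecs.T


def log_eig(spd):
    """Matrix logarithm via eigendecomposition (U log(V) U^T).

    Expects strictly positive eigenvalues, e.g. the output of re_eig.
    """
    vals, vecs = np.linalg.eigh(spd)
    return (vecs * np.log(vals)) @ vecs.T
```

Applying `re_eig` before `log_eig` keeps the logarithm well defined when the embedded matrix is close to singular.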

Our proposed model: Deep SPDNet for offline 3D hand gesture recognition (2/2)

VecMat:
\[ x^{sb}_{s,j} = f_v(X^{2,sb}_{s,j}) = \left[ X^{2,sb}_{s,j}(1,1),\ \sqrt{2}\, X^{2,sb}_{s,j}(1,2),\ \ldots,\ \sqrt{2}\, X^{2,sb}_{s,j}(1, d^{c}_{out}),\ X^{2,sb}_{s,j}(2,2),\ \sqrt{2}\, X^{2,sb}_{s,j}(2,3),\ \ldots,\ X^{2,sb}_{s,j}(d^{c}_{out}, d^{c}_{out}) \right]^{T} \tag{3} \]

BiMap:
\[ X = f_b(X_1, \ldots, X_N; W_1, \ldots, W_N) = \sum_{i=1}^{N} W_i X_i W_i^{T} \tag{4} \]
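The vectorization and bilinear mapping on this slide can be sketched in NumPy as follows. The function names are hypothetical, and no constraint on the weights W_i is enforced here, since the slide does not state one.

```python
import numpy as np


def vec_mat(sym):
    """Vectorize a symmetric matrix as in Eq. (3).

    Upper-triangular entries are read row by row; off-diagonal entries are
    scaled by sqrt(2), so the vector's Euclidean norm equals the matrix's
    Frobenius norm.
    """
    rows, cols = np.triu_indices(sym.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return sym[rows, cols] * scale


def bi_map(mats, weights):
    """Bilinear mapping as in Eq. (4): the sum over i of W_i X_i W_i^T.

    mats: sequence of (d_in, d_in) symmetric matrices.
    weights: sequence of (d_out, d_in) matrices, one per input matrix.
    """
    return sum(w @ x @ w.T for w, x in zip(weights, mats))
```

For a d x d input, `vec_mat` returns d(d+1)/2 values, and `bi_map` maps N inputs of size d_in x d_in to one d_out x d_out symmetric matrix.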

Our proposed model: Deep SPDNet and TDN pipeline for online 3D hand gesture recognition

Experiments: Performance of Deep SPDNet and TDN pipeline to detect and recognize the sequences of clips

Method               Color  Depth  Pose  Accuracy (%)
HON4D [1]            ✗      ✓      ✗     70.61
Lie Group [2]        ✗      ✗      ✓     82.69
HBRNN [3]            ✗      ✗      ✓     77.40
JOULE-all [4]        ✓      ✓      ✓     78.78
Two stream-all [5]   ✓      ✗      ✗     75.30
Novel View [6]       ✗      ✓      ✗     69.21
LSTM [7]             ✗      ✗      ✓     80.14
Gram Matrix [8]      ✗      ✗      ✓     85.39
T Forests [9]        ✗      ✗      ✓     80.69
M Learning [10]      ✗      ✗      ✓     84.35
G Manifolds [11]     ✗      ✗      ✓     77.57
ST-TS-HGR-NET [12]   ✗      ✗      ✓     93.22
Two-stream NN [13]   ✗      ✗      ✓     90.26
Deep SPDNet          ✗      ✗      ✓     90.96

10 C/S          20 C/S          40 C/S          80 C/S          160 C/S
91.71 (±0.65)   90.44 (±1.4)    89.78 (±1.5)    89.34 (±1.56)   89.09 (±1.51)
88.99 (±0.92)   87.61 (±1.57)   86.36 (±2.23)   85.37 (±2.59)   84.72 (±2.66)
85.27 (±0.88)   83.24 (±2.15)   81.43 (±3.14)   79.76 (±4.05)   78.14 (±4.90)

Conclusion
- A novel Deep SPD model for skeleton-based offline 3D hand gesture recognition.
- Symmetric positive definite (SPD) matrices are integrated into our deep learning model to learn statistical and discriminative hand gesture representations.
- A stream of 10-frame clips is sufficient for online 3D hand gesture recognition with our proposed pipeline of Deep SPD and TDN networks.

Bibliography

[1] Oreifej, O., & Liu, Z. (2013). HON4D: Histogram of oriented 4D normals for activity recognition from depth sequences. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 716-723).
[2] Vemulapalli, R., Arrate, F., & Chellappa, R. (2014). Human action recognition by representing 3D skeletons as points in a Lie group. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 588-595).
[3] Du, Y., Wang, W., & Wang, L. (2015). Hierarchical recurrent neural network for skeleton based action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1110-1118).
[4] Hu, J. F., Zheng, W. S., Lai, J., & Zhang, J. (2015). Jointly learning heterogeneous features for RGB-D activity recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5344-5352).
[5] Feichtenhofer, C., Pinz, A., & Zisserman, A. (2016). Convolutional two-stream network fusion for video action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1933-1941).

[6] Rahmani, H., & Mian, A. (2016). 3D action recognition from novel viewpoints. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1506-1515).
[7] Zhu, W., Lan, C., Xing, J., Zeng, W., Li, Y., Shen, L., & Xie, X. (2016, March). Co-occurrence feature learning for skeleton based action recognition using regularized deep LSTM networks. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 30, No. 1).
[8] Zhang, X., Wang, Y., Gou, M., Sznaier, M., & Camps, O. (2016). Efficient temporal sequence comparison and classification using Gram matrix embeddings on a Riemannian manifold. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4498-4507).
[9] Garcia-Hernando, G., & Kim, T. K. (2017). Transition forests: Learning discriminative temporal transitions for action recognition and detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 432-440).
[10] Huang, Z., & Van Gool, L. (2017, February). A Riemannian network for SPD matrix learning. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 31, No. 1).

[11] Huang, Z., Wu, J., & Van Gool, L. (2018, April). Building deep networks on Grassmann manifolds. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 32, No. 1).
[12] Nguyen, X. S., Brun, L., Lézoray, O., & Bougleux, S. (2019). A neural network based on SPD manifold learning for skeleton-based hand gesture recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 12036-12045).
[13] Li, C., Li, S., Gao, Y., Zhang, X., & Li, W. (2021). A two-stream neural network for pose-based hand gesture recognition. arXiv preprint arXiv:2101.08926.
