
modeling_point_cloud.pdf


koki madono

June 24, 2019



Transcript

  1. Modeling Point Clouds with Self-Attention and Gumbel Subset Sampling.

     Jiancheng Yang, Qiang Zhang, Bingbing Ni, Linguo Li, Jinxian Liu, Mengdie Zhou, Qi Tian. Shanghai Jiao Tong University; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; Huawei Noah's Ark Lab. CVPR 2019.
  2. Overview.

     [Screenshot of the paper's title block, author list, and abstract; figure panels labeled Furthest Point Sampling, Group Shuffle Attention, Gumbel Subset Sampling, and outlier attention weights.]
     The paper models point clouds with Transformer-style self-attention and proposes an end-to-end trainable, permutation-invariant down-sampling (Gumbel Subset Sampling) in place of Furthest Point Sampling.
  3. Background.

     Point clouds are unordered sets, so networks must be permutation-invariant; prior work aggregates per-point features with symmetric functions such as max or average (mean) pooling.
     Such pooling summarizes the set but ignores pairwise interactions between points; Multi-Head (self-)Attention is a more expressive aggregation that is still invariant to point ordering, which motivates an attention-based architecture for point clouds.
  4. Proposed method (overview).

     Point Attention Transformer (PAT), built from three components:
     (1) Absolute and Relative Position Embedding (ARPE), (2) Group Shuffle Attention (GSA), and (3) Gumbel Subset Sampling (GSS), an end-to-end trainable alternative to Furthest Point Sampling.
  5. Proposed method: Absolute and Relative Position Embedding (ARPE), input encoding.

     The input point set is X = {x_1, x_2, …, x_i, …, x_N}. Each point is re-encoded by its absolute position together with its positions relative to the other points:
     x_i' = {(x_i, x_j − x_i) | j ≠ i}.
     [Figure 2 (excerpt): Point Attention Transformer architecture for classification (top branch).]
  6. Proposed method: ARPE, neighborhood aggregation.

     In practice the relative pairs are restricted to a nearest-neighbor graph N(x_i), and the embedding aggregates over neighbors with a max:
     ARPE(x_i) = g ∘ max{ h(x_j') | x_j' ∈ N(x_i) },
     where g and h are MLPs with Group Norm. The max over the neighbor set keeps ARPE invariant to point ordering. A PyTorch sketch follows.
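A PyTorch sketch of ARPE under stated assumptions: the nearest-neighbor graph is built with torch.cdist/topk, and h and g are small placeholder MLPs (widths and k are illustrative, not the paper's configuration).

```python
import torch
import torch.nn as nn

class ARPE(nn.Module):
    """Absolute and Relative Position Embedding (sketch).

    For each point x_i, build pairs (x_i, x_j - x_i) over its k nearest
    neighbors, apply a shared MLP h, max-pool over neighbors, then apply g.
    """
    def __init__(self, in_dim=3, out_dim=128, k=16):
        super().__init__()
        self.k = k
        self.h = nn.Sequential(nn.Linear(2 * in_dim, 64), nn.ELU(),
                               nn.Linear(64, out_dim))
        self.g = nn.Sequential(nn.Linear(out_dim, out_dim), nn.ELU())

    def forward(self, x):                       # x: (B, N, 3)
        B, N, _ = x.shape
        d = torch.cdist(x, x)                   # pairwise distances (B, N, N)
        idx = d.topk(self.k + 1, largest=False).indices[..., 1:]  # drop self
        nbrs = torch.gather(                    # neighbor coords (B, N, k, 3)
            x.unsqueeze(1).expand(B, N, N, -1), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, x.shape[-1]))
        rel = nbrs - x.unsqueeze(2)             # relative positions x_j - x_i
        feat = torch.cat([x.unsqueeze(2).expand_as(rel), rel], dim=-1)
        return self.g(self.h(feat).max(dim=2).values)   # (B, N, out_dim)
```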
  7. Proposed method: Group Shuffle Attention (GSA) — overview.

     [Block diagram: Group Linear → in-group Self-Attention → Channel Shuffle, composed into the GSA block.]
     GSA is a parameter-efficient self-attention layer built from grouped linear transforms, dot-product attention within each group, and a channel shuffle that mixes information across groups.
  8. Proposed method: GSA — Group Linear.

     The c feature channels of each point are split into g groups of c_g = c/g channels:
     y = (y_1, …, y_{c_g}, y_{c_g+1}, …, y_c), with group j holding channels {y_{j·c_g + k} | k = 1, …, c_g}, j = 0, …, g − 1.
     Each group gets its own linear transform, so the parameter count scales with g·c_g² instead of c².
  9. Proposed method: GSA — Dot-Product In-Group Attention.

     Scaled dot-product self-attention is applied independently inside each channel group:
     Attn(Q, Y) = s(Q, Y) · Y,  where s(·,·) are the softmax attention weights;
     GroupAttn(Y) = concat({Attn(Y_i, Y_i)}_{i=1,…,g}),  Y_i = Y^(i) W_i,
     with Y^(i) the i-th channel group and W_i ∈ ℝ^{c_g × c_g} its learnable group-linear weight. A PyTorch sketch follows.
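A minimal sketch of the group linear plus in-group attention step, assuming plain self-attention in which each group's linearly transformed features act as queries, keys, and values; the name GroupAttn follows the slide, and shapes are illustrative.

```python
import math
import torch
import torch.nn as nn

class GroupAttn(nn.Module):
    """Group linear + dot-product in-group self-attention (sketch)."""
    def __init__(self, c, g):
        super().__init__()
        assert c % g == 0, "channels must divide evenly into groups"
        self.g, self.cg = g, c // g
        # one learnable c_g x c_g weight W_i per group
        self.W = nn.Parameter(torch.randn(g, self.cg, self.cg) / math.sqrt(self.cg))

    def forward(self, y):                        # y: (B, N, c)
        B, N, _ = y.shape
        yg = y.reshape(B, N, self.g, self.cg).transpose(1, 2)  # (B, g, N, c_g)
        yg = yg @ self.W                                       # Y_i = Y^(i) W_i
        s = torch.softmax(yg @ yg.transpose(-2, -1) / math.sqrt(self.cg), -1)
        return (s @ yg).transpose(1, 2).reshape(B, N, -1)      # concat groups
```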
  10. Proposed method: GSA — Channel Shuffle.

      GroupAttn alone never exchanges information between channel groups, so a channel shuffle permutes the channels across groups after the non-linearity, at zero parameter cost (a sketch follows).
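The shuffle itself is the standard reshape-transpose trick (as popularized by ShuffleNet); a sketch assuming channels-last (B, N, c) tensors:

```python
def channel_shuffle(y, g):
    """Permute channels so every new group mixes channels from all old groups."""
    B, N, c = y.shape
    return y.reshape(B, N, g, c // g).transpose(2, 3).reshape(B, N, c)
```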
  11. Proposed method: GSA — full block.

      GSA(Y) = GN(σ_s(GroupAttn(Y)) + Y),
      where σ_s is the shuffled non-linearity (ELU followed by channel shuffle) and GN is Group Norm. The residual connection (+ Y) keeps the block point-wise and lets GSA layers be stacked. A sketch composing the previous pieces follows.
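Composing the sketches from the previous slides into one block; nn.GroupNorm here is a stand-in whose details (grouping, affine parameters) may differ from the paper's normalization.

```python
import torch.nn as nn
import torch.nn.functional as F

class GSA(nn.Module):
    """Group Shuffle Attention: GSA(Y) = GN(sigma_s(GroupAttn(Y)) + Y)."""
    def __init__(self, c, g):
        super().__init__()
        self.attn = GroupAttn(c, g)          # from the slide-9 sketch
        self.norm = nn.GroupNorm(g, c)       # group norm over channels

    def forward(self, y):                    # y: (B, N, c)
        z = channel_shuffle(F.elu(self.attn(y)), self.attn.g)  # shuffled ELU
        z = z + y                            # residual connection
        # nn.GroupNorm expects (B, C, *): move channels, normalize, move back
        return self.norm(z.transpose(1, 2)).transpose(1, 2)
```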
  12. Proposed method: overall architecture.

      [Figure 2. Point Attention Transformer architecture for classification (top branch) and segmentation (bottom branch). The input points are first embedded into high-level representations through an Absolute and Relative Position Embedding (ARPE) module, resulting in some representative points (bigger in the figure). In classification, the features alternately pass through Group Shuffle Attention (GSA) blocks and down-sampling blocks, either Furthest Point Sampling (FPS) or our Gumbel Subset Sampling (GSS). In segmentation, only GSA layers are used. Finally, a shared MLP is connected to every point, followed by an element-wise classification loss or segmentation loss for training.]
      In short: classification alternates GSA with down-sampling; segmentation stacks GSA layers only. A skeleton of the classification branch follows.
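A skeleton of the classification branch built from the sketches above. Block count, widths, and the downsample hook are placeholders, and averaging the per-point logits is an assumption (the paper trains with an element-wise classification loss on every remaining point).

```python
import torch.nn as nn

class PATClassifier(nn.Module):
    """Classification branch (sketch): ARPE, then alternating GSA and
    down-sampling, then a shared MLP head on every point."""
    def __init__(self, c=128, g=8, num_classes=40, downsample=None):
        super().__init__()
        self.embed = ARPE(3, c)
        self.blocks = nn.ModuleList([GSA(c, g) for _ in range(3)])
        self.downsample = downsample         # e.g. FPS or GSS; None keeps all
        self.head = nn.Linear(c, num_classes)

    def forward(self, x):                    # x: (B, N, 3)
        y = self.embed(x)
        for blk in self.blocks:
            y = blk(y)
            if self.downsample is not None:
                y = self.downsample(y)       # drop points between blocks
        logits = self.head(y)                # shared MLP on every point
        return logits.mean(dim=1)            # aggregate per-point predictions
```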
  13. Proposed method: Gumbel Subset Sampling (GSS), motivation.

      [Diagram (b), Gumbel Subset Sampling: input (N_i, c) → FC layer → transposed scores (N_{i+1}, N_i) → add Gumbel noise, scale by 1/τ, softmax → dot-product with the input → output (N_{i+1}, c); annealing from τ = 1 toward τ → 0+.]
      Down-sampling is cast as a hard, discrete selection made end-to-end trainable with the Gumbel softmax: during training it provides smooth gradients via the discrete reparameterization trick, and with annealing it degenerates to a hard selection in the test phase.
      Input X_i ∈ ℝ^{N_i × c}; output X_{i+1} = GSS(X_i) ∈ ℝ^{N_{i+1} × c}.
  14. Proposed method: GSS, the sampling weights.

      [Same diagram as slide 13.]
      The selection weights are produced by the Gumbel softmax (Jang et al.), explained on the next slides.
  15. Gumbel softmax: motivation.

      We want to sample from a categorical distribution p(x) inside a network. A hard sample (e.g. taking the argmax over softmax(p(x)) scores) is a discrete operation, so gradients cannot flow back through the sampling step; the Gumbel softmax provides a differentiable relaxation.
  16. Gumbel softmax: definition.

      For class probabilities p(x) = (π_1, …, π_k), a relaxed sample y is drawn as
      y_i = exp((log π_i + g_i)/τ) / Σ_{j=1}^{k} exp((log π_j + g_j)/τ),  g_i ~ −log(−log Uniform(0, 1)),
      i.e. y = softmax((log p(x) + g)/τ) with Gumbel noise g and temperature τ: the noise makes the draw stochastic while the whole expression stays differentiable. A PyTorch sketch follows.
      [Jang et al., "Categorical Reparameterization with Gumbel-Softmax".]
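The formula translates directly into a few lines of PyTorch; a sketch (PyTorch also ships torch.nn.functional.gumbel_softmax, which implements the same relaxation):

```python
import torch

def gumbel_softmax_sample(log_pi, tau=1.0):
    """One differentiable sample from Categorical(pi).

    log_pi: (..., k) log-probabilities; tau: temperature.
    Returns soft one-hot weights y = softmax((log pi + g) / tau).
    """
    u = torch.rand_like(log_pi).clamp(min=1e-20)  # avoid log(0)
    g = -torch.log(-torch.log(u))                 # Gumbel(0, 1) noise
    return torch.softmax((log_pi + g) / tau, dim=-1)

# Example: with a small temperature the sample is nearly one-hot.
probs = torch.tensor([0.1, 0.6, 0.3])
y = gumbel_softmax_sample(probs.log(), tau=0.1)
```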
  17. Distribution of repeated samples.

      The density of the Gumbel-Softmax distribution (derived in Appendix B of Jang et al.) is
      p_{π,τ}(y_1, …, y_k) = Γ(k) τ^{k−1} (Σ_{i=1}^{k} π_i / y_i^τ)^{−k} Π_{i=1}^{k} (π_i / y_i^{τ+1}).
      This distribution was independently discovered by Maddison et al. (2016), where it is referred to as the Concrete distribution. As the softmax temperature τ approaches 0, samples become one-hot and the Gumbel-Softmax distribution becomes identical to the categorical distribution p(z).
      [Figure 1 (Jang et al., arXiv): the Gumbel-Softmax distribution interpolates between discrete one-hot-encoded categorical distributions (small τ) and near-uniform densities (large τ), shown for τ = 0.1, 0.5, 1.0, 10.0.]
      A small numerical check follows.
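A quick empirical check of the temperature effect, reusing the slide-16 sketch; a per-sample max near 1.0 indicates near-one-hot samples, while a value near 1/k indicates near-uniform ones.

```python
import torch

log_pi = torch.log(torch.tensor([0.1, 0.6, 0.3]))
for tau in (0.1, 0.5, 1.0, 10.0):
    samples = gumbel_softmax_sample(log_pi.expand(10000, 3), tau=tau)
    print(tau, samples.max(dim=-1).values.mean().item())
```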
  18. Proposed method: GSS, definition.

      [Same GSS diagram as slide 13.]
      GSS applies the Gumbel softmax row-wise with a learnable query matrix W:
      GSS(X_i) = gumbel_softmax(W X_i^T) · X_i,  W ∈ ℝ^{N_{i+1} × c},
      so each of the N_{i+1} output points is a soft (and, after annealing, hard) selection over the N_i input points, trained end-to-end. A sketch follows.
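A minimal GSS sketch matching this formula, with a learnable query matrix W and PyTorch's built-in F.gumbel_softmax; the hard flag realizes the annealed hard selection at test time (the initialization scale is an assumption).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GSS(nn.Module):
    """Gumbel Subset Sampling (sketch): select n_out of the input points."""
    def __init__(self, c, n_out):
        super().__init__()
        self.W = nn.Parameter(torch.randn(n_out, c) * 0.01)  # learnable queries

    def forward(self, x, tau=1.0, hard=False):   # x: (B, N_in, c)
        logits = self.W @ x.transpose(1, 2)      # (B, n_out, N_in) scores
        w = F.gumbel_softmax(logits, tau=tau, hard=hard, dim=-1)
        return w @ x                             # (B, n_out, c) selected points
```

During training τ is annealed toward 0; at evaluation one would call gss(x, tau=small, hard=True) to obtain the hard subset selection shown in the diagram.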
  19. Experiments.

      Evaluation on two tasks: shape classification on ModelNet40 (accuracy) and 3D semantic segmentation on S3DIS (mean per-class IoU).
  20. Results: ModelNet40 classification.

      Method            Points  Accuracy (%)
      DeepSets [51]      5,000  90.0
      PointNet [30]      1,024  89.2
      Kd-Net [19]        1,024  90.6
      PointNet++ [32]    1,024  90.7
      KCNet [34]         1,024  91.0
      DGCNN [42]         1,024  92.2
      PointCNN [23]      1,024  92.2
      PAT (GSA only)     1,024  91.3
      PAT (GSA only)       256  90.9
      PAT (FPS)          1,024  91.4
      PAT (FPS + GSS)    1,024  91.7
      Table 1. Classification performance on the ModelNet40 dataset.
  21. Results: model size and inference time.

      Method            Size (MB)  Time (ms)  Accuracy (%)
      PointNet [30]        40        25.3     89.2
      PointNet++ [32]      12       163.2     90.7
      DGCNN [42]           21        94.6     92.2
      PAT (GSA only)        5       132.9     91.3
      PAT (FPS)             5        87.6     91.4
      PAT (FPS + GSS)       5.8      88.6     91.7
      Table 2. Model size ("Size", MB), forward time ("Time", ms), and accuracy on the ModelNet40 dataset.
  22. Results: 3D semantic segmentation — setup.

      Evaluation on S3DIS with 6-fold cross-validation, reported as mean per-class IoU (mIoU).
  23. Results: 3D semantic segmentation on S3DIS.

      Method          mIoU   mIoU on Area 5  Size (MB)
      RSNet [13]      56.47  -               -
      SPGraph [20]    62.1   58.04           -
      PointNet [30]   47.71  47.6            4.7
      DGCNN [42]      56.1   -               6.9
      PointCNN [23]   65.39  57.26           46.2
      PAT             64.28  60.07           6.1
      Table 3. 3D semantic segmentation results on S3DIS. Mean per-class IoU (mIoU, %) is used as the evaluation metric. Model sizes are obtained using the official codes.
      PointCNN has the highest mIoU computed over the 6 folds, but PAT achieves the best mIoU on Area 5 at a fraction of the model size (6.1 MB vs. 46.2 MB).
  24. Summary.

      Proposed the Point Attention Transformer (PAT), which models point clouds with self-attention (ARPE + Group Shuffle Attention).
      Proposed Gumbel Subset Sampling, an end-to-end trainable, permutation-invariant down-sampling that replaces the non-learned Furthest Point Sampling.
      PAT reaches accuracy competitive with the state of the art at a much smaller model size.