Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

Open Source AI Association

December 20, 2024

Transcript

  1. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP) Developer: Kai Okawa, Mikiya Shibuya Supporter: Open Source AI Association 1
  2. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    What is GSRAP? Geometric Solvers for Reconstruction And Pose estimation
    • Functions (to be added)
      • Relative pose estimation from 2D-2D point correspondences
        • Nister's five-point algorithm
      • Absolute pose estimation from 2D-3D point correspondences
        • P3P algorithm, EPnP
      • Similarity transformation estimation from 3D-3D point correspondences
        • Umeyama algorithm
      • RANSAC for outlier rejection
        • A versatile framework for rejecting outliers
    • Features
      • C++17
      • Python wrapper
      • Tutorial and sample codes
  3. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Table of contents • Coordinate transformations and camera projection models • Epipolar geometry • Fundamental Matrix F • Essential Matrix E • Relative pose from 2D-2D corresponding points • Absolute pose from 2D-3D corresponding points • Rigid/Similarity transformation from 3D-3D corresponding points • Algorithm explanation • P3P algorithm (SE(3) estimation) • Umeyama algorithm (Sim(3) estimation) • Random Sample Consensus (RANSAC) • Source code explanation 3
  4. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Coordinate transformations and camera projection models 4
  5. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Coordinate transformations and projection models
    Project 3D points onto an image plane: world coordinate system → camera coordinate system → image coordinate system
    • Coordinate systems
      • World: common to all cameras
      • Camera: fixed to each camera
      • Image: fixed to each image plane
    • Intrinsic parameters: conversion between the image and camera coordinate systems
    • Extrinsic parameters: conversion between the camera and world coordinate systems
    [Figure: relationship between the world and camera coordinate systems ($R_{wc}, \mathbf{t}_{wc}$ and $R_{cw}, \mathbf{t}_{cw}$) and between the camera and image coordinate systems]
  6. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Coordinate transformations and projection models
    A camera records a scene by projecting light reflected off objects in space onto the image sensor
    • Represented by the intrinsic camera parameter K
    • The simplest perspective (pinhole) model:
      $\mathbf{z} = \frac{1}{Z_c} K' \mathbf{X}_c$, where
      $K' := \begin{pmatrix} f_u & 0 & c_u \\ 0 & f_v & c_v \end{pmatrix}$, $\quad K := \begin{pmatrix} f_u & 0 & c_u \\ 0 & f_v & c_v \\ 0 & 0 & 1 \end{pmatrix}$
    • $f_u, f_v$: focal lengths, $c_u, c_v$: optical center
    • $\mathbf{X}_c = (X_c, Y_c, Z_c)^T$: 3D point in the camera coordinate system
    • $\mathbf{z} = (u, v)^T$: 2D point in the image coordinate system
    [Figure: relationship between the world, camera, and image coordinate systems]
  7. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Coordinate transformations and projection models
    If the camera's coordinate system differs from the world's, first convert from world to camera coordinates (see the sketch below)
    • Substitute $\mathbf{X}_c = [R_{cw} \mid \mathbf{t}_{cw}] \bar{\mathbf{X}}_w$ into the projection model on the previous page:
      $\mathbf{z} = \frac{1}{Z_c} K' [R_{cw} \mid \mathbf{t}_{cw}] \bar{\mathbf{X}}_w$
      $\begin{pmatrix} u \\ v \end{pmatrix} = \frac{1}{Z_c} \begin{pmatrix} f_u & 0 & c_u \\ 0 & f_v & c_v \end{pmatrix} \begin{pmatrix} r^{cw}_{11} & r^{cw}_{12} & r^{cw}_{13} & t^{cw}_1 \\ r^{cw}_{21} & r^{cw}_{22} & r^{cw}_{23} & t^{cw}_2 \\ r^{cw}_{31} & r^{cw}_{32} & r^{cw}_{33} & t^{cw}_3 \end{pmatrix} \begin{pmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{pmatrix}$
    • $\mathbf{X}_c = (X_c, Y_c, Z_c)^T$: 3D point in the camera coordinate system
    • $\mathbf{X}_w = (X_w, Y_w, Z_w)^T$: 3D point in the world coordinate system
    • $R_{cw}, \mathbf{t}_{cw}$: rotation matrix and translation vector (world → camera)
    • $\bar{\mathbf{X}}_w = (X_w, Y_w, Z_w, 1)^T$: homogeneous coordinates of $\mathbf{X}_w$
    [Figure: relationship between the world, camera, and image coordinate systems]
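To make the chain of transformations concrete, here is a minimal NumPy sketch of the pinhole projection (not the GSRAP API); the intrinsic values and the pose used at the bottom are made up for illustration.

```python
import numpy as np

def project_point(X_w, K, R_cw, t_cw):
    """Project a 3D world point onto the image plane (pinhole model).

    X_w  : (3,) point in world coordinates
    K    : (3, 3) intrinsic matrix
    R_cw : (3, 3) rotation, world -> camera
    t_cw : (3,) translation, world -> camera
    """
    X_c = R_cw @ X_w + t_cw   # world -> camera coordinates
    z = K @ X_c               # camera -> homogeneous image coordinates
    return z[:2] / z[2]       # perspective division by Z_c

# Illustrative (made-up) parameters.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])
R_cw = np.eye(3)
t_cw = np.array([0.0, 0.0, 2.0])   # offset along the optical axis
print(project_point(np.array([0.1, -0.2, 3.0]), K, R_cw, t_cw))
```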
  8. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Preliminary: rotation matrix and rotation vector
    The rigid-body transformation T (6 DoF) and the similarity transformation S (7 DoF) are expressed with a rotation matrix $R \in SO(3)$, a translation vector $\mathbf{t} \in \mathbb{R}^3$, and a scale parameter $s$, respectively (e.g. $T = \begin{pmatrix} R & \mathbf{t} \\ 0 & 1 \end{pmatrix}$, $S = \begin{pmatrix} sR & \mathbf{t} \\ 0 & 1 \end{pmatrix}$)
    • SO(3): special orthogonal group of order 3 ⇒ 3D rotation
    • SE(3): special Euclidean group of order 3 ⇒ 3D rotation + translation
    • Sim(3): similarity transformation group of order 3 ⇒ 3D rotation + translation + scale
    • The Lie algebras corresponding to the above Lie groups: $\mathfrak{so}(3), \mathfrak{se}(3), \mathfrak{sim}(3)$
    The matrix representation R (3×3) is not suitable as an optimization variable (see the sketch below)
    • Expressed by 9 parameters despite having only 3 degrees of freedom
    • Must also satisfy the rotation-matrix constraints (orthogonality and determinant 1)
    H. Strasdat et al., "Scale Drift-Aware Large Scale Monocular SLAM", Robotics: Science and Systems VI, 2010
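As a small illustration of the 3-parameter representation mentioned above, the following NumPy sketch converts a rotation vector (axis times angle) into a rotation matrix with Rodrigues' formula, i.e. the exponential map from so(3) to SO(3). The function names are ad hoc, not from GSRAP.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix [w]_x such that [w]_x v = w x v."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Exponential map so(3) -> SO(3) (Rodrigues' formula); w = axis * angle."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    A = skew(w / theta)
    return np.eye(3) + np.sin(theta) * A + (1.0 - np.cos(theta)) * (A @ A)

R = exp_so3(np.array([0.0, 0.0, np.pi / 2]))   # 90 deg rotation about the z-axis
print(np.round(R, 3))
# The result satisfies the rotation-matrix constraints by construction:
print(np.isclose(np.linalg.det(R), 1.0), np.allclose(R.T @ R, np.eye(3)))
```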
  9. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Epipolar geometry
    Spatial correspondence between two cameras and 3D points
    • Constraints on corresponding points between images capturing a 3D scene
      • Fundamental matrix F: image coordinate system (K is unknown)
      • Essential matrix E: camera coordinate system (K is known)
      • The camera pose $R_{cw}, \mathbf{t}_{cw}$ is estimated from F or E
    • Epipolar plane: the plane passing through the camera centers $\mathbf{O}_1, \mathbf{O}_2$ and the 3D point $\mathbf{X}$
    • Epipolar line $\mathbf{l}$: the line where the epipolar plane intersects the image plane
    • Epipole $\mathbf{e}$: the point where the line connecting $\mathbf{O}_1$ and $\mathbf{O}_2$ intersects the image plane (all epipolar lines $\mathbf{l}$ pass through the epipole $\mathbf{e}$)
    [Figure: overview of epipolar geometry, showing $\mathbf{z}_1, \mathbf{z}_2$, the epipolar line $\mathbf{l}_2$, and the epipoles $\mathbf{e}_1, \mathbf{e}_2$]
  10. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Fundamental matrix F
    The matrix containing the intrinsic camera parameters K and the relative pose $[R \mid \mathbf{t}]$ between the two cameras
    Epipolar constraint:
      $\tilde{\mathbf{z}}_2^T F \tilde{\mathbf{z}}_1 = (u_2, v_2, 1) \begin{pmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{pmatrix} \begin{pmatrix} u_1 \\ v_1 \\ 1 \end{pmatrix} = 0$
    • Once the point $\tilde{\mathbf{z}}_1$ is fixed, the constraint restricts where the corresponding $\tilde{\mathbf{z}}_2$ can lie in the other image (see the sketch below)
    • $\mathbf{l}_2 = F \tilde{\mathbf{z}}_1$ is the epipolar line in image $I_2$ corresponding to the point $\tilde{\mathbf{z}}_1$ in image $I_1$
    • Substituting $\mathbf{l}_2 = F \tilde{\mathbf{z}}_1$ into $\tilde{\mathbf{z}}_2^T F \tilde{\mathbf{z}}_1 = 0$ gives $\tilde{\mathbf{z}}_2^T \mathbf{l}_2 = 0$, the vector equation of a line
    • F has 7 degrees of freedom
      • scale ambiguity
      • det F = 0
    R. Hartley, A. Zisserman, "Multiple View Geometry in Computer Vision", Cambridge University Press, 2003
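A minimal NumPy sketch of how the epipolar constraint is used in practice, assuming a fundamental matrix F is already available; the helper names are hypothetical, not GSRAP functions. It builds the epipolar line for a pixel in image 1 and measures how far a candidate match in image 2 is from that line.

```python
import numpy as np

def epipolar_line(F, z1):
    """Epipolar line l2 = F @ z1_homogeneous in image 2 for a pixel z1 of image 1."""
    z1_h = np.array([z1[0], z1[1], 1.0])
    return F @ z1_h                       # (a, b, c) of the line a*u + b*v + c = 0

def epipolar_residual(F, z1, z2):
    """Signed distance of z2 from the epipolar line of z1 (0 for a perfect match)."""
    l2 = epipolar_line(F, z1)
    z2_h = np.array([z2[0], z2[1], 1.0])
    return (z2_h @ l2) / np.hypot(l2[0], l2[1])
```

A residual threshold on this distance is the usual inlier test when F is estimated with RANSAC, as described later in the deck.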
  11. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Essential matrix E
    The matrix containing the relative camera pose $[R \mid \mathbf{t}]$:  $E := K_2^T F K_1$
    Substituting into $\tilde{\mathbf{z}}_2^T F \tilde{\mathbf{z}}_1 = 0$:
      $\tilde{\mathbf{z}}_2^T (K_2^T)^{-1} E K_1^{-1} \tilde{\mathbf{z}}_1 = 0 \;\Rightarrow\; \hat{\mathbf{z}}_2^T E \hat{\mathbf{z}}_1 = 0$
    • $K_1, K_2$: intrinsic parameters of each camera
    • $\hat{\mathbf{z}}_1 = K_1^{-1} \tilde{\mathbf{z}}_1$, $\hat{\mathbf{z}}_2 = K_2^{-1} \tilde{\mathbf{z}}_2$: points projected onto the plane $Z_c = 1$ (normalized image coordinate system)
    • E has 5 degrees of freedom (for details, refer to the paper below)
      • scale ambiguity
      • det E = 0
      • $2 E E^T E - \mathrm{tr}(E E^T) E = 0$
    O. D. Faugeras, S. Maybank, "Motion from point matches: Multiplicity of solutions", IJCV, Vol.4, 1990
  12. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Essential matrix E
    Compute $[R \mid \mathbf{t}]$ from E
    • $R\hat{\mathbf{z}}_1$ is the direction vector of pixel $\mathbf{z}_1$ expressed in the coordinate system of camera 2
    • $\mathbf{t} \times R\hat{\mathbf{z}}_1$ is perpendicular to the epipolar plane, so its dot product with the direction vector $\hat{\mathbf{z}}_2$ of $\mathbf{z}_2$ is zero:
      $\hat{\mathbf{z}}_2^T (\mathbf{t} \times R\hat{\mathbf{z}}_1) = 0$
    • Writing the cross product with the skew-symmetric matrix $[\mathbf{t}]_\times$, the epipolar constraint $\hat{\mathbf{z}}_2^T E \hat{\mathbf{z}}_1 = 0$ leads to $E = [\mathbf{t}]_\times R$
    [Figure: two cameras $O_1, O_2$ with relative pose $R, \mathbf{t}$, a 3D point $\mathbf{X}$, its projections $\mathbf{z}_1, \mathbf{z}_2$, and the epipoles $\mathbf{e}_1, \mathbf{e}_2$]
    R. Hartley, A. Zisserman, "Multiple View Geometry in Computer Vision", Cambridge University Press, 2003
  13. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Essential matrix E
    Compute $[R \mid \mathbf{t}]$ from E
    • Perform a singular value decomposition of E (there is a sign ambiguity in the SVD): $E = U\,\mathrm{diag}(1,1,0)\,V^T$
    • With $W = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ and $Z = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, there are two possible factorizations $E = [\mathbf{t}]_\times R = SR$:
      $S = U Z U^T$, $\quad R = U W V^T$ or $U W^T V^T$
    • Since the sign of E is ambiguous, the sign of the translation $\mathbf{t} = \mathbf{u}_3$ or $-\mathbf{u}_3$ is also ambiguous
    • There are four possible solutions in total (see the sketch below):
      $[R \mid \mathbf{t}] = [U W V^T \mid +\mathbf{u}_3]$, $[U W V^T \mid -\mathbf{u}_3]$, $[U W^T V^T \mid +\mathbf{u}_3]$, or $[U W^T V^T \mid -\mathbf{u}_3]$
    R. Hartley, A. Zisserman, "Multiple View Geometry in Computer Vision", Cambridge University Press, 2003
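The following NumPy sketch builds the four candidate decompositions described above from an essential matrix. It is a generic textbook-style implementation, not the GSRAP code in essential_solver.cc.

```python
import numpy as np

def decompose_essential(E):
    """Return the four candidate (R, t) pairs obtained from an essential matrix E."""
    U, _, Vt = np.linalg.svd(E)
    # Keep both orthogonal factors proper rotations (det = +1).
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]                        # third column of U, up to sign
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```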
  14. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Essential matrix E
    Compute $[R \mid \mathbf{t}]$ from E
    When decomposing E, there are four possible solutions for $[R \mid \mathbf{t}]$, as illustrated below
    • The correct solution is selected based on the condition that "3D points are reconstructed in front of both cameras" (a sketch of this check follows)
    [Figure: the four configurations (a)-(d) of cameras A and B (B' denotes the rotated alternative); only one configuration (✔) reconstructs the point in front of both cameras, the other three (×) do not]
    R. Hartley, A. Zisserman, "Multiple View Geometry in Computer Vision", Cambridge University Press, 2003
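A rough sketch of the cheirality check under the usual assumptions: normalized image coordinates, camera 1 at the origin, and candidate poses such as those returned by the `decompose_essential` sketch above. The helper names are illustrative only.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one correspondence in normalized coordinates.
    P1, P2 : (3, 4) projection matrices; x1, x2 : (2,) normalized image points."""
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def select_pose(candidates, x1, x2):
    """Pick the (R, t) whose triangulated point lies in front of both cameras."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])      # camera 1 at the origin
    for R, t in candidates:
        P2 = np.hstack([R, t.reshape(3, 1)])
        X = triangulate(P1, P2, x1, x2)
        depth1 = X[2]                  # depth in camera 1
        depth2 = (R @ X + t)[2]        # depth in camera 2
        if depth1 > 0 and depth2 > 0:
            return R, t, X
    return None
```

In practice the check is done for many correspondences and the candidate with the most points in front of both cameras is kept.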
  15. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Estimation of relative pose from 2D-2D point correspondences 16
  16. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Fundamental matrix estimation
    The epipolar constraints for $n$ points can be expressed as a system of linear equations (see the sketch below)
    • Estimate F from 8 point correspondences (8-point algorithm)
      • In the linear solution, the number of unknowns is 8 due to scale ambiguity (although the intrinsic degrees of freedom are 7)
    • In the case of $n \geq 8$:
      • Compute F from all point correspondences with a least-squares method such as Singular Value Decomposition (SVD)
    • When the feature points lie on the same plane, the coefficient matrix A degenerates
      • Estimate $[R \mid \mathbf{t}]$ from the essential matrix E or the homography matrix H instead
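A compact NumPy sketch of the linear 8-point step described above, omitting the coordinate normalization that a production implementation would add; the function name is illustrative.

```python
import numpy as np

def eight_point(z1, z2):
    """Linear estimate of F from n >= 8 pixel correspondences.
    z1, z2 : (n, 2) arrays of matched pixel coordinates."""
    n = z1.shape[0]
    A = np.zeros((n, 9))
    for i in range(n):
        u1, v1 = z1[i]
        u2, v2 = z2[i]
        # Row of the linear system from z2^T F z1 = 0, F flattened row-major.
        A[i] = [u2 * u1, u2 * v1, u2,
                v2 * u1, v2 * v1, v2,
                u1,      v1,      1.0]
    # Least-squares solution: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 (det F = 0) by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt
```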
  17. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Essential matrix estimation
    There are two main approaches
    • First estimate F (e.g. with the 8-point algorithm) and derive E by substituting it into $E := K_2^T F K_1$
    • Directly compute E using methods such as the 5-point algorithm
      • The 5-point algorithm requires fewer point correspondences than the 8-point algorithm
      • When using RANSAC, the probability of including an outlier in a randomly sampled minimal set is lower, so the correct combination of points can be found with fewer iterations
  18. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Estimation of absolute pose from 2D-3D point correspondences 19
  19. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Perspective-n-Point (PnP) Problem
    The problem of estimating the camera pose (R and $\mathbf{t}$) given the coordinates of n points in 3D space and their corresponding points in the image
    [Figure: world and camera coordinate systems related by R, $\mathbf{t}$]
  20. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Perspective-n-Point (PnP) Problem
    P3P problem
    • Geometrically, the problem can be solved with a minimum of $n = 3$ corresponding points
    • In 1841, Grunert proposed the first solution
      • J. A. Grunert, "Das pothenotische Problem in erweiterter Gestalt nebst über seine Anwendungen in der Geodäsie", Grunerts Archiv für Mathematik und Physik, pp. 238-248, 1841
    PnP problem
    • Minimize the error using $n \geq 3$ points
    • EPnP is commonly used as an n-point solver (EPnP: Efficient Perspective-n-Point Camera Pose Estimation)
      • V. Lepetit, F. Moreno-Noguer, P. Fua, "EPnP: An Accurate O(n) Solution to the PnP Problem", International Journal of Computer Vision (IJCV), 2009
    Gaku Nakano, "A Versatile Approach for Solving PnP, PnPf, and PnPfr Problems", ECCV, 2016 (table cited)
  21. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Perspective-n-Point (PnP) Problem
    PnPf, PnPfr problems
    • The problem of estimating the focal length $f$ and the lens distortion $k$ in addition to the camera pose
    • The minimum number of required points varies with the number of unknown parameters
    The number of point correspondences required in the PnP problem and its derived problems, by unknown parameters:

    | Unknown parameters                    | PnP | PnPf | PnPfr                            |
    |---------------------------------------|-----|------|----------------------------------|
    | Rotation matrix R                     | ✓   | ✓    | ✓                                |
    | Translation vector $\mathbf{t}$       | ✓   | ✓    | ✓                                |
    | Focal length $f$                      |     | ✓    | ✓                                |
    | Lens distortion $k_1, k_2, k_3$       |     |      | ✓                                |
    | Number of unknown parameters          | 6   | 7    | 8 ($k_1$), 10 ($k_1, k_2, k_3$)  |
    | Minimum number of required points $n$ | 3   | 4    | 4 ($k_1$), 5 ($k_1, k_2, k_3$)   |

    Gaku Nakano, "A Versatile Approach for Solving PnP, PnPf, and PnPfr Problems", ECCV, 2016 (table cited)
  22. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Estimation of rigid and similarity transformations from 3D-3D point correspondences 23
  23. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Estimation of rigid and similarity transformations from 3D-3D point correspondences Rigid transformation • SE 3 estimation • Rotation matrix R ∈ SO(3), translation vector 𝐭 ∈ ℝ3 Similarity transformation • Sim 3 estimation • Rotation matrix R ∈ SO(3), translation vector 𝐭 ∈ ℝ3, scale parameter 𝑠 24
  24. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Estimation of rigid and similarity transformations from 3D-3D point correspondences
    Rigid transformation
    • SE(3) estimation
    • Rotation matrix $R \in SO(3)$, translation vector $\mathbf{t}$
    Transformation using the estimated R, $\mathbf{t}$:
      $\begin{pmatrix} x_2 \\ y_2 \\ z_2 \\ 1 \end{pmatrix} = \begin{pmatrix} R & \mathbf{t} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ y_1 \\ z_1 \\ 1 \end{pmatrix}$
    Point cloud data: Point Cloud Library, https://github.com/PointCloudLibrary/pcl/blob/master/test/bunny.pcd
  25. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Estimation of rigid and similarity transformations from 3D-3D point correspondences
    Similarity transformation
    • Sim(3) estimation
    • Rotation matrix $R \in SO(3)$, translation vector $\mathbf{t} \in \mathbb{R}^3$, scale parameter $s$
    Transformation using the estimated R, $\mathbf{t}$, s:
      $\begin{pmatrix} x_2 \\ y_2 \\ z_2 \\ 1 \end{pmatrix} = \begin{pmatrix} sR & \mathbf{t} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ y_1 \\ z_1 \\ 1 \end{pmatrix}$
    Point cloud data: Point Cloud Library, https://github.com/PointCloudLibrary/pcl/blob/master/test/bunny.pcd
  26. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Algorithm explanation: P3P algorithm (Grunert’s method) 27
  27. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    P3P algorithm (Grunert's method)
    Estimate $T \in SE(3)$ from three 2D-3D correspondences
    Consider the tetrahedron formed by the camera center $\mathbf{X}_o$ and the three 3D points $\mathbf{X}_1, \mathbf{X}_2, \mathbf{X}_3$
    1. Estimate the lengths of the rays $l_1, l_2, l_3$
    2. Estimate the camera pose $T \in SE(3)$
    [Figure: tetrahedron with camera center $\mathbf{X}_o$, 3D points $\mathbf{X}_1, \mathbf{X}_2, \mathbf{X}_3$, side lengths a, b, c, ray lengths $l_1, l_2, l_3$, angles $\alpha, \beta, \gamma$, and image points $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3$]
    Cyrill Stachniss, "Projective 3-Point (P3P) Algorithm / Spatial Resection", Photogrammetry I & II Course, 2021
  28. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    P3P algorithm (Grunert's method)
    1. Estimate the lengths of the rays $l_1, l_2, l_3$
    The direction vector of the image point $\mathbf{x}_i$ in the camera coordinate system:
      $\hat{\mathbf{x}}_i = \dfrac{K^{-1}\mathbf{x}_i}{\| K^{-1}\mathbf{x}_i \|}$
    [Figure: tetrahedron with rays $\hat{\mathbf{x}}_1, \hat{\mathbf{x}}_2, \hat{\mathbf{x}}_3$ from the camera center $\mathbf{X}_o$ toward $\mathbf{X}_1, \mathbf{X}_2, \mathbf{X}_3$]
    Cyrill Stachniss, "Projective 3-Point (P3P) Algorithm / Spatial Resection", Photogrammetry I & II Course, 2021
  29. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    P3P algorithm (Grunert's method)
    1. Estimate the lengths of the rays $l_1, l_2, l_3$
    Calculate the angles $\alpha, \beta, \gamma$ formed by the ray vectors $\hat{\mathbf{x}}_1, \hat{\mathbf{x}}_2, \hat{\mathbf{x}}_3$:
      $\alpha = \arccos\langle \hat{\mathbf{x}}_2, \hat{\mathbf{x}}_3 \rangle, \quad \beta = \arccos\langle \hat{\mathbf{x}}_3, \hat{\mathbf{x}}_1 \rangle, \quad \gamma = \arccos\langle \hat{\mathbf{x}}_1, \hat{\mathbf{x}}_2 \rangle$
    [Figure: tetrahedron showing the angles $\alpha, \beta, \gamma$ between the rays]
    Cyrill Stachniss, "Projective 3-Point (P3P) Algorithm / Spatial Resection", Photogrammetry I & II Course, 2021
  30. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    P3P algorithm (Grunert's method)
    1. Estimate the lengths of the rays $l_1, l_2, l_3$
    By the law of cosines, the following three equations hold ($a, b, c, \alpha, \beta, \gamma$ are known):
      $a^2 = l_2^2 + l_3^2 - 2 l_2 l_3 \cos\alpha$
      $b^2 = l_3^2 + l_1^2 - 2 l_3 l_1 \cos\beta$
      $c^2 = l_1^2 + l_2^2 - 2 l_1 l_2 \cos\gamma$
    [Figure: tetrahedron with side lengths a, b, c and ray lengths $l_1, l_2, l_3$]
    Cyrill Stachniss, "Projective 3-Point (P3P) Algorithm / Spatial Resection", Photogrammetry I & II Course, 2021
  31. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    P3P algorithm (Grunert's method)
    1. Estimate the lengths of the rays $l_1, l_2, l_3$
    Substituting $u = l_2 / l_1$ and $v = l_3 / l_1$:
      $a^2 = l_1^2 (u^2 + v^2 - 2uv\cos\alpha) \;\Rightarrow\; l_1^2 = \dfrac{a^2}{u^2 + v^2 - 2uv\cos\alpha}$
      $b^2 = l_1^2 (1 + v^2 - 2v\cos\beta) \;\Rightarrow\; l_1^2 = \dfrac{b^2}{1 + v^2 - 2v\cos\beta}$
      $c^2 = l_1^2 (1 + u^2 - 2u\cos\gamma) \;\Rightarrow\; l_1^2 = \dfrac{c^2}{1 + u^2 - 2u\cos\gamma}$
    Cyrill Stachniss, "Projective 3-Point (P3P) Algorithm / Spatial Resection", Photogrammetry I & II Course, 2021
  32. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    P3P algorithm (Grunert's method)
    1. Estimate the lengths of the rays $l_1, l_2, l_3$
    Substitute the first and third equations into the second to eliminate $l_1$ and $u$:
      $l_1^2 = \dfrac{a^2}{u^2 + v^2 - 2uv\cos\alpha} = \dfrac{b^2}{1 + v^2 - 2v\cos\beta} = \dfrac{c^2}{1 + u^2 - 2u\cos\gamma}$
    The system reduces to a quartic equation in $v$:
      $A_4 v^4 + A_3 v^3 + A_2 v^2 + A_1 v + A_0 = 0$
    Cyrill Stachniss, "Projective 3-Point (P3P) Algorithm / Spatial Resection", Photogrammetry I & II Course, 2021
  33. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    P3P algorithm (Grunert's method)
    1. Estimate the lengths of the rays $l_1, l_2, l_3$
    The coefficients of $A_4 v^4 + A_3 v^3 + A_2 v^2 + A_1 v + A_0 = 0$:
      $A_4 = (a^2 - b^2 - c^2)^2 - 4 b^2 c^2 \cos^2\alpha$
      $A_3 = 4\{ -(a^2 - c^2)(a^2 - b^2 - c^2)\cos\beta + (a^2 b^2 + b^2 c^2 - b^4)\cos\alpha\cos\gamma + 2 b^2 c^2 \cos^2\alpha\cos\beta \}$
      $A_2 = 2\{ (a^2 - c^2)^2 - b^4 + 2 (a^2 - c^2)^2 \cos^2\beta + 2 b^2 (b^2 - c^2)\cos^2\alpha - 4 b^2 (a^2 + c^2)\cos\alpha\cos\beta\cos\gamma + 2 b^2 (b^2 - a^2)\cos^2\gamma \}$
      $A_1 = 4\{ -(a^2 - c^2)(a^2 + b^2 - c^2)\cos\beta + 2 a^2 b^2 \cos^2\gamma\cos\beta + b^2 (a^2 - b^2 + c^2)\cos\alpha\cos\gamma \}$
      $A_0 = (a^2 + b^2 - c^2)^2 - 4 a^2 b^2 \cos^2\gamma$
    Cyrill Stachniss, "Projective 3-Point (P3P) Algorithm / Spatial Resection", Photogrammetry I & II Course, 2021
  34. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    P3P algorithm (Grunert's method)
    1. Estimate the lengths of the rays $l_1, l_2, l_3$
    Solve $A_4 v^4 + A_3 v^3 + A_2 v^2 + A_1 v + A_0 = 0$ for $v$ and compute $l_1, l_2, l_3$:
      $l_1^2 = \dfrac{b^2}{1 + v^2 - 2v\cos\beta}$, $\qquad l_3 = l_1 v$
      From $a^2 = l_1^2 (u^2 + v^2 - 2uv\cos\alpha)$: $\;\; l_2^2 - 2 l_1 l_2 v\cos\alpha + l_1^2 v^2 - a^2 = 0$, which is solved for $l_2$
    There are four possible solutions for $(l_1, l_2, l_3)$
    Cyrill Stachniss, "Projective 3-Point (P3P) Algorithm / Spatial Resection", Photogrammetry I & II Course, 2021
  35. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    P3P algorithm (Grunert's method)
    1. Estimate the lengths of the rays $l_1, l_2, l_3$
    Choose one of the four solutions based on its consistency with additional information, e.g.
    (1) additional sensor information such as GPS, or
    (2) a fourth point correspondence
    Cyrill Stachniss, "Projective 3-Point (P3P) Algorithm / Spatial Resection", Photogrammetry I & II Course, 2021
  36. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    P3P algorithm (Grunert's method)
    2. Estimate the camera pose $T \in SE(3)$
    Compute R and $\mathbf{X}_o$ by substituting the estimated $l_i$ into (see the sketch below)
      $l_i \hat{\mathbf{x}}_i = R (\mathbf{X}_i - \mathbf{X}_o), \quad i = 1, 2, 3$
    [Figure: tetrahedron with camera center $\mathbf{X}_o$, rays $\hat{\mathbf{x}}_i$, and 3D points $\mathbf{X}_i$]
    Cyrill Stachniss, "Projective 3-Point (P3P) Algorithm / Spatial Resection", Photogrammetry I & II Course, 2021
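One way to carry out this last step is a small rigid alignment (orthogonal Procrustes/Kabsch) between the world points $\mathbf{X}_i$ and the camera-frame points $l_i \hat{\mathbf{x}}_i$. The NumPy sketch below assumes that formulation; it is not necessarily the exact procedure used in the lecture notes or in GSRAP, and the function name is made up.

```python
import numpy as np

def pose_from_ray_lengths(bearings, lengths, X_world):
    """Recover R and the camera center X_o from l_i * x_hat_i = R (X_i - X_o).

    bearings : (3, 3) unit direction vectors x_hat_i (rows), camera frame
    lengths  : (3,) estimated ray lengths l_i
    X_world  : (3, 3) corresponding 3D points X_i (rows), world frame
    """
    Y = lengths[:, None] * bearings          # the three points in the camera frame
    Xc = X_world - X_world.mean(axis=0)      # centered world points
    Yc = Y - Y.mean(axis=0)                  # centered camera-frame points
    # Orthogonal Procrustes / Kabsch: R minimizing sum_i ||Yc_i - R Xc_i||^2.
    H = Xc.T @ Yc                            # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    X_o = X_world.mean(axis=0) - R.T @ Y.mean(axis=0)
    return R, X_o
```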
  37. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Sim(3) estimation (Umeyama algorithm)
    Estimate the similarity transformation from $n$ 3D-3D point correspondences
    • Input: $n$ pairs of corresponding 3D points $X = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_n\}$, $Y = \{\mathbf{y}_1, \mathbf{y}_2, \cdots, \mathbf{y}_n\}$
    • Output: $R, \mathbf{t}, s \in \mathrm{Sim}(3)$
    • Minimize the following error function, i.e. estimate $R, \mathbf{t}, s$ that align X with Y under a similarity transformation:
      $e^2(R, \mathbf{t}, s) = \frac{1}{n} \sum_{i=1}^{n} \| \mathbf{y}_i - (sR\mathbf{x}_i + \mathbf{t}) \|^2$
      (where $sR\mathbf{x}_i + \mathbf{t}$ is the coordinate of $\mathbf{x}_i$ after the similarity transformation)
    Shinji Umeyama, "Least-Squares Estimation of Transformation Parameters Between Two Point Patterns", TPAMI, Vol.13, Issue 4, 1991
  38. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Sim(3) estimation (Umeyama algorithm)
    Detailed computation procedure (see the sketch below):
    1. Compute the centroid and the covariance matrix of the point clouds:
       $\boldsymbol{\mu}_x = \frac{1}{n}\sum_{i=1}^{n} \mathbf{x}_i$, $\;\boldsymbol{\mu}_y = \frac{1}{n}\sum_{i=1}^{n} \mathbf{y}_i$, $\;\sigma_x^2 = \frac{1}{n}\sum_{i=1}^{n} \| \mathbf{x}_i - \boldsymbol{\mu}_x \|^2$, $\;\Sigma_{xy} = \frac{1}{n}\sum_{i=1}^{n} (\mathbf{y}_i - \boldsymbol{\mu}_y)(\mathbf{x}_i - \boldsymbol{\mu}_x)^T$
    2. Perform a Singular Value Decomposition (SVD) of the covariance matrix: $\Sigma_{xy} = U D V^T$, $\;D = \mathrm{diag}(d_i)$, $\;d_1 \geq d_2 \geq \cdots \geq d_m \geq 0$
    3. Compute $R, \mathbf{t}, s$ from the results of steps 1 and 2:
       $S = \begin{cases} I & \text{if } \det \Sigma_{xy} \geq 0 \\ \mathrm{diag}(1, 1, \cdots, 1, -1) & \text{if } \det \Sigma_{xy} < 0 \end{cases}$
       $R = U S V^T$, $\quad s = \frac{1}{\sigma_x^2}\mathrm{tr}(DS)$, $\quad \mathbf{t} = \boldsymbol{\mu}_y - sR\boldsymbol{\mu}_x$
    Shinji Umeyama, "Least-Squares Estimation of Transformation Parameters Between Two Point Patterns", TPAMI, Vol.13, Issue 4, 1991
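A direct NumPy transcription of the three steps above; it is an illustrative sketch, not the GSRAP sim3 solver, and the sign handling uses the determinants of the SVD factors, which is equivalent to the sign of det Σxy in the full-rank case.

```python
import numpy as np

def umeyama(X, Y):
    """Estimate (s, R, t) minimizing the mean of ||y_i - (s R x_i + t)||^2.

    X, Y : (n, 3) arrays of corresponding 3D points.
    """
    mu_x = X.mean(axis=0)
    mu_y = Y.mean(axis=0)
    Xc = X - mu_x
    Yc = Y - mu_y
    sigma_x2 = (Xc ** 2).sum() / X.shape[0]          # variance of X
    Sigma_xy = Yc.T @ Xc / X.shape[0]                # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(Sigma_xy)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:     # reflection case
        S[2, 2] = -1.0
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / sigma_x2
    t = mu_y - s * R @ mu_x
    return s, R, t
```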
  39. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Random Sample Consensus (RANSAC)
    An iterative algorithm for estimating the parameters of a mathematical model from observational data containing outliers
    • Example: estimation of line parameters (a code sketch of these steps follows below)
      1. Randomly sample the (two) data points needed for model estimation
      2. Estimate the model parameters from the sampled points
      3. Count the inliers whose fitting error is within the threshold and compute the score
      4. Repeat steps 1-3 and keep the sample yielding the best score
      5. Using all inliers found in step 4, compute the model parameters that minimize the error
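A self-contained NumPy sketch of these five steps for the line-fitting example; the threshold, iteration count, and function name are arbitrary choices for illustration.

```python
import numpy as np

def ransac_line(points, n_iters=100, threshold=0.05, rng=None):
    """Fit a 2D line a*x + b*y + c = 0 (with a^2 + b^2 = 1) to points with outliers."""
    if rng is None:
        rng = np.random.default_rng(0)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # 1. Randomly sample the minimal set (two points).
        p, q = points[rng.choice(len(points), size=2, replace=False)]
        # 2. Model from the sample: unit normal of the line through p and q.
        d = q - p
        norm = np.hypot(d[0], d[1])
        if norm < 1e-12:
            continue
        n = np.array([-d[1], d[0]]) / norm
        c = -n @ p
        # 3. Inliers = points whose distance to the line is within the threshold.
        inliers = np.abs(points @ n + c) < threshold
        # 4. Keep the best-scoring sample.
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # 5. Refit on all inliers (total least squares: normal = direction of least variance).
    P = points[best_inliers]
    centroid = P.mean(axis=0)
    _, _, Vt = np.linalg.svd(P - centroid)
    n = Vt[-1]
    return n[0], n[1], -n @ centroid, best_inliers
```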
  45. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Random Sample Consensus (RANSAC)
    An iterative algorithm for estimating the parameters of a mathematical model from observational data containing outliers
    • The number of trials $N$ required to obtain, with probability $p$, at least one sample consisting only of inliers (see the sketch below):
      $N = \dfrac{\log(1 - p)}{\log(1 - r^s)}$, $\qquad$ equivalently $\; r^s = 1 - (1 - p)^{1/N}$
    • $s$: number of observed data points used for one model computation
    • $r$: inlier ratio of the observed data
    • $r^s$: probability of sampling only inliers (sampling with replacement)
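A small helper, under the with-replacement assumption from the slide, that evaluates N for given p, r, s; the example numbers are only illustrative, but they show why minimal solvers with a smaller sample size s (e.g. 5-point vs. 8-point) need far fewer trials.

```python
import math

def ransac_iterations(p, r, s):
    """Trials N so that, with probability p, at least one size-s sample is all inliers
    (inlier ratio r, sampling with replacement)."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - r ** s))

print(ransac_iterations(0.99, 0.5, 8))   # 8-point sample, 50% inliers -> 1177 trials
print(ransac_iterations(0.99, 0.5, 5))   # 5-point sample, 50% inliers -> 146 trials
```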
  46. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Random Sample Consensus (RANSAC)
    Example: 8-point algorithm for outlier rejection in local feature matching
    1. Select 8 random correspondences from the nearest-neighbor matches and compute F
    2. For each correspondence, compute the epipolar line $\mathbf{l}$ using F
    3. Mark a correspondence as an inlier if its distance to $\mathbf{l}$ is within a threshold
    4. After $N$ iterations of steps 1-3, compute the final $\hat{F}$ by least squares from the inliers of the best F
    [Figure: examples of local feature matching, comparing nearest-neighbor search of local feature points with outlier removal using the 8-point algorithm and RANSAC; the original images are from the Fountain-P11 dataset]
  47. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Solvers implemented in GSRAP (to be added)
    Estimation of relative pose from 2D-2D point correspondences
    • Nister's 5-point algorithm ($n \geq 5$, essential matrix E)
      • D. Nistér, "An efficient solution to the five-point relative pose problem", TPAMI, Vol.26, Issue 6, pp.756-777, 2004
    Estimation of absolute pose R, $\mathbf{t}$ from 2D-3D point correspondences
    • Ke's P3P algorithm ($n = 3$)
      • T. Ke, S. I. Roumeliotis, "An Efficient Algebraic Solution to the Perspective-Three-Point Problem", pp.7225-7233, CVPR, 2017
    • EPnP: Efficient Perspective-n-Point Camera Pose Estimation ($n \geq 4$)
      • V. Lepetit, F. Moreno-Noguer, P. Fua, "EPnP: An Accurate O(n) Solution to the PnP Problem", IJCV, 2009
    Estimation of similarity transformations R, $\mathbf{t}$, s from 3D-3D point correspondences
    • Umeyama algorithm ($n \geq 3$)
      • S. Umeyama, "Least-Squares Estimation of Transformation Parameters Between Two Point Patterns", pp.376-380, Vol.13, TPAMI, 1991
    Template-based RANSAC allows a consistent implementation for all solvers
    ※ We implemented Nister's 5-point algorithm and EPnP with OpenGV, and Ke's P3P algorithm with OpenCV.
    • L. Kneip, P. Furgale, "OpenGV: A unified and generalized approach to real-time calibrated geometric vision", ICRA, May 2014
    • OpenCV, https://github.com/opencv/opencv
  48. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    RANSAC parameters (gsrap-def.h)
    • The target probability $p$ of obtaining at least one all-inlier sample
    • Maximum number of sampling iterations
    • Minimum number of sampling iterations
    • A flag that terminates the iterations once the probability of having drawn an all-inlier sample, derived from $r^s$, exceeds the specified probability $p$, where $r$ is the inlier ratio computed from the best model so far
    • A flag for calculating the number of trials $N$ assuming sampling without replacement (if the population is sufficiently large, the with-replacement approximation can be used)
  49. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    RANSAC algorithm (gsrap-def.h)
    • Randomly sample the minimum number of data points required for model estimation
    • Estimate the model from the sampled data
    • Terminate if the number of sampling iterations exceeds the preset maximum
  50. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    RANSAC algorithm ransac.h Determine the inlier data for a model estimated from any given sample set 55
  51. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    RANSAC algorithm (ransac.h)
    • The number of inliers for the best model found so far
    • Calculate the probability $q$ of sampling only inliers from the inlier ratio of the best model (a small comparison sketch follows below)
      • Without replacement: $q_{\mathrm{no\_duplicate}} = \dfrac{C}{M} \times \dfrac{C-1}{M-1} \times \cdots \times \dfrac{C-(s-1)}{M-(s-1)}$
      • With replacement: $q_{\mathrm{duplicate}} = r^s$
      where $M$ is the total population size, $C$ is the number of inliers in the population, and $s$ is the sample size
    • Note: only $q_{\mathrm{duplicate}}$ assumes sampling with replacement (in practice, sampling is done without replacement). Strictly speaking, the without-replacement form is more accurate, but since the population size $M$ is generally large, $q_{\mathrm{no\_duplicate}}$ is approximately equal to $q_{\mathrm{duplicate}}$.
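A tiny sketch comparing the two probabilities for made-up values of M, C, and s, illustrating why the with-replacement approximation is usually acceptable; the function names are not from GSRAP.

```python
def q_duplicate(C, M, s):
    """Probability of drawing only inliers when sampling with replacement."""
    return (C / M) ** s

def q_no_duplicate(C, M, s):
    """Probability of drawing only inliers when sampling without replacement."""
    q = 1.0
    for i in range(s):
        q *= (C - i) / (M - i)
    return q

# For a large population the two values are nearly identical:
print(q_duplicate(500, 1000, 8), q_no_duplicate(500, 1000, 8))
```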
  52. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    RANSAC algorithm (ransac.h)
    • If itr_upperlimit is greater than $N$, update it (this means that as the best model is updated and the inlier ratio $r$ increases, fewer sampling iterations are required)
    • If the calculated $q$ is greater than the probability of sampling only inliers implied by the current upper limit itr_upperlimit, then compute the sample count $N$
    • Calculate the required sampling count $N$ from the probability $q$ (preventing division by zero and overflow in the computation of $N$)
  53. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Source code explanation: Estimation of relative pose from 2D-2D point correspondences 58
  54. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Recovery of R, $\mathbf{t}$ from E (essential_solver.cc)
    • Calculate the essential matrix E using Nister's 5-point algorithm (input: the direction vectors of the feature points in each image, bearings1 and bearings2)
    • Normalize E to set the scale of the translation vector to 1
    • Perform a Singular Value Decomposition (SVD)
    • Verify that the first and second singular values are equal and the third singular value is zero, allowing for numerical error (∵ SVD gives $E = U\,\mathrm{diag}(1,1,0)\,V^T$)
    • Define a lambda function that determines whether the determinant of a 3×3 matrix M is 1, allowing for numerical error
  55. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Recovery of R, $\mathbf{t}$ from E (essential_solver.cc)
    • If either U or $V^T$ represents a reflection (i.e. $\det(\cdot) < 0$), the right-handed and left-handed coordinate systems would swap; adjust using the last element of W
    • $\mathbf{t} = \mathbf{u}_3$ or $-\mathbf{u}_3$, so $[R \mid \mathbf{t}] = [UWV^T \mid +\mathbf{u}_3]$, $[UWV^T \mid -\mathbf{u}_3]$, $[UW^TV^T \mid +\mathbf{u}_3]$, or $[UW^TV^T \mid -\mathbf{u}_3]$
    • From the four possible $[R \mid \mathbf{t}]$ decompositions of E, select the correct solution based on the condition that "3D points are reconstructed in front of both cameras"
  56. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Estimation of E using Nister's five-point algorithm (example_essential_solver.cc)
    • Generate camera poses and corresponding points for simulation
      • Center coordinates and radius of the 3D point cloud
      • Generate the 3D point cloud
      • Set the camera poses for the two viewpoints
      • Convert the 3D points into direction vectors in each camera's coordinate system
    • Generate correspondence information without mismatches
    • Set the parameters for RANSAC (note: std::thread::hardware_concurrency() retrieves the number of threads supported by the system)
    • Set a function that checks whether the eigenvalues of the E matrix meet the required conditions
    • Compute the E matrix
  57. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Source code explanation: Estimation of absolute pose from 2D-3D point correspondences 62
  58. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Estimation of R, $\mathbf{t}$ using P3P (example_p3p_solver.cc)
    • Generate camera poses and corresponding points for simulation
      • Center coordinates and radius of the 3D point cloud
      • Generate the 3D point cloud
      • Set the camera poses for the two viewpoints
      • Convert the 3D point cloud to direction vectors in each camera coordinate system
    • Generate correspondence information without mismatches
    • Set the parameters for RANSAC
    • Set the parameters for the P3P solver
    • Declare the inlier check based on bearing-vector angular errors (with PnpInlierCheckParamsUsingProjectedPoint, it is based on the reprojection error in image space)
    • Compute R, $\mathbf{t}$
  59. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Estimation of R, $\mathbf{t}$ using EPnP (example_pnp_solver.cc)
    • Generate camera poses and corresponding points for simulation
      • Center coordinates and radius of the 3D point cloud
      • Generate the 3D point cloud
      • Set the camera poses for the two viewpoints
      • Convert the 3D point cloud to direction vectors in each camera coordinate system
    • Generate correspondence information without mismatches
    • Set the parameters for RANSAC
    • Set the parameters for the PnP solver
    • Declare the inlier check based on bearing-vector angular errors (with AssumePerspectiveCamera, it is based on the reprojection error in image space)
    • Compute R, $\mathbf{t}$
  60. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Source code explanation: Estimation of rigid and similarity transformations from 3D-3D point correspondences 65
  61. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Umeyama algorithm (sim3_solver.cc)
    • Calculate the centroid of each point cloud: $\boldsymbol{\mu}_x = \frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_i$, $\;\boldsymbol{\mu}_y = \frac{1}{n}\sum_{i=1}^{n}\mathbf{y}_i$
    • Shift the origin of each point cloud to its centroid: $\mathbf{x}_i - \boldsymbol{\mu}_x$, $\;\mathbf{y}_i - \boldsymbol{\mu}_y$
    • Compute the covariance matrix: $\Sigma_{xy} = \frac{1}{n}\sum_{i=1}^{n}(\mathbf{y}_i - \boldsymbol{\mu}_y)(\mathbf{x}_i - \boldsymbol{\mu}_x)^T$
    • SVD: $\Sigma_{xy} = UDV^T$, $\;D = \mathrm{diag}(d_i)$, $\;d_1 \geq d_2 \geq \cdots \geq d_m \geq 0$; if $\mathrm{rank}(\Sigma_{xy}) \geq 2$, the optimal solution is uniquely determined
    • Ensure the determinant of the rotation matrix is 1 (right-handed coordinate system); a negative determinant corresponds to a reflection:
      $S = \begin{cases} I & \text{if } \det \Sigma_{xy} \geq 0 \\ \mathrm{diag}(1, 1, \cdots, 1, -1) & \text{if } \det \Sigma_{xy} < 0 \end{cases}$
  62. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Umeyama algorithm (sim3_solver.cc)
    • Compute the rotation matrix: $R = USV^T$
    • Compute the scale: $s = \frac{1}{\sigma_x^2}\mathrm{tr}(DS)$
    • Compute the translation vector: $\mathbf{t} = \boldsymbol{\mu}_y - sR\boldsymbol{\mu}_x$
  63. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Estimation of Sim(3) using the Umeyama algorithm (example_sim3_solver.cc)
    • True rotation matrix $R_{gt}$: rotation axis $(1, 1, 1)^T$, rotation angle $\pi/4$
    • True scale $s_{gt}$, true translation vector $\mathbf{t}_{gt}$
    • Generate 500 random points $P_1$ within a cube centered at $(0, 0, 0)^T$ with edges of length 1
    • Transform the point cloud $P_1$ using $R_{gt}, \mathbf{t}_{gt}, s_{gt}$ to generate the point cloud $P_2$
    • Generate correspondence information without mismatches
  64. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Estimation of Sim(3) using the Umeyama algorithm (example_sim3_solver.cc)
    • Set the parameters for RANSAC
    • Set the threshold for inlier determination
    • Calculate R, $\mathbf{t}$, and s
    • Optimize using the least-squares method with inliers only
  65. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Summary • Basics of Camera Geometry • Coordinate transformation and camera projection model • Epipolar geometry • Fundamental Matrix F , Essential Matrix E • Estimation of relative pose, absolute pose, and rigid/similarity transformations using 2D-2D, 2D-3D, and 3D-3D correspondences • Algorithm Explanation • P3P, Umeyama Algorithm • Random Sample Consensus (RANSAC) • Template-based RANSAC allows consistent implementation for all solvers • Source code explanation • RANSAC, Estimation of E matrix and recovery of R, 𝐭, P3P, EPnP, Umeyama Algorithm and Sim(3) estimation
  66. Tutorial of Geometric Solvers for Reconstruction And Pose estimation (GSRAP)

    Acknowledgements In preparing this document, we would like to express our heartfelt gratitude to Professors Richard Hartley and Andrew Zisserman, authors of the seminal reference book “Multiple View Geometry in Computer Vision,” which offers a detailed explanation of the fundamentals of multi-view geometry. We also wish to extend our deep appreciation to Professor Cyrill Stachniss for sharing his invaluable computer vision lecture materials. Additionally, we are indebted to Dr. Gaku Nakano for granting us the privilege to edit and cite his tables and figures. Finally, our profound thanks go to the numerous individuals who consistently extend their unwavering support.