  … of the formula
  II. Extension to the original paper’s approach
  III. How to tame corner cases
II. GPU Implementation
  I. A failure
  II. Introduce my implementation
Deep Dive into Gaussian Splatting
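… the color or transparency at a point along the view ray, no handling of multiple volume overlaps
Splats along the ray: $(\boldsymbol{c}_1, \alpha_1)$, $(\boldsymbol{c}_2, \alpha_2)$, $(\boldsymbol{c}_3, \alpha_3)$
$\alpha_i$: Opacity of the splat along the ray
$\boldsymbol{c}_i$: Color of the splat along the ray
$i$: Index of a splat
$C = \boldsymbol{c}_1 \alpha_1 + (1 - \alpha_1)\, \boldsymbol{c}_2 \alpha_2 + (1 - \alpha_1)(1 - \alpha_2)\, \boldsymbol{c}_3 \alpha_3$
Just like ordinary alpha blending
(3D Gaussian Splatting for Real-Time Radiance Field Rendering)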
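The blending formula for $C$ can be written as a short front-to-back loop. This is only a minimal sketch: the function name `composite_front_to_back` and the sample colors are made up for illustration, and the splats are assumed to be already sorted from near to far.

```python
def composite_front_to_back(splats):
    """Blend (color, alpha) pairs that are sorted front to back along one ray."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # product of (1 - alpha_j) over the splats blended so far
    for c, alpha in splats:
        weight = transmittance * alpha
        color = [acc + weight * channel for acc, channel in zip(color, c)]
        transmittance *= 1.0 - alpha
    return color


# Hypothetical ray: a red, a green and a blue splat, nearest first.
ray = [((1.0, 0.0, 0.0), 0.5), ((0.0, 1.0, 0.0), 0.5), ((0.0, 0.0, 1.0), 1.0)]
result = composite_front_to_back(ray)
print(result)
```

The running `transmittance` is exactly the $(1 - \alpha_1)(1 - \alpha_2)\dots$ prefix of each term, so the loop reproduces the three-splat formula term by term.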
… from a limited point set
1. Let’s have an arbitrary unit vector: $V$
2. Take the projected quantity: $z_i = V_x \cdot x_i + V_y \cdot y_i$
3. Calculate the variance of $z_i$
[Figure: an arbitrary unit vector $V$ and the points $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$]
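The three steps can be tried numerically. The sketch below also compares the result with the quadratic form $V^T \Sigma V$ of the sample covariance matrix, which is a standard identity rather than something stated on this slide. The point set, the angle and all function names are made-up assumptions.

```python
import math


def covariance_2d(points):
    """Population covariance entries (sxx, sxy, syy) of a 2D point set."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return sxx, sxy, syy


def variance_along(points, v):
    """Steps 2 and 3: project each point onto v, then take the variance."""
    z = [v[0] * x + v[1] * y for x, y in points]
    mz = sum(z) / len(z)
    return sum((zi - mz) ** 2 for zi in z) / len(z)


points = [(0.0, 0.0), (2.0, 1.0), (4.0, 3.0)]  # hypothetical samples
theta = 0.7                                    # arbitrary direction
v = (math.cos(theta), math.sin(theta))         # step 1: a unit vector

sxx, sxy, syy = covariance_2d(points)
quadratic_form = v[0] * v[0] * sxx + 2.0 * v[0] * v[1] * sxy + v[1] * v[1] * syy
direct = variance_along(points, v)
print(direct, quadratic_form)
```

Both numbers agree up to rounding for any unit vector, which is why the covariance matrix can be read as "variance in every direction at once".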
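$\begin{pmatrix} \cdots & \cdots & \cdots \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} M^T = R S S^T R^T$, where $S$ is a scaling matrix and $R$ is a rotation matrix
(3D Gaussian Splatting for Real-Time Radiance Field Rendering)
$S = \begin{pmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & s_z \end{pmatrix}$
Quaternion: $(q_r, q_i, q_j, q_k)$
Optimize them by a differentiable renderer
An example of an invalid case (negative diagonal entries, so not a valid covariance matrix): $\Sigma_{bad} = \begin{pmatrix} -0.2 & 0.2 \\ 0.2 & -0.1 \end{pmatrix}$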
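To see why the scale and quaternion parameterization never produces something like $\Sigma_{bad}$, here is a minimal sketch that builds $R S S^T R^T$ from a scale vector and a quaternion. The helper names and the sample values are assumptions for illustration; the quaternion is normalized inside the helper.

```python
import math


def quat_to_rotation(qr, qi, qj, qk):
    """Rotation matrix of a quaternion (qr is the real part); normalizes first."""
    n = math.sqrt(qr * qr + qi * qi + qj * qj + qk * qk)
    qr, qi, qj, qk = qr / n, qi / n, qj / n, qk / n
    return [
        [1 - 2 * (qj * qj + qk * qk), 2 * (qi * qj - qk * qr), 2 * (qi * qk + qj * qr)],
        [2 * (qi * qj + qk * qr), 1 - 2 * (qi * qi + qk * qk), 2 * (qj * qk - qi * qr)],
        [2 * (qi * qk - qj * qr), 2 * (qj * qk + qi * qr), 1 - 2 * (qi * qi + qj * qj)],
    ]


def covariance_from(scale, quat):
    """Sigma = R S S^T R^T with S = diag(scale)."""
    R = quat_to_rotation(*quat)
    # M = R S: column k of R is multiplied by scale[k].
    M = [[R[r][k] * scale[k] for k in range(3)] for r in range(3)]
    # Sigma = M M^T, symmetric and positive semi-definite by construction.
    return [[sum(M[r][k] * M[c][k] for k in range(3)) for c in range(3)]
            for r in range(3)]


# Hypothetical splat parameters.
sigma = covariance_from((2.0, 1.0, 0.5), (0.9, 0.1, 0.3, -0.2))
for row in sigma:
    print(row)
```

Because $\Sigma = M M^T$, the optimizer can move the scale and quaternion freely and the result stays a valid covariance matrix.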
… conventional perspective transform
• Transform the covariance matrix $\Sigma$?
• Think about mapping between screen space and world space
• Perspective transform is not a linear transform
• Apply an approximated linear transform instead
(3D Gaussian Splatting for Real-Time Radiance Field Rendering; EWA Volume Splatting)
… is not too large in almost all cases
• A large splat on the edge may have noticeable errors
• Totally fine for small splats
• But one more critical corner case…
[Figure: Worst case scenario that I found]
… view space ( +x, -x ) (#^ω^)
$\boldsymbol{J} = \begin{pmatrix} -\frac{1}{p_z} & 0 & \frac{p_x}{p_z^2} \\ 0 & -\frac{1}{p_z} & \frac{p_y}{p_z^2} \end{pmatrix}$
The direct causes are those terms. But they work well in general cases, so I’d recommend doing ad hoc culling with $\mu$ for such splats.
[Figure: The view frustum]
[Figure: A rendering example. A splat appears but it is totally off from the view frustum]
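The corner case can be reproduced with a few lines. The sketch below evaluates $\boldsymbol{J}$ as written, projects a view space covariance with $J \Sigma J^T$, and adds a simple culling test on $\mu$. Several things here are assumptions and not from the slides: a unit focal length, a camera looking down $-z$ (which matches the signs in $\boldsymbol{J}$), the name `should_cull`, and the `margin` value.

```python
def jacobian(p):
    """J of the projection (x, y, z) -> (-x / z, -y / z), evaluated at p."""
    px, py, pz = p
    return [
        [-1.0 / pz, 0.0, px / (pz * pz)],
        [0.0, -1.0 / pz, py / (pz * pz)],
    ]


def project_covariance(sigma_view, p):
    """Approximate screen space covariance: J * Sigma * J^T."""
    J = jacobian(p)
    JS = [[sum(J[r][k] * sigma_view[k][c] for k in range(3)) for c in range(3)]
          for r in range(2)]
    return [[sum(JS[r][k] * J[c][k] for k in range(3)) for c in range(2)]
            for r in range(2)]


def should_cull(mu_view, tan_half_fov_x, tan_half_fov_y, margin=1.3):
    """Hypothetical ad hoc test: drop splats whose mean is far outside the frustum."""
    px, py, pz = mu_view
    if pz >= 0.0:  # at or behind the camera plane (camera looks down -z)
        return True
    depth = -pz
    return (abs(px) > margin * tan_half_fov_x * depth
            or abs(py) > margin * tan_half_fov_y * depth)


small = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.01]]
centered = project_covariance(small, (0.0, 0.0, -5.0))
off_axis = project_covariance(small, (20.0, 0.0, -1.0))
print(centered[0][0], off_axis[0][0])
```

The same small splat gets a screen space variance about four orders of magnitude larger once it sits close to the camera plane and far off axis, because the $p_x / p_z^2$ entry dominates. A test like `should_cull` removes it before that happens.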
… be temporally unstable
• Also numerically unstable
• The original paper’s implementation chose an ad hoc but simple antialiasing
[Figure: A small splat in a pixel; $(\boldsymbol{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu})$, with $\boldsymbol{x}$ as an evaluation point]
https://github.com/graphdeco-inria/diff-gaussian-rasterization/blob/59f5f77e3ddbac3ed9db93ec2cfe99ed6c5d121d/cuda_rasterizer/forward.cu#L110-L111
Adding a constant value to the screen space covariance matrix?
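A rough sketch of what "adding a constant value" does to a sub-pixel splat. The constant `eps=0.3` is what I believe the linked code adds to the diagonal, but treat it as an assumption here; the function names and the sample splat are made up.

```python
import math


def dilate(cov2d, eps=0.3):
    """Add a constant to the diagonal of a 2x2 screen space covariance."""
    (a, b), (_, c) = cov2d
    return [[a + eps, b], [b, c + eps]]


def kernel(cov2d, d):
    """exp(-0.5 * d^T Sigma^-1 d) for a 2x2 covariance and offset d = x - mu."""
    (a, b), (_, c) = cov2d
    det = a * c - b * b
    q = (c * d[0] * d[0] - 2.0 * b * d[0] * d[1] + a * d[1] * d[1]) / det
    return math.exp(-0.5 * q)


tiny = [[1e-6, 0.0], [0.0, 1e-6]]  # a splat far smaller than one pixel
offset = (0.5, 0.0)                # evaluation point half a pixel from the mean

before = kernel(tiny, offset)
after = kernel(dilate(tiny), offset)
print(before, after)
```

Without the constant, the splat contributes nothing unless a pixel center happens to land on it, which is the flickering case. With it, the determinant stays away from zero and nearby pixels see a smooth falloff.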
3D to 2D perspective projection of splats
Store splats to the list on 16 x 16 tiles
… tile
Evaluate Gaussian kernel and calculate color for each splat for each pixel
(Section 6, “Fast Differentiable Rasterizer for Gaussians”, 3D Gaussian Splatting for Real-Time Radiance Field Rendering)
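The "store splats to the list on 16 x 16 tiles" step can be sketched on the CPU as follows. This assumes each projected splat is summarized by a screen space center and a conservative radius in pixels, which is an assumption of this example; the names `tiles_touched` and `bin_splats` are hypothetical.

```python
import math


def tiles_touched(center, radius, width, height, tile=16):
    """Tile coordinates overlapped by the bounding square of a projected splat."""
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    x_begin = min(max(math.floor((center[0] - radius) / tile), 0), tiles_x)
    x_end = min(max(math.floor((center[0] + radius) / tile) + 1, 0), tiles_x)
    y_begin = min(max(math.floor((center[1] - radius) / tile), 0), tiles_y)
    y_end = min(max(math.floor((center[1] + radius) / tile) + 1, 0), tiles_y)
    return [(tx, ty) for ty in range(y_begin, y_end) for tx in range(x_begin, x_end)]


def bin_splats(splats, width, height):
    """Per-tile lists of splat indices. On a GPU the append would be an atomic add."""
    lists = {}
    for index, (center, radius) in enumerate(splats):
        for t in tiles_touched(center, radius, width, height):
            lists.setdefault(t, []).append(index)
    return lists


# Hypothetical projected splats on a 64 x 64 image: (center, radius in pixels).
splats = [((20.0, 20.0), 5.0), ((40.0, 8.0), 3.0), ((-100.0, -100.0), 5.0)]
tile_lists = bin_splats(splats, 64, 64)
print(tile_lists)
```

A splat that straddles tile borders is duplicated into every tile it touches, so the per-tile lists can grow much faster than the number of splats.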
• So just take a splat list and sort it per tile on shared memory?
(Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas)
However, it didn’t work
• The list of splats per tile is often quite long…
• Bitonic sort is simply slow for large lists
• Sorting easily overflows shared memory
• Decrease the tile size to reduce the number of splats?
• No. It increases the number of global atomicAdd calls, which is too expensive
Store splats to the list on 64 x 64 tiles
• SH evaluation ( approximation )
Z sorting of splats for each tile
• Onesweep radix sort, just once
Key: Tile Index ( 8-bit ), Depth ( 32-bit )
Value: Index of a splat
[Diagram: Sort → Tile ( 0, 0 ), Tile ( 1, 0 ), Tile ( 2, 0 ), …]
Evaluate Gaussian kernel and calculate color for each splat for each pixel
Evaluate splats for each tile/pixels
Use atomic counters on shared memory
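The key/value layout can be illustrated with a small sketch. It assumes the tile index sits in the upper bits and the depth's float bit pattern in the lower 32 bits, so one sort groups splats by tile and orders them by depth inside each tile. Python's `sorted` only stands in for the Onesweep radix sort here, and the helper names and sample entries are made up.

```python
import struct


def depth_bits(depth):
    """32-bit pattern of a float. For non-negative floats it sorts like the value."""
    return struct.unpack("<I", struct.pack("<f", depth))[0]


def make_key(tile_index, depth):
    """Assumed layout: 8-bit tile index above a 32-bit depth."""
    assert 0 <= tile_index < 256 and depth >= 0.0
    return (tile_index << 32) | depth_bits(depth)


def tile_ranges(sorted_pairs):
    """[begin, end) range of each tile inside the sorted key/value list."""
    ranges = {}
    for pos, (key, _) in enumerate(sorted_pairs):
        tile = key >> 32
        begin, _ = ranges.get(tile, (pos, pos))
        ranges[tile] = (begin, pos + 1)
    return ranges


# Hypothetical entries: (tile index, depth, splat index).
entries = [(1, 2.5, 10), (0, 7.0, 11), (1, 0.5, 12), (0, 3.0, 13)]
pairs = sorted((make_key(t, d), splat) for t, d, splat in entries)
order = [splat for _, splat in pairs]
ranges = tile_ranges(pairs)
print(order, ranges)
```

With a single global sort there is no per-tile sorting in shared memory at all; each tile just reads its contiguous range.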
• 32x32 for each tile for 2048x1536 resolution ( rounded up to the multiple of 16 )
• A splat can be evaluated by many unrelated pixels!!
• Do additional culling for each 8x8 tile ( x2 faster )
[Figure: a 32 px x 32 px tile, “Waste of splat evaluations”, and an 8 px x 8 px micro tile]
Load splats for each GPU block ( block size = 16x16 )
Culling unrelated splats for each 8x8 micro tile
Results as bit mask
Evaluate valid splats ( __ffs to find set bit )
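The "results as bit mask, then `__ffs`" loop looks roughly like this when emulated on the CPU. The `ffs` helper mirrors the convention of CUDA's `__ffs` (1-based position of the lowest set bit, 0 for an empty mask); the other names and the sample flags are hypothetical.

```python
def ffs(mask):
    """1-based position of the least significant set bit, 0 if mask is 0."""
    return (mask & -mask).bit_length()


def build_mask(keep_flags):
    """Pack per-splat culling results into one integer, bit i = splat i survives."""
    mask = 0
    for i, keep in enumerate(keep_flags):
        if keep:
            mask |= 1 << i
    return mask


def surviving_splats(mask):
    """Visit only the set bits, lowest first."""
    out = []
    while mask:
        out.append(ffs(mask) - 1)
        mask &= mask - 1  # clear the bit that was just visited
    return out


# Hypothetical culling results for five loaded splats in one micro tile.
mask = build_mask([False, True, True, False, True])
survivors = surviving_splats(mask)
print(bin(mask), survivors)
```

The pixel loop then touches only the splats that passed the micro tile test instead of branching on every loaded splat.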
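… conservative
• AABB-Ellipse test?
• No, there is a simpler way
Idea: Finding the maximum Gaussian kernel value in an AABB ( 8x8 tile ) tells us accurately whether we can cull the splat. Proposed by the StopThePop (SIGGRAPH 2024) paper.
A maximum Gaussian kernel value in an AABB: $G_{gs}(\boldsymbol{x}) = \exp\left(-\frac{1}{2} (\boldsymbol{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu})\right)$
↔ A minimum value of $(\boldsymbol{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu})$ = ( Mahalanobis distance )^2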
… direction
An insight: The minimum point must be found along the two lines ☺
$x = \mathrm{clamp}(\mu_x, \text{box left}, \text{box right})$, $y = \mathrm{clamp}(\mu_y, \text{box bottom}, \text{box top})$
The minimum point must be on an edge visible from $\mu$, not somewhere behind ※ if $\mu$ is outside
[Figure labels: closer, closer]
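The two-line insight can be turned into a small routine. This is a sketch under my own reading of the idea, not the StopThePop code: on each of the two clamped lines the squared Mahalanobis distance is a 1D quadratic, so its minimizer is found in closed form and clamped to the box. The inverse covariance is passed as `(a, b, c)` for $\Sigma^{-1} = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$, the box as `(left, bottom, right, top)`, and all names are hypothetical.

```python
import math


def clamp(v, lo, hi):
    return max(lo, min(v, hi))


def min_mahalanobis_sq(mu, inv_cov, box):
    """Minimum of (x - mu)^T Sigma^-1 (x - mu) over an axis-aligned box."""
    a, b, c = inv_cov
    left, bottom, right, top = box
    mx, my = mu

    def q(x, y):
        dx, dy = x - mx, y - my
        return a * dx * dx + 2.0 * b * dx * dy + c * dy * dy

    xc = clamp(mx, left, right)
    yc = clamp(my, bottom, top)
    if xc == mx and yc == my:
        return 0.0  # mu is inside the box
    # On the line x = xc, the quadratic in y is minimal at y = my - b * dx / c.
    y_star = clamp(my - b * (xc - mx) / c, bottom, top)
    # On the line y = yc, the quadratic in x is minimal at x = mx - b * dy / a.
    x_star = clamp(mx - b * (yc - my) / a, left, right)
    return min(q(xc, y_star), q(x_star, yc))


def max_kernel_in_box(mu, inv_cov, box):
    """Largest Gaussian kernel value that any point of the box can see."""
    return math.exp(-0.5 * min_mahalanobis_sq(mu, inv_cov, box))


# Hypothetical splat at the origin and a box to its right.
best = min_mahalanobis_sq((0.0, 0.0), (1.0, 0.5, 1.0), (1.0, -3.0, 2.0, 3.0))
print(best, max_kernel_in_box((0.0, 0.0), (1.0, 0.5, 1.0), (1.0, -3.0, 2.0, 3.0)))
```

If the maximum kernel value times the splat's opacity is below the contribution threshold, no pixel in the micro tile can see the splat and it can be culled without being conservative.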
• Easy to control its behavior
• Can be optimized by a simple idea/formula
• GPU implementation
• The white-box splat representation helps to optimize the implementation on the GPU too
• Thank you, linear algebra
• EWA Volume Splatting
• A geometric interpretation of the covariance matrix
• Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
• 驚くほどキレイな三次元シーン復元、「3D Gaussian Splatting」を徹底的に解説する (A thorough explanation of “3D Gaussian Splatting”, an astonishingly clean 3D scene reconstruction)