Efficient Perspective-Correct 3D Gaussian Splatting
Using Hybrid Transparency
Abstract
3D Gaussian Splats (3DGS) have proven to be a versatile rendering primitive, both for inverse rendering and for real-time exploration
of scenes. In these applications, coherence across camera frames and multiple views is crucial, be it for robust convergence of a
scene reconstruction or for artifact-free fly-throughs. Recent work has started to mitigate artifacts that break multi-view coherence,
including popping artifacts due to inconsistent transparency sorting, and to provide perspective-correct outlines of (2D) splats. At the same
time, real-time requirements have forced such implementations to accept compromises in how the transparency of large assemblies of
3D Gaussians is resolved, in turn breaking coherence in other ways.
In our work, we aim to achieve maximum coherence by rendering fully perspective-correct 3D Gaussians while using hybrid transparency,
a high-quality per-pixel approximation of accurate blending, to retain real-time frame rates.
Our fast and perspectively accurate approach for evaluating 3D Gaussians requires no matrix inversions, ensuring
numerical stability and eliminating the need for special handling of degenerate splats, and our hybrid transparency formulation
for blending maintains quality similar to fully resolved per-pixel transparency at a fraction of the rendering cost.
We further show that each of these two components can be independently integrated into Gaussian splatting systems.
In combination, they achieve up to 2× higher frame rates, 2× faster optimization, and equal or better image quality
with fewer rendering artifacts compared to traditional 3DGS on common benchmarks.
Accurate Splat Bounding and Evaluation
Although the affine approximation 3DGS uses for the projection of 3D Gaussians onto the image plane performs well on benchmark datasets,
it fails to model perspective distortion correctly, especially when parts of the scene are viewed at close distances.
The result is visually disturbing artifacts, where the projected Gaussians take on extreme, distorted shapes that severely degrade rendering quality.
We propose a fast, differentiable method for perspective-accurate 3D Gaussian splat evaluation at the point of maximum contribution
along per-pixel viewing rays that avoids matrix inversion entirely by extending established techniques [SWBG06, WHA*07].
The perspectively correct screen-space bounding box of a splat (a) is given by the projection of its bounding frustum in view space (b). Transformed into local splat coordinates, the frustum planes align with tangential planes of the unit sphere (c). Our approach for splat evaluation along viewing rays uses the Plücker coordinate representation (𝒅 : 𝒎). In local splat coordinates, the point along the ray that maximizes the Gaussian's value is the point 𝒙 that minimizes the perpendicular distance ∥𝒙∥ to the origin (d). Parts (a-c) courtesy of Weyrich et al. [WHA*07]; used with permission.
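To make the evaluation step concrete, below is a minimal Python sketch of how the peak contribution along a ray can be computed from the Plücker moment, assuming a splat parameterized by its mean, a rotation matrix, and per-axis scales; the function and variable names are illustrative, not the paper's code.

import numpy as np

def max_gaussian_along_ray(o, v, mu, R, s):
    """Peak value of a 3D Gaussian splat along the ray o + t*v.

    o, v : ray origin and direction in world space
    mu   : splat mean; R : 3x3 rotation (local -> world); s : per-axis scales
    """
    # Map the ray into local splat coordinates, where the Gaussian becomes
    # a unit isotropic one: x_local = diag(1/s) @ R.T @ (x_world - mu).
    # Only a transpose and a diagonal scaling are used, never an inverse.
    d = (R.T @ v) / s
    p = (R.T @ (o - mu)) / s
    # Plücker representation (d : m) of the local ray, with moment m = p x d.
    m = np.cross(p, d)
    # The point x on the ray closest to the origin, x = (d x m) / |d|^2,
    # maximizes the Gaussian; its squared distance is |m|^2 / |d|^2.
    dist_sq = np.dot(m, m) / np.dot(d, d)
    return np.exp(-0.5 * dist_sq)

Because only transposes and diagonal scalings appear, the computation remains numerically stable even for degenerate (near-flat) splats, which is what eliminates the special-case handling mentioned above.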
Temporally-Stable Rendering via Hybrid Transparency
We propose to use the established rendering paradigm of Hybrid Transparency [MCTB13], which provides high quality and performance while avoiding the global depth presorting used in 3DGS.
By alpha-blending the first 𝐾 fragments per pixel (the core) in correct depth order and accumulating the remaining contributions (the tail) into an order-independent residual, our method mitigates popping artifacts while maintaining superior performance.
Visual comparisons for different model configurations of our hybrid transparency approach. Using a smaller core size 𝐾 causes issues for reflective surfaces, as radiance fields commonly model these via semi-transparency. Disabling the order-independent tail reduces quality only slightly, most visibly in the sky, whereas omitting it during optimization results in catastrophic failure.
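To make the core/tail split concrete, the following Python sketch composites a single pixel with hybrid transparency as we describe it; the core size K = 4, the fragment layout, and the function name are illustrative assumptions, not the paper's implementation.

import math
import numpy as np

K = 4  # core size; illustrative choice, the optimal value may differ

def composite_pixel(fragments, background):
    """Blend one pixel: exact K-fragment core + order-independent tail.

    fragments  : iterable of (depth, rgb, alpha) per-splat contributions
    background : background rgb color
    """
    frags = sorted(fragments, key=lambda f: f[0])  # front to back
    core, tail = frags[:K], frags[K:]
    # A GPU version instead keeps the K nearest fragments via per-fragment
    # insertion and routes evicted fragments directly into the tail.

    # Order-independent tail: the transmittance product and the
    # alpha-weighted color sum are both commutative, so no sorting is needed.
    tail_alpha = 1.0 - math.prod(1.0 - a for _, _, a in tail)
    weight = sum(a for _, _, a in tail)
    tail_rgb = (sum(a * np.asarray(c, float) for _, c, a in tail) / weight
                if weight > 0.0 else np.zeros(3))

    # Front-to-back alpha blending: core first, then the collapsed tail,
    # then the background under the remaining transmittance.
    color, T = np.zeros(3), 1.0
    for _, c, a in core:
        color += T * a * np.asarray(c, float)
        T *= 1.0 - a
    color += T * tail_alpha * tail_rgb
    T *= 1.0 - tail_alpha
    return color + T * np.asarray(background, float)

Because every tail operation is commutative, fragments beyond the core can arrive in any order; only the ordering within the small core matters, which is what removes the global per-frame sort and the popping it causes.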
Quantitative Results
Quantitative comparisons on the Mip-NeRF360 and Tanks and Temples datasets. Combining perspectively correct splat evaluation with hybrid transparency significantly reduces training and rendering times while keeping image quality on par with the baselines. Excluding Zip-NeRF, the three best results are highlighted in green, in descending order of saturation.
Visual Comparisons
Concurrent Work
EVER also addresses limitations of 3D Gaussian splatting. It shows that 3D Gaussians can be replaced with constant-density ellipsoids, which permits exact volume rendering. Compared to our approach, their rendering is slower because they use a ray tracing framework, but the volumetric rendering significantly improves image quality. Also check out the recent Taming 3DGS, which proposes a controllable densification strategy alongside multiple improvements to reduce training times. We believe future work could combine these ideas with our approach for even better results.
Citation
@article{hahlbohm2024htgs,
  title={Efficient Perspective-Correct 3D Gaussian Splatting Using Hybrid Transparency},
  author={Florian Hahlbohm and Fabian Friederichs and Tim Weyrich and Linus Franke and Moritz Kappel and Susana Castillo and Marc Stamminger and Martin Eisemann and Marcus Magnor},
  journal={arXiv},
  year={2024}
}
Acknowledgements
We would like to thank Timon Scholz and Carlotta Harms for their help with comparisons and the supplemental material.
The authors gratefully acknowledge financial support from the German Research Foundation (DFG) for the projects “Real-Action VR” (ID 523421583) and “Increasing Realism of Omnidirectional Videos in Virtual Reality” (ID 491805996), as well as from the L3S Research Center, Hanover, Germany.
Linus Franke was supported by the 5G innovation program of the German Federal Ministry for Digital and Transport under the funding code 165GU103B.
All scenes shown above are from the Mip-NeRF360 and Tanks and Temples datasets. The website template was adapted from Zip-NeRF. For the comparison sliders, we use img-comparison-slider and the video comparison tool from Ref-NeRF.