Under zoom-in, 3D Gaussian Splatting exhibits significant erosion artifacts, while under zoom-out it undergoes dramatic expansion. Mip-Splatting uses a 3D smoothing filter and a 2D Mip filter to regularize primitives during training. In contrast, our method is training-free and maintains scale consistency with a 2D scale-adaptive filter. We additionally use super-sampling and integration to obtain more accurate results when zooming out.
In this paper, we present a Scale-adaptive method for Anti-aliasing Gaussian Splatting (SA-GS). While the state-of-the-art method Mip-Splatting requires modifying the training procedure of Gaussian Splatting, our method operates at test time and is training-free. Specifically, SA-GS can be applied as a plugin to any pretrained Gaussian Splatting field to significantly improve its anti-aliasing performance. The core technique is to apply a 2D scale-adaptive filter to each Gaussian at test time. As pointed out by Mip-Splatting, observing Gaussians at different frequencies readily leads to mismatches between Gaussian scales during training and testing. Mip-Splatting resolves this issue with 3D smoothing and 2D Mip filters, which are unfortunately unaware of the testing frequency. In this work, we show that a 2D scale-adaptive filter that is informed of the testing frequency can effectively match the Gaussian scale, keeping the Gaussian field distributed consistently across different testing frequencies. Once scale inconsistency is eliminated, sampling rates lower than the scene frequency still produce conventional jaggedness, so we propose to integrate the projected 2D Gaussian over each pixel at test time. This integration is in fact a limiting case of super-sampling: it elegantly avoids the heavy computational overhead of super-sampling while achieving significantly better anti-aliasing than vanilla Gaussian Splatting. Through extensive experiments in various settings on both bounded and unbounded scenes, we show that SA-GS performs comparably to Mip-Splatting, and significantly outperforms it when the memory overhead of super-sampling is acceptable. Note that super-sampling is effective only when our scale-adaptive filtering is activated.
All Gaussian Splatting models share this framework for training and rendering, but different models use different strategies to constrain Gaussian primitives. During training, 3DGS applies (c) in pixel space to stabilize training, but this causes scale inconsistencies across rendering settings. Mip-Splatting utilizes (a) to restrict the upper Gaussian frequency limit in 3D space and (b) to emulate box filtering in pixel space, yet it still suffers from scale inconsistency and requires retraining. Our approach is training-free and operates only on the testing flow: we apply (d) in pixel space to maintain the scale consistency of the Gaussian primitives, and further enhance the anti-aliasing capability of 3DGS by applying (e) and (f) in the alpha-blending process.
The 2D dilation operates in screen space and ensures that the 2D Gaussian covariance remains positive-definite. However, a fixed 2D dilation causes scale ambiguities when the same scene is rendered under different settings, as shown by the green expansion area: (a) when the Gaussian scale is held constant and the resolution changes, the dilation scale changes inconsistently; (b) when the Gaussian scale changes and the resolution remains constant, the dilation scale does not change with the Gaussian. Our 2D scale-adaptive filter keeps the Gaussian distribution consistent across different rendering settings, as shown by the red expansion area, preserving the scale seen during training.
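A minimal sketch of what such a scale-adaptive dilation could look like, assuming (hypothetically) that the fixed dilation variance of 3DGS is rescaled with the square of the change in projected pixel scale so that the dilated-to-original ratio matches the training setup; the function name, the `scale_ratio` argument, and the quadratic rule are illustrative assumptions, not the paper's exact formula:

```python
import numpy as np

def adaptive_dilation(cov2d, s0=0.3, scale_ratio=1.0):
    """Hypothetical scale-adaptive 2D dilation.

    cov2d:       2x2 projected Gaussian covariance in pixel units.
    s0:          fixed dilation variance used during training (3DGS default 0.3).
    scale_ratio: ratio of the Gaussian's projected pixel scale at test time
                 vs. at training time (changes with focal length / resolution).
    """
    # Grow the low-pass kernel with the projection, so the proportion of
    # dilation to Gaussian extent stays what the model saw during training.
    s = s0 * scale_ratio ** 2  # variance scales quadratically with pixel scale
    return cov2d + s * np.eye(2)
```

With `scale_ratio=1.0` this reduces to the standard fixed dilation; rendering at, say, twice the training pixel density would pass `scale_ratio=2.0`, enlarging the kernel consistently.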
Our super-sampling method, denoted as (a), divides each pixel thread into 9 sub-pixels when traversing the depth-ordered Gaussians within a tile. Each sub-pixel independently performs alpha-blending and weights the Gaussian spherical harmonic coefficients according to its sampling results. (b) is our integration method, which diagonalizes the Gaussian covariance matrix via a pixel rotation, decomposing the integral into the product of two marginal 1D Gaussian distributions.
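The integration in (b) can be sketched as follows, assuming an unnormalized 2D Gaussian (peak value 1, as in 3DGS alpha evaluation) and approximating the rotated pixel by an axis-aligned unit square in the Gaussian's eigenframe; the function name and normalization are illustrative, not the authors' exact implementation:

```python
import numpy as np
from math import erf, sqrt, pi

def pixel_integral(mean, cov, px, py):
    """Integrate an unnormalized 2D Gaussian exp(-0.5 d^T cov^-1 d)
    over the unit pixel centered at (px, py), by diagonalizing cov."""
    # Eigendecomposition rotates pixel coordinates into the Gaussian's axes,
    # making the covariance diagonal in the rotated frame.
    vals, vecs = np.linalg.eigh(cov)
    d = vecs.T @ (np.array([px, py], dtype=float) - mean)  # center, eigenframe
    # Approximate the rotated pixel by an axis-aligned unit square, so the
    # 2D integral factorizes into two 1D marginal Gaussian integrals.
    area = 1.0
    for di, vi in zip(d, vals):
        s = sqrt(vi)
        lo = (di - 0.5) / (s * sqrt(2.0))
        hi = (di + 0.5) / (s * sqrt(2.0))
        area *= s * sqrt(pi / 2.0) * (erf(hi) - erf(lo))
    return area
```

For a Gaussian much larger than a pixel, the integral over one pixel approaches the point-sampled value times the pixel area, which is why this can be viewed as the limit of ever-finer super-sampling.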
Replacing the 2D dilation of 3DGS with an EWA (elliptical weighted average) filter, denoted 3DGS + EWA, reduces the dilation and erosion artifacts. However, it produces high-frequency artifacts when zooming in, whereas our method is free of such artifacts, as shown in the following comparisons.
Here, we show more comparisons with 3DGS + EWA. Both models are trained on images downsampled by a factor of 8 and rendered at higher resolution. GT (Training resolution) is the training image, bilinearly upsampled to the higher resolution for reference; GT (8x resolution) is the real GT image used for evaluation.
Our 2D Mip filter simulates a 2D box filter in the physical imaging process. Its extent approximates exactly one pixel in screen space, effectively reducing aliasing artifacts. As shown in the following video, removing the 2D Mip filter results in aliasing artifacts when zooming out.
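A sketch of this filter, assuming the form described in the Mip-Splatting paper: the projected Gaussian is convolved with an isotropic Gaussian approximating a one-pixel box, and the opacity is rescaled so the Gaussian's total energy is preserved; the kernel variance `s` is treated as a hyperparameter here, and the function name is illustrative:

```python
import numpy as np

def mip_filter_2d(cov2d, s=0.1):
    """Approximate a 1-pixel box filter on a projected 2D Gaussian.

    cov2d: 2x2 projected covariance in pixel units.
    s:     variance of the isotropic low-pass kernel (assumed value).
    Returns the filtered covariance and the opacity rescaling factor
    sqrt(det(cov) / det(cov + s*I)) that keeps total energy constant.
    """
    cov_f = cov2d + s * np.eye(2)  # convolution of Gaussians adds covariances
    scale = np.sqrt(np.linalg.det(cov2d) / np.linalg.det(cov_f))
    return cov_f, scale
```

Because convolving two Gaussians simply adds their covariances, the box filter can be emulated at negligible cost during splatting.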
The 3D smoothing filter constrains the size of the 3D Gaussian primitives according to the maximal sampling frequency induced by the training views, eliminating high-frequency artifacts when zooming in. In the following comparisons, we train the models on downsampled images and render high-resolution images to simulate zoom-in effects; excluding the 3D smoothing filter results in high-frequency artifacts. Note that both models are trained on images downsampled by a factor of 8 and rendered at higher resolution. GT (Training resolution) is the training image, bilinearly upsampled to the higher resolution for reference; GT (8x resolution) is the real GT image used for evaluation.
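A sketch of this constraint, under the assumption that the low-pass kernel variance is set by the maximal sampling rate a Gaussian sees across training views (roughly focal length divided by depth); both the hyperparameter `s` and the exact rate definition are assumptions for illustration:

```python
import numpy as np

def smooth_3d(cov3d, max_sampling_rate, s=0.2):
    """Low-pass a 3D Gaussian so it cannot be sharper than the training
    views could observe.

    cov3d:             3x3 covariance of the Gaussian in world units.
    max_sampling_rate: maximal sampling rate over training views
                       (assumed ~ focal_length / depth, in samples/unit).
    s:                 assumed smoothing hyperparameter.
    """
    var = s / max_sampling_rate ** 2   # kernel width shrinks as sampling densifies
    cov_f = cov3d + var * np.eye(3)    # Gaussian convolution adds covariances
    # Rescale opacity so the smoothed Gaussian keeps its total energy.
    scale = np.sqrt(np.linalg.det(cov3d) / np.linalg.det(cov_f))
    return cov_f, scale
```

Since the kernel variance is inversely proportional to the squared sampling rate, Gaussians observed close-up (high rate) are barely touched, while distant, coarsely sampled ones are smoothed the most.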
@article{Yu2023MipSplatting,
author = {Yu, Zehao and Chen, Anpei and Huang, Binbin and Sattler, Torsten and Geiger, Andreas},
title = {Mip-Splatting: Alias-free 3D Gaussian Splatting},
journal = {arXiv:2311.16493},
year = {2023},
}