Mi Zhenxing, Yin Ping, Xiao Xue, Xu Dan
IEEE Trans Pattern Anal Mach Intell. 2025 Aug 27;PP. doi: 10.1109/TPAMI.2025.3603305.
Recent Neural Radiance Field (NeRF) methods for large-scale scenes have demonstrated promising results and underlined the importance of scene decomposition for scalable NeRFs. Although these methods achieve reasonable scalability, several critical problems remain unexplored in existing large-scale NeRF modeling, namely learnable decomposition, modeling of scene heterogeneity, and modeling efficiency. In this paper, we introduce Switch-NeRF++, a Heterogeneous Mixture of Hash Experts (HMoHE) network that addresses these challenges within a unified framework. Our framework is a highly scalable NeRF that learns heterogeneous decomposition and heterogeneous neural radiance fields efficiently for large-scale scenes in an end-to-end manner. A gating network learns to decompose scenes into partitions and allocates 3D points to specialized NeRF experts; it is co-optimized with the experts through our proposed Sparsely Gated Mixture-of-Experts (MoE) NeRF framework. The network architecture incorporates a hash-based gating network and distinct heterogeneous hash experts. The hash-based gating efficiently learns the decomposition of the large-scale scene, while the heterogeneous hash experts consist of hash grids with different resolution ranges, enabling effective learning of the heterogeneous representation of the decomposed scene parts within large, complex scenes. These design choices make our framework an end-to-end, highly scalable NeRF solution for real-world large-scale scene modeling that achieves both quality and efficiency. We evaluate the accuracy and scalability of our method on existing large-scale NeRF datasets and additionally introduce a new dataset of very large-scale scenes ($>6.5\,\text{km}^2$) from UrbanBIS. Extensive experiments demonstrate that our approach scales easily to various large-scale scenes and achieves state-of-the-art rendering accuracy. Furthermore, our method exhibits significant efficiency gains, with an 8x acceleration in training and a 16x acceleration in rendering compared with Switch-NeRF, the best-performing competitor. The code and trained models will be released at https://github.com/MiZhenxing/Switch-NeRF.
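To illustrate the idea described in the abstract, below is a minimal sketch, not the authors' implementation, of a sparsely gated mixture of hash experts in PyTorch: a gating network dispatches each 3D point to one of several hash-grid experts that cover different resolution ranges. The class names (HashGridExpert, MixtureOfHashExperts), the nearest-vertex hash lookup (no trilinear interpolation), and all hyperparameters are illustrative assumptions; the actual architecture is in the authors' repository linked above.

```python
# Hypothetical sketch of top-1 sparsely gated hash experts (not the official code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class HashGridExpert(nn.Module):
    """Toy multi-resolution hash encoding followed by a small MLP."""

    def __init__(self, n_levels, base_res, max_res,
                 table_size=2**14, feat_dim=2, hidden=64, out_dim=16):
        super().__init__()
        self.tables = nn.ModuleList(
            [nn.Embedding(table_size, feat_dim) for _ in range(n_levels)]
        )
        # Geometric progression of grid resolutions between base_res and max_res.
        growth = (max_res / base_res) ** (1.0 / max(n_levels - 1, 1))
        self.resolutions = [int(base_res * growth ** l) for l in range(n_levels)]
        self.table_size = table_size
        self.mlp = nn.Sequential(
            nn.Linear(n_levels * feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim)
        )

    def forward(self, x):  # x: (N, 3) points in [0, 1)^3
        feats = []
        primes = torch.tensor([1, 2654435761, 805459861], device=x.device)
        for res, table in zip(self.resolutions, self.tables):
            coords = (x * res).long()                             # nearest grid vertex (interpolation omitted for brevity)
            idx = (coords * primes).sum(-1) % self.table_size     # spatial hash into the embedding table
            feats.append(table(idx))
        return self.mlp(torch.cat(feats, dim=-1))


class MixtureOfHashExperts(nn.Module):
    """Top-1 sparsely gated dispatch of 3D points to heterogeneous hash experts."""

    def __init__(self, expert_cfgs):
        super().__init__()
        # Heterogeneous experts: each covers a different resolution range.
        self.experts = nn.ModuleList([HashGridExpert(**cfg) for cfg in expert_cfgs])
        # Hash-based gating, approximated here by a coarse hash encoder plus a linear head.
        self.gate_enc = HashGridExpert(n_levels=4, base_res=16, max_res=128, out_dim=16)
        self.gate_head = nn.Linear(16, len(self.experts))

    def forward(self, x):
        logits = self.gate_head(self.gate_enc(x))                 # (N, E) gating logits
        gates = F.softmax(logits, dim=-1)
        top_gate, top_idx = gates.max(dim=-1)                     # top-1 expert per point
        out = torch.zeros(x.shape[0], 16, device=x.device)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():
                # Scale by the gate value so the gating network receives gradients.
                out[mask] = expert(x[mask]) * top_gate[mask].unsqueeze(-1)
        return out


if __name__ == "__main__":
    cfgs = [
        dict(n_levels=8, base_res=16, max_res=512),               # coarser resolution range
        dict(n_levels=8, base_res=32, max_res=2048),              # finer resolution range
    ]
    model = MixtureOfHashExperts(cfgs)
    pts = torch.rand(1024, 3)
    print(model(pts).shape)                                       # torch.Size([1024, 16])
```

In this toy version the per-point top-1 routing keeps only one expert active per sample, which is what makes the mixture sparse and scalable; the paper's contribution lies in making such routing, the hash-based gating, and the heterogeneous resolution ranges trainable end-to-end and efficient, which this sketch does not attempt to reproduce.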