Phong Nguyen-Ha, Lam Huynh, Esa Rahtu, Jiri Matas, Janne Heikkilä
IEEE Trans Pattern Anal Mach Intell. 2024 May;46(5):2758-2769. doi: 10.1109/TPAMI.2023.3335311. Epub 2024 Apr 3.
We present CG-NeRF, a cascade and generalizable neural radiance fields method for view synthesis. Recent generalizable view synthesis methods can render high-quality novel views from a set of nearby input views, but rendering remains slow because neural radiance fields sample points uniformly along each ray. Existing scene-specific methods train and render novel views efficiently but cannot generalize to unseen data. Our approach addresses fast, generalizable view synthesis with two novel modules: a coarse radiance field predictor and a convolution-based neural renderer. This architecture infers consistent scene geometry from the implicit neural field and renders new views efficiently on a single GPU. We first train CG-NeRF on multiple 3D scenes of the DTU dataset; using only photometric losses, the network then produces high-quality, accurate novel views on unseen real and synthetic data. Moreover, our method can leverage a denser set of reference images of a single scene to produce accurate novel views without relying on additional explicit representations, while retaining the high rendering speed of the pre-trained model. Experimental results show that CG-NeRF outperforms state-of-the-art generalizable neural rendering methods on various synthetic and real datasets.
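To make the two-module design concrete, here is a minimal PyTorch sketch of a pipeline of this kind: a small MLP predicts density and a feature vector at a few coarse samples per ray, the per-ray features are alpha-composited into a low-resolution feature image, and a convolutional renderer decodes that image to RGB instead of densely ray-marching every pixel. All layer sizes, the sample count, and the module names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CoarseRadianceFieldPredictor(nn.Module):
    """Predicts density and a feature vector for a small number of
    coarse samples along each ray (hypothetical layer sizes)."""
    def __init__(self, in_dim=3, feat_dim=32, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(hidden, 1)        # per-point density
        self.feat_head = nn.Linear(hidden, feat_dim)  # per-point feature

    def forward(self, pts):
        h = self.mlp(pts)
        return self.sigma_head(h), self.feat_head(h)

class ConvNeuralRenderer(nn.Module):
    """Decodes a low-resolution feature image to RGB with plain
    convolutions (hypothetical architecture)."""
    def __init__(self, feat_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(feat_dim, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, feat_img):
        return self.net(feat_img)

def composite(sigma, feats, deltas):
    """Standard volume-rendering alpha compositing of per-sample
    features along each ray: w_i = alpha_i * prod_{j<i}(1 - alpha_j)."""
    alpha = 1.0 - torch.exp(-torch.relu(sigma.squeeze(-1)) * deltas)  # (R, S)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=1),
        dim=1)[:, :-1]                                                # (R, S)
    weights = alpha * trans
    return (weights.unsqueeze(-1) * feats).sum(dim=1)                 # (R, F)

# Toy forward pass: H*W rays, only S coarse samples per ray.
H = W = 16
S = 8                                     # far fewer than dense NeRF sampling
pts = torch.rand(H * W, S, 3)             # 3D sample positions along each ray
deltas = torch.full((H * W, S), 1.0 / S)  # spacing between adjacent samples

predictor = CoarseRadianceFieldPredictor()
renderer = ConvNeuralRenderer()

sigma, feats = predictor(pts)
ray_feats = composite(sigma, feats, deltas)    # (H*W, F) composited features
feat_img = ray_feats.t().reshape(1, -1, H, W)  # (1, F, H, W) feature image
rgb = renderer(feat_img)                       # (1, 3, H, W) rendered view
print(rgb.shape)
```

The speed argument is visible in the structure: the expensive per-point MLP runs on only a handful of coarse samples per ray, and the remaining per-pixel work is shifted to cheap 2D convolutions over the composited feature image.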