Song Liangchen, Chen Anpei, Li Zhong, Chen Zhang, Chen Lele, Yuan Junsong, Xu Yi, Geiger Andreas
IEEE Trans Vis Comput Graph. 2023 May;29(5):2732-2742. doi: 10.1109/TVCG.2023.3247082. Epub 2023 Mar 29.
Freely exploring a real-world 4D spatiotemporal space in VR has been a long-standing quest. The task is especially appealing when only a few, or even a single, RGB camera is used to capture the dynamic scene. To this end, we present an efficient framework capable of fast reconstruction, compact modeling, and streamable rendering. First, we propose to decompose the 4D spatiotemporal space according to temporal characteristics: points in the 4D space are associated with probabilities of belonging to three categories (static, deforming, and new areas), and each area is represented and regularized by a separate neural field. Second, we propose a hybrid-representation-based feature streaming scheme for efficiently modeling the neural fields. Our approach, coined NeRFPlayer, is evaluated on dynamic scenes captured by a single hand-held camera or by multi-camera arrays, achieving rendering quality and speed comparable to or better than recent state-of-the-art methods, with reconstruction in 10 seconds per frame and interactive rendering. Project website: https://bit.ly/nerfplayer.
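The decomposition described in the abstract can be pictured with a minimal sketch, assuming a PyTorch-style implementation: three small neural fields (static, deforming, new) are queried at each 4D point and blended by predicted per-point category probabilities. All names below (`TinyField`, `DecomposedField`, `prob_head`) are hypothetical illustrations, not the authors' code; the actual NeRFPlayer representation, regularization, and streaming scheme are more involved.

```python
import torch
import torch.nn as nn

class TinyField(nn.Module):
    """Toy per-category neural field: maps a 4D point (x, y, z, t)
    to density and RGB. Stands in for one of the paper's three fields."""
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 1 density + 3 color channels
        )

    def forward(self, xyzt):
        out = self.mlp(xyzt)
        sigma = torch.relu(out[..., :1])   # non-negative density
        rgb = torch.sigmoid(out[..., 1:])  # colors in [0, 1]
        return sigma, rgb

class DecomposedField(nn.Module):
    """Blend static / deforming / new fields with per-point category
    probabilities (hypothetical structure, inferred from the abstract)."""
    def __init__(self):
        super().__init__()
        self.fields = nn.ModuleList([TinyField() for _ in range(3)])
        self.prob_head = nn.Sequential(  # predicts the 3 category probabilities
            nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 3)
        )

    def forward(self, xyzt):
        probs = torch.softmax(self.prob_head(xyzt), dim=-1)       # (N, 3)
        sigmas, rgbs = zip(*(f(xyzt) for f in self.fields))       # 3 x (N, 1), 3 x (N, 3)
        sigma = sum(p.unsqueeze(-1) * s
                    for p, s in zip(probs.unbind(-1), sigmas))    # (N, 1)
        rgb = sum(p.unsqueeze(-1) * c
                  for p, c in zip(probs.unbind(-1), rgbs))        # (N, 3)
        return sigma, rgb, probs  # probs are where a regularizer would attach

# Usage: query random (x, y, z, t) samples along rays.
pts = torch.rand(1024, 4)
sigma, rgb, probs = DecomposedField()(pts)
```

The design point this illustrates is that the probability head turns a hard static/deforming/new segmentation into a soft, differentiable one, so each sub-field can be regularized separately while the blended output is still trained end to end.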