IEEE Trans Vis Comput Graph. 2022 Nov;28(11):3854-3864. doi: 10.1109/TVCG.2022.3203102. Epub 2022 Oct 21.
Virtual Reality (VR) is becoming ubiquitous with the rise of consumer displays and commercial VR platforms. Such displays require low-latency, high-quality rendering of synthetic imagery with reduced compute overhead. Recent advances in neural rendering have shown promise for unlocking new possibilities in 3D computer graphics via image-based representations of virtual or physical environments. In particular, neural radiance fields (NeRF) demonstrated that photo-realistic quality and continuous view changes of 3D scenes can be achieved without loss of view-dependent effects. While NeRF can significantly benefit rendering for VR applications, it faces unique challenges posed by wide field-of-view, high-resolution, and stereoscopic/egocentric viewing, which typically cause low quality and high latency in the rendered images. In VR, this not only harms the interaction experience but may also cause sickness. To tackle these problems and move toward six-degrees-of-freedom, egocentric, and stereo NeRF in VR, we present the first gaze-contingent 3D neural representation and view synthesis method. We incorporate the human psychophysics of visual and stereo acuity into an egocentric neural representation of 3D scenery. We then jointly optimize latency/performance and visual quality while mutually bridging human perception and neural scene synthesis to achieve perceptually high-quality immersive interaction. We conducted both objective analyses and subjective studies to evaluate the effectiveness of our approach. We find that our method significantly reduces latency (up to a 99% reduction in rendering time compared with NeRF) without loss of high-fidelity rendering (perceptually identical to the full-resolution ground truth). The presented approach may serve as a first step toward future VR/AR systems that capture, teleport, and visualize remote environments in real time.