IEEE Trans Vis Comput Graph. 2022 May;28(5):2157-2167. doi: 10.1109/TVCG.2022.3150522. Epub 2022 Apr 8.
Media streaming in an edge-cloud setting has been adopted for a variety of applications such as entertainment, visualization, and design. Unlike video/audio streaming, where the content is usually consumed passively, virtual reality applications require 3D assets stored on the edge to facilitate frequent edge-side interactions such as object manipulation and viewpoint movement. Compared to audio and video streaming, 3D asset streaming often requires larger data sizes yet lower latency to ensure rendering quality, resolution, and responsiveness sufficient for perceptual comfort. Thus, streaming 3D assets faces considerably greater challenges than streaming audio or video, and existing solutions often suffer from long loading times or limited quality. To address this challenge, we propose a perceptually-optimized progressive 3D streaming method for spatial quality and temporal consistency in immersive interactions. On the cloud side, our main idea is to estimate perceptual importance in 2D image space based on user gaze behaviors, including where users are looking and how their eyes move. The estimated importance is then mapped to 3D object space to schedule the streaming priorities for edge-side rendering. Since this computational pipeline could be heavy, we also develop a simple neural network to accelerate the cloud-side scheduling process. We evaluate our method via subjective studies and objective analysis under varying network conditions (from 3G to 5G) and edge devices (HMD and traditional displays), and demonstrate better visual quality and temporal consistency than alternative solutions.
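The pipeline summarized in the abstract (estimate importance in 2D image space from gaze, map it to 3D objects, then order the stream) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian falloff around the gaze point, the per-pixel object-ID buffer, and the names `pixel_importance`, `object_importance`, and `streaming_order` are all assumptions made for illustration. The eye-movement component and the neural-network accelerator mentioned in the abstract are omitted.

```python
import math
from collections import defaultdict

def pixel_importance(px, py, gaze, sigma):
    """Assumed proxy: importance falls off as a Gaussian of distance to the gaze point."""
    dx, dy = px - gaze[0], py - gaze[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))

def object_importance(id_buffer, gaze, sigma=2.0):
    """Map 2D importance to objects by summing it over the pixels each object covers.

    id_buffer[y][x] holds the ID of the object visible at that pixel,
    or None for background (a hypothetical render-side input).
    """
    scores = defaultdict(float)
    for y, row in enumerate(id_buffer):
        for x, obj in enumerate(row):
            if obj is not None:
                scores[obj] += pixel_importance(x, y, gaze, sigma)
    return dict(scores)

def streaming_order(scores):
    """Return object IDs sorted from most to least important."""
    return sorted(scores, key=lambda obj: scores[obj], reverse=True)

# Toy 4x3 frame: object "a" covers the top-left, object "b" the right column.
id_buffer = [
    ["a", "a", None, "b"],
    ["a", "a", None, "b"],
    [None, None, None, "b"],
]

# Gaze given as (x, y) in pixel coordinates; here the user looks at the right column.
order = streaming_order(object_importance(id_buffer, gaze=(3, 1)))
```

With the gaze on the right column, object `b` accumulates more importance than `a` and would be scheduled first; moving the gaze to the top-left corner reverses the order. A practical scheduler would presumably also weigh asset size and available bandwidth, which this sketch leaves out.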