Hu Zhiming, Bulling Andreas, Li Sheng, Wang Guoping
IEEE Trans Vis Comput Graph. 2023 Apr;29(4):1992-2004. doi: 10.1109/TVCG.2021.3138902. Epub 2023 Feb 28.
Understanding human visual attention in immersive virtual reality (VR) is crucial for many important applications, including gaze prediction, gaze guidance, and gaze-contingent rendering. However, previous work on visual attention analysis typically explored only one specific VR task and paid little attention to the differences between tasks. Moreover, existing task recognition methods typically focused on 2D viewing conditions and only explored the effectiveness of human eye movements. In this work, we first collect eye and head movements of 30 participants performing four tasks, i.e., Free viewing, Visual search, Saliency, and Track, in 15 360-degree VR videos. Using this dataset, we analyze the patterns of human eye and head movements and reveal significant differences across tasks in terms of fixation duration, saccade amplitude, head rotation velocity, and eye-head coordination. We then propose EHTask, a novel learning-based method that employs eye and head movements to recognize user tasks in VR. We show that our method significantly outperforms state-of-the-art methods derived from 2D viewing conditions, both on our dataset (accuracy of 84.4% versus 62.8%) and on a real-world dataset (61.9% versus 44.1%). As such, our work provides meaningful insights into human visual attention under different VR tasks and guides future work on recognizing user tasks in VR.
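To make the recognition setup concrete, the following is a minimal sketch of a learning-based task classifier over windowed eye and head movement signals, in the spirit of what the abstract describes but not the paper's actual EHTask architecture: the GRU backbone, layer sizes, input features (gaze angles and head angular velocity), and window length are all illustrative assumptions.

```python
# Hypothetical sketch of a sequence classifier over eye and head movements.
# Assumptions (not from the paper): GRU backbone, 2D gaze angles in head
# coordinates, 3D head angular velocity, 250-sample windows, 4 task classes
# (Free viewing, Visual search, Saliency, Track).
import torch
import torch.nn as nn

class TaskClassifier(nn.Module):
    """Classifies a window of per-frame eye and head features into one of
    four VR tasks. All layer sizes are illustrative placeholders."""
    def __init__(self, eye_dim=2, head_dim=3, hidden=128, n_tasks=4):
        super().__init__()
        self.gru = nn.GRU(eye_dim + head_dim, hidden, num_layers=2,
                          batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_tasks)

    def forward(self, eye, head):
        # eye:  (batch, T, eye_dim)  e.g. gaze yaw/pitch in head coordinates
        # head: (batch, T, head_dim) e.g. head angular velocity (deg/s)
        x = torch.cat([eye, head], dim=-1)   # fuse eye and head channels
        out, _ = self.gru(x)                 # (batch, T, 2*hidden)
        return self.fc(out[:, -1])           # logits from the last timestep

# Usage: classify a batch of 250-sample windows (e.g. ~2.5 s at 100 Hz)
model = TaskClassifier()
eye = torch.randn(8, 250, 2)
head = torch.randn(8, 250, 3)
logits = model(eye, head)                    # (8, 4)
pred = logits.argmax(dim=-1)                 # predicted task per window
```

Fusing the eye and head channels before the recurrent layer mirrors the abstract's finding that eye-head coordination differs across tasks; a per-channel encoder with late fusion would be an equally plausible design.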