Falch Lucas, Lohan Katrin Solveig
Institute for the Development of Mechatronic Systems EMS, Eastern Switzerland University of Applied Sciences (OST), Buchs, Switzerland.
Front Robot AI. 2024 Apr 2;11:1369566. doi: 10.3389/frobt.2024.1369566. eCollection 2024.
This paper presents a novel webcam-based approach for gaze estimation on computer screens. Using appearance-based gaze estimation models, the system provides a method for mapping the gaze vector from the user's perspective onto the computer screen. Notably, it determines the user's 3D position in front of the screen using only a 2D webcam, without the need for additional markers or equipment. The study presents a comparative analysis that assesses the performance of the proposed method against established eye tracking solutions. This includes a direct comparison with the purpose-built Tobii Eye Tracker 5, a high-end hardware solution, and with the webcam-based GazeRecorder software. In experiments replicating head movements, especially those imitating yaw rotations, the study highlights the inherent difficulty of tracking such motions with 2D webcams. To address this, the research proposes integrating Structure from Motion (SfM) into the Convolutional Neural Network (CNN) model. The study's contributions include demonstrating the potential for accurate screen gaze tracking with a simple webcam, presenting a novel approach for physical distance computation, and proposing compensation for head movements, laying the groundwork for advances in real-world gaze estimation scenarios.
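The abstract does not specify how the gaze vector is mapped onto the screen. One common way to do this, shown here only as an illustrative sketch and not as the authors' method, is to intersect the gaze ray with the screen plane. The function name `gaze_to_screen`, the coordinate frame, and the millimetre units are assumptions made for this example; the eye position and gaze direction are assumed to be already known in the same 3D frame as the screen.

```python
# Illustrative sketch (not the paper's implementation): map a 3D gaze ray
# onto a planar screen by ray-plane intersection. All names are hypothetical.

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def gaze_to_screen(eye_pos, gaze_dir, screen_origin, screen_x, screen_y):
    """Return (u, v) coordinates of the gaze point on the screen plane.

    eye_pos       -- 3D eye position (assumed units: mm)
    gaze_dir      -- 3D gaze direction (need not be normalized)
    screen_origin -- 3D position of the screen's reference corner
    screen_x/y    -- orthonormal unit vectors spanning the screen plane
    Returns None if the gaze is parallel to the screen or points away from it.
    """
    normal = cross(screen_x, screen_y)
    denom = dot(normal, gaze_dir)
    if abs(denom) < 1e-9:
        return None  # gaze ray is parallel to the screen plane
    to_plane = tuple(s - e for s, e in zip(screen_origin, eye_pos))
    t = dot(normal, to_plane) / denom
    if t <= 0:
        return None  # intersection lies behind the eye
    hit = tuple(e + t * d for e, d in zip(eye_pos, gaze_dir))
    rel = tuple(h - s for h, s in zip(hit, screen_origin))
    return dot(rel, screen_x), dot(rel, screen_y)
```

For example, with the eye 600 mm in front of the screen and looking straight at it, `gaze_to_screen((100, 50, 600), (0, 0, -1), (0, 0, 0), (1, 0, 0), (0, 1, 0))` yields the point directly opposite the eye. Converting the resulting physical coordinates to pixels would additionally require the screen's physical size and resolution, which are not given here.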