School of Artificial Intelligence, Guilin University of Electronic Technology, Jinji Road, Guilin 541004, China.
Faculty of Engineering and Environment, Northumbria University, Newcastle NE1 8ST, UK.
Sensors (Basel). 2022 Jul 21;22(14):5462. doi: 10.3390/s22145462.
Gaze estimation, a method to determine where a person is looking from the person's full face, is a valuable clue for understanding human intention. As in other domains of computer vision, deep learning (DL) methods have gained recognition in gaze estimation. However, gaze calibration problems remain in this domain, preventing existing methods from further improving their performance. An effective solution is to directly predict the difference information of two human eyes, as in the differential network (Diff-Nn). However, this solution loses accuracy when only one inference image is used. We propose a differential residual model (DRNet), combined with a new loss function, to make use of the difference information of two eye images; the difference information is treated as auxiliary information. We assess the proposed model (DRNet) mainly on two public datasets, (1) MpiiGaze and (2) Eyediap. Considering only eye features, DRNet outperforms the state-of-the-art gaze estimation methods, with angular errors of 4.57 and 6.14 on the MpiiGaze and Eyediap datasets, respectively. Furthermore, the experimental results also demonstrate that DRNet is extremely robust to noisy images.
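To make the general idea concrete, the sketch below illustrates one possible reading of a differential-residual setup: a shared encoder produces features for two eye images, a base head predicts gaze from one eye, and an auxiliary head predicts a residual correction from the feature difference of the two eyes. The module names, layer sizes, input resolution, and combination rule are illustrative assumptions for exposition only, not the authors' published DRNet architecture.

```python
# Illustrative sketch only: a toy "differential residual" gaze model.
# Architecture details below are assumptions and do NOT reproduce DRNet.
import torch
import torch.nn as nn

class EyeEncoder(nn.Module):
    """Small CNN mapping a 36x60 grayscale eye patch to a feature vector."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.fc = nn.Linear(64 * 9 * 15, feat_dim)

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

class ToyDiffResidualGaze(nn.Module):
    """Base gaze from one eye plus a residual from the two-eye feature difference."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.encoder = EyeEncoder(feat_dim)
        self.base_head = nn.Linear(feat_dim, 2)      # (pitch, yaw) from one eye
        self.residual_head = nn.Linear(feat_dim, 2)  # correction from the difference

    def forward(self, eye_a, eye_b):
        fa, fb = self.encoder(eye_a), self.encoder(eye_b)
        base = self.base_head(fa)
        residual = self.residual_head(fa - fb)       # difference used as auxiliary signal
        return base + residual

if __name__ == "__main__":
    model = ToyDiffResidualGaze()
    eye_a = torch.randn(4, 1, 36, 60)  # batch of eye patches (e.g., left eyes)
    eye_b = torch.randn(4, 1, 36, 60)  # paired eye patches (e.g., right eyes)
    gaze = model(eye_a, eye_b)         # predicted (pitch, yaw)
    print(gaze.shape)                  # torch.Size([4, 2])
```

In this toy setup the residual branch can be dropped at inference time when only one eye image is available, which is the single-image scenario the abstract contrasts with purely differential approaches.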