I2DNet - Design and Real-Time Evaluation of Appearance-based gaze estimation system.

Authors

Murthy L R D, Brahmbhatt Siddhi, Arjun Somnath, Biswas Pradipta

Affiliations

3D Lab, CPDM, Indian Institute of Science, Bangalore, India.

Information Technology, G H Patel College of Engineering and Technology, India.

Publication

J Eye Mov Res. 2021 Aug 31;14(4). doi: 10.16910/jemr.14.4.2. eCollection 2021.

Abstract

The gaze estimation problem can be addressed with either model-based or appearance-based approaches. Model-based approaches rely on features extracted from eye images to fit a 3D eyeball model and obtain a gaze-point estimate, while appearance-based methods attempt to map captured eye images directly to a gaze point without any handcrafted features. Recently, the availability of large datasets and novel deep learning techniques has allowed appearance-based methods to achieve higher accuracy than model-based approaches. However, many appearance-based gaze estimation systems perform well in within-dataset validation but fail to provide the same degree of accuracy in cross-dataset evaluation. Hence, it is still unclear how well current state-of-the-art approaches perform in real time in an interactive setting on unseen users. This paper proposes I2DNet, a novel architecture aimed at improving subject-independent gaze estimation accuracy, which achieved state-of-the-art mean angular errors of 4.3 and 8.4 degrees on the MPIIGaze and RT-Gene datasets respectively. We evaluated the proposed system as a real-time gaze-controlled interface on a 9-block pointing and selection task and compared it with Webgazer.js and OpenFace 2.0. In a user study with 16 participants, our proposed system reduced selection time and the number of missed selections statistically significantly compared with the other two systems.
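For reference, the mean angular error reported above is the standard benchmark metric for 3D gaze estimation: the angle between the predicted and ground-truth gaze direction vectors, averaged over all test samples. Below is a minimal sketch of how that metric is conventionally computed; this is the generic formula, not code released with the paper, and the function name and array shapes are illustrative assumptions.

import numpy as np

def mean_angular_error(pred, gt):
    # pred, gt: (N, 3) arrays of predicted and ground-truth 3D gaze
    # direction vectors (need not be unit length).
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=1, keepdims=True)
    # Clip the per-sample dot product to [-1, 1] to guard against
    # floating-point drift before taking arccos.
    cos_angle = np.clip(np.sum(pred * gt, axis=1), -1.0, 1.0)
    return np.degrees(np.arccos(cos_angle)).mean()

# Illustrative usage: one perfect prediction and one about 5.7 degrees off.
pred = np.array([[0.0, 0.0, -1.0], [0.1, 0.0, -1.0]])
gt = np.array([[0.0, 0.0, -1.0], [0.0, 0.0, -1.0]])
print(f"mean angular error: {mean_angular_error(pred, gt):.2f} deg")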

Figure 8: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/210e/8561667/b958c482c105/jemr-14-04-b-figure-08.jpg
