Leo Marco, Cazzato Dario, De Marco Tommaso, Distante Cosimo
National Research Council of Italy, Institute of Optics, Arnesano, Lecce, Italy.
National Research Council of Italy, Institute of Optics, Arnesano, Lecce, Italy; Faculty of Engineering, University of Salento, Lecce, Italy.
PLoS One. 2014 Aug 14;9(8):e102829. doi: 10.1371/journal.pone.0102829. eCollection 2014.
The automatic detection and tracking of human eyes and, in particular, the precise localization of their centers (pupils) is a widely debated topic in the international scientific community. The extracted information can be used in a large number of applications, ranging from advanced interfaces to biometrics and including the estimation of gaze direction, the control of human attention and the early screening of neurological pathologies. Independently of the application domain, the detection and tracking of the eye centers are currently performed mainly using invasive devices. Cheaper and more versatile systems have been introduced only recently: they use image processing techniques that work on periocular patches, which can be specifically acquired or preliminarily cropped from facial images. In the latter case, the algorithms involved must work even under non-ideal acquisition conditions (e.g., in the presence of noise, low spatial resolution, non-uniform lighting, etc.) and without the user's awareness (thus with possible variations of the eye in scale, rotation and/or translation). Obtaining satisfactory pupil localization results under such challenging operating conditions is still an open scientific topic in Computer Vision. Currently, the best-performing solutions in the literature are, unfortunately, based on supervised machine learning algorithms, which require initial sessions to set the working parameters and to train the embedded learning models of the eye: as a result, experienced operators have to work on the system each time it is moved from one operational context to another. It follows that the use of unsupervised approaches is increasingly desirable but, unfortunately, their performance is not yet satisfactory and further investigation is required.
To this end, this paper proposes a new unsupervised approach to automatically detect the center of the eye: its algorithmic core is a representation of the eye's shape obtained through a differential analysis of image intensities and its subsequent combination with the local variability of appearance, represented by self-similarity coefficients. The effectiveness of the method was demonstrated experimentally on challenging databases containing facial images. Moreover, its ability to accurately detect the centers of the eyes compared favourably with that of the leading state-of-the-art methods.