3D Gaze Estimation Using RGB-IR Cameras.

Affiliation

The Department of Information Systems, University of Haifa, Mount Carmel, Haifa 3498838, Israel.

Publication information

Sensors (Basel). 2022 Dec 29;23(1):381. doi: 10.3390/s23010381.

Abstract

In this paper, we present a framework for 3D gaze estimation intended to identify the user's focus of attention in a corneal imaging system. The framework uses a headset with three cameras: a scene camera and two eye cameras, one IR and one RGB. The IR camera continuously and reliably tracks the pupil, and the RGB camera acquires corneal images of the same eye. Deep learning algorithms are trained to detect the pupil in IR and RGB images and to compute a per-user 3D model of the eye in real time. Once the 3D model is built, the 3D gaze direction is computed as the ray that starts at the eyeball center and passes through the pupil center into the outside world. The model can also be used to transform the pupil position detected in the IR image into its corresponding position in the RGB image and to detect the gaze direction in the corneal image. This technique circumvents the problem of pupil detection in RGB images, which is especially difficult and unreliable when the scene is reflected in the corneal images. In our approach, the auto-calibration process is transparent and unobtrusive: users do not have to be instructed to look at specific objects to calibrate the eye tracker, and need only act and gaze normally. The framework was evaluated in a user study in realistic settings, and the results are promising. It achieved a low 3D gaze error (2.12°) and high accuracy in acquiring corneal images (intersection over union, IoU = 0.71). The framework may be used in a variety of real-world mobile scenarios (indoors, indoors near windows, and outdoors) with high accuracy.
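
The two quantities reported in the abstract, angular gaze error and IoU, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the coordinate convention, and the use of axis-aligned boxes for IoU are assumptions made here for illustration; the abstract does not say how regions are represented.

```python
import math


def gaze_direction(eyeball_center, pupil_center):
    """Unit vector from the eyeball center through the pupil center.

    Both arguments are 3D points (x, y, z) in the same coordinate frame
    (assumed here; the abstract does not specify the frame).
    """
    d = [p - c for p, c in zip(pupil_center, eyeball_center)]
    norm = math.sqrt(sum(x * x for x in d))
    if norm == 0.0:
        raise ValueError("pupil center coincides with eyeball center")
    return [x / norm for x in d]


def angular_error_deg(u, v):
    """Angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))


def box_iou(a, b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max).

    Axis-aligned boxes are an illustrative assumption.
    """
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical values in millimetres, chosen only to exercise the code.
    estimated = gaze_direction((0.0, 0.0, 0.0), (0.5, 0.0, 12.0))
    reference = gaze_direction((0.0, 0.0, 0.0), (0.0, 0.0, 12.0))
    print(angular_error_deg(estimated, reference))
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

In this reading, a gaze error such as 2.12° is the angle between an estimated and a reference gaze ray, and an IoU such as 0.71 is the overlap ratio between a predicted and a reference image region.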

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ece/9823916/c0e0c665af3b/sensors-23-00381-g001.jpg
