IEEE Trans Neural Syst Rehabil Eng. 2023;31:4306-4317. doi: 10.1109/TNSRE.2023.3328888. Epub 2023 Nov 3.
Robots capable of robust, real-time recognition of human intent during manipulation tasks could be used to enhance human-robot collaboration for activities of daily living. Eye gaze-based control interfaces offer a non-invasive way to infer intent and reduce the cognitive burden on operators of complex robots. Eye gaze is traditionally used for "gaze triggering" (GT) in which staring at an object, or sequence of objects, triggers pre-programmed robotic movements. We propose an alternative approach: a neural network-based "action prediction" (AP) mode that extracts gaze-related features to recognize, and often predict, an operator's intended action primitives. We integrated the AP mode into a shared autonomy framework capable of 3D gaze reconstruction, real-time intent inference, object localization, obstacle avoidance, and dynamic trajectory planning. Using this framework, we conducted a user study to directly compare the performance of the GT and AP modes using traditional subjective performance metrics, such as Likert scales, as well as novel objective performance metrics, such as the delay of recognition. Statistical analyses suggested that the AP mode resulted in more seamless robotic movement than the state-of-the-art GT mode, and that participants generally preferred the AP mode.
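To make the "gaze triggering" (GT) baseline concrete, the following is a minimal sketch of a dwell-based trigger: a command fires once gaze has rested on the same object for a fixed time. It is an illustration only, not the authors' implementation. The function name `gaze_trigger`, the `(timestamp, object)` sample format, and the 1.0 s dwell threshold are all assumptions introduced here; the abstract does not specify any of them.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical sample format: (timestamp in seconds, label of the gazed object or None).
GazeSample = Tuple[float, Optional[str]]


def gaze_trigger(samples: Iterable[GazeSample],
                 dwell_s: float = 1.0) -> Optional[Tuple[float, str]]:
    """Return (time, object) of the first dwell-based trigger, or None.

    A trigger fires when consecutive samples stay on the same object
    for at least `dwell_s` seconds (an assumed threshold).
    """
    current: Optional[str] = None  # object currently being fixated
    start = 0.0                    # time at which that fixation began
    for t, obj in samples:
        if obj != current:
            # Gaze moved to a different object (or off all objects): restart the timer.
            current, start = obj, t
        elif obj is not None and t - start >= dwell_s:
            return t, obj
    return None


if __name__ == "__main__":
    stream = [(0.0, "cup"), (0.5, "cup"), (0.8, None),
              (1.0, "spoon"), (1.5, "spoon"), (2.0, "spoon")]
    print(gaze_trigger(stream))
```

In this sketch the robot can only start moving after the full dwell time has elapsed, which is the kind of recognition delay that the AP mode is proposed to reduce by predicting the intended action primitive from gaze-related features.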