Alireza Haji Fathaliyan, Xiaoyu Wang, Veronica J. Santos
Biomechatronics Laboratory, Mechanical and Aerospace Engineering, University of California, Los Angeles, Los Angeles, CA, United States.
Front Robot AI. 2018 Apr 4;5:25. doi: 10.3389/frobt.2018.00025. eCollection 2018.
Human-robot collaboration could be advanced by facilitating the intuitive, gaze-based control of robots, and enabling robots to recognize human actions, infer human intent, and plan actions that support human goals. Traditionally, gaze tracking approaches to action recognition have relied upon computer vision-based analyses of two-dimensional egocentric camera videos. The objective of this study was to identify useful features that can be extracted from three-dimensional (3D) gaze behavior and used as inputs to machine learning algorithms for human action recognition. We investigated human gaze behavior and gaze-object interactions in 3D during the performance of a bimanual, instrumental activity of daily living: the preparation of a powdered drink. A marker-based motion capture system and binocular eye tracker were used to reconstruct 3D gaze vectors and their intersection with 3D point clouds of objects being manipulated. Statistical analyses of gaze fixation duration and saccade size suggested that some actions (pouring and stirring) may require more visual attention than other actions (reach, pick up, set down, and move). 3D gaze saliency maps, generated with high spatial resolution for six subtasks, appeared to encode action-relevant information. The "gaze object sequence" was used to capture information about the identity of objects in concert with the temporal sequence in which the objects were visually regarded. Dynamic time warping barycentric averaging was used to create a population-based set of characteristic gaze object sequences that accounted for intra- and inter-subject variability. The gaze object sequence was used to demonstrate the feasibility of a simple action recognition algorithm that utilized a dynamic time warping Euclidean distance metric. Averaged over the six subtasks, the action recognition algorithm yielded an accuracy of 96.4%, precision of 89.5%, and recall of 89.2%. This level of performance suggests that the gaze object sequence is a promising feature for action recognition whose impact could be enhanced through the use of sophisticated machine learning classifiers and algorithmic improvements for real-time implementation. Robots capable of robust, real-time recognition of human actions during manipulation tasks could be used to improve quality of life in the home and quality of work in industrial environments.
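The following minimal sketch (Python, assuming NumPy) illustrates the kind of DTW-based nearest-neighbor matching the abstract describes: a query gaze object sequence, encoded as a series of object identifiers, is assigned the label of the closest population-averaged template. The object IDs, template sequences, and subtask labels here are hypothetical stand-ins; the paper's DBA-derived templates and preprocessing are not reproduced.

import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """DTW distance between two 1-D sequences with a Euclidean
    (absolute-difference) local cost, per the abstract's metric."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(float(a[i - 1]) - float(b[j - 1]))
            D[i, j] = cost + min(D[i - 1, j],       # insertion
                                 D[i, j - 1],       # deletion
                                 D[i - 1, j - 1])   # match
    return float(D[n, m])

def classify(query: np.ndarray, templates: dict) -> str:
    """1-nearest-neighbor: return the label of the characteristic
    template sequence closest to the query under DTW distance."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))

# Hypothetical example: object IDs 0 = pitcher, 1 = cup, 2 = spoon,
# sampled per gaze fixation; the templates stand in for DBA-averaged
# characteristic gaze object sequences for two subtasks.
templates = {
    "pour": np.array([0, 0, 1, 1, 1]),
    "stir": np.array([2, 1, 2, 2, 1]),
}
query = np.array([0, 1, 1, 1])
print(classify(query, templates))  # -> "pour"

DTW, rather than a fixed point-by-point distance, lets sequences of different lengths and speeds be compared along an optimal alignment, which is what allows a single characteristic template per subtask to absorb intra- and inter-subject timing variability.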