Hand-Priming in Object Localization for Assistive Egocentric Vision.

Authors

Kyungjun Lee, Abhinav Shrivastava, Hernisa Kacorri

Affiliation

University of Maryland, College Park.

Publication

IEEE Winter Conf Appl Comput Vis. 2020 Mar;2020:3411-3421. doi: 10.1109/wacv45572.2020.9093353. Epub 2020 May 14.

Abstract

Egocentric vision holds great promise for increasing access to visual information and improving the quality of life for people with visual impairments, with object recognition being one of the daily challenges for this population. While we strive to improve recognition performance, it remains difficult to identify which object is of interest to the user; the object may not even be included in the frame due to challenges in camera aiming without visual feedback. Also, gaze information, commonly used to infer the area of interest in egocentric vision, is often not dependable. However, blind users often tend to include their hand in the frame, either interacting with the object that they wish to recognize or simply placing it in proximity for better camera aiming. We propose localization models that leverage the presence of the hand as contextual information for priming the center area of the object of interest. In our approach, hand segmentation is fed to either the entire localization network or its last convolutional layers. Using egocentric datasets from sighted and blind individuals, we show that hand-priming achieves higher precision than other approaches, such as fine-tuning, multi-class, and multi-task learning, which also encode hand-object interactions in localization.
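The abstract describes two fusion points for the hand cue: the segmentation mask is fed either to the entire localization network or only to its last convolutional layers. Below is a minimal PyTorch sketch of that idea, not the authors' implementation: the class name HandPrimedLocalizer, the ResNet-18 backbone, and the single-box regression head are illustrative assumptions (the paper builds on a region-proposal detector), and only the channel-concatenation of the hand mask at the input or at the last conv features reflects the described technique.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class HandPrimedLocalizer(nn.Module):
    """Toy localizer primed by a hand segmentation mask (hypothetical sketch)."""

    def __init__(self, prime_at="input"):
        super().__init__()
        assert prime_at in ("input", "last_conv")
        self.prime_at = prime_at
        backbone = models.resnet18(weights=None)
        if prime_at == "input":
            # Early priming: accept RGB + 1-channel hand mask (4 channels),
            # so every layer of the network can condition on the hand.
            backbone.conv1 = nn.Conv2d(4, 64, kernel_size=7, stride=2,
                                       padding=3, bias=False)
        # Keep everything up to (not including) global pooling and the classifier.
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        feat_dim = 512 + (1 if prime_at == "last_conv" else 0)
        # Illustrative head regressing one (cx, cy, w, h) box for the
        # object of interest; the paper's detector head would go here.
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(feat_dim, 4))

    def forward(self, image, hand_mask):
        # image: (B, 3, H, W); hand_mask: (B, 1, H, W), e.g. from a
        # separate hand-segmentation network.
        if self.prime_at == "input":
            x = self.features(torch.cat([image, hand_mask], dim=1))
        else:
            # Late priming: fuse the mask with the last conv feature maps only.
            x = self.features(image)
            mask = F.interpolate(hand_mask, size=x.shape[-2:], mode="nearest")
            x = torch.cat([x, mask], dim=1)
        return self.head(x)


if __name__ == "__main__":
    model = HandPrimedLocalizer(prime_at="last_conv")
    img = torch.randn(2, 3, 224, 224)
    mask = torch.rand(2, 1, 224, 224)
    print(model(img, mask).shape)  # torch.Size([2, 4])
```

The trade-off between the two variants: input-level priming lets all layers see the hand, while last-conv fusion leaves the feature extractor untouched and only primes the localization stage.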

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/106c/7423407/8c9d3573047f/nihms-1609047-f0001.jpg

Similar Articles

1. Hand-Priming in Object Localization for Assistive Egocentric Vision. IEEE Winter Conf Appl Comput Vis. 2020 Mar;2020:3411-3421. doi: 10.1109/wacv45572.2020.9093353. Epub 2020 May 14.
2. Leveraging Hand-Object Interactions in Assistive Egocentric Vision. IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):6820-6831. doi: 10.1109/TPAMI.2021.3123303. Epub 2023 May 5.
3. Hands Holding Clues for Object Recognition in Teachable Machines. Proc SIGCHI Conf Hum Factor Comput Syst. 2019 May;2019. doi: 10.1145/3290605.3300566.
4. The Last Meter: Blind Visual Guidance to a Target. Proc SIGCHI Conf Hum Factor Comput Syst. 2014 Apr-May;2014:3113-3122. doi: 10.1145/2556288.2557328.
5. Learning to Recognize Actions on Objects in Egocentric Video With Attention Dictionaries. IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):6674-6687. doi: 10.1109/TPAMI.2021.3058649. Epub 2023 May 5.
6. Impacts of Image Obfuscation on Fine-grained Activity Recognition in Egocentric Video. Proc IEEE Int Conf Pervasive Comput Commun Workshops. 2022 Mar;2022:341-346. doi: 10.1109/percomworkshops53856.2022.9767447. Epub 2022 May 6.
7. An egocentric vision based assistive co-robot. IEEE Int Conf Rehabil Robot. 2013 Jun;2013:6650473. doi: 10.1109/ICORR.2013.6650473.

Cited By

1. ActiSight: Wearer Foreground Extraction Using a Practical RGB-Thermal Wearable. Proc IEEE Int Conf Pervasive Comput Commun. 2022 Mar;2022:237-246. doi: 10.1109/percom53586.2022.9762385. Epub 2022 Apr 27.
