A Passive Learning Sensor Architecture for Multimodal Image Labeling: An Application for Social Robots.

Author Information

Gutiérrez Marco A, Manso Luis J, Pandya Harit, Núñez Pedro

Affiliations

Robotics and Artificial Vision Laboratory, University of Extremadura, 10003 Cáceres, Spain.

Robotics Research Center, IIIT Hyderabad, 500032 Hyderabad, India.

Publication Information

Sensors (Basel). 2017 Feb 11;17(2):353. doi: 10.3390/s17020353.

Abstract

Object detection and classification have countless applications in human-robot interacting systems. It is a necessary skill for autonomous robots that perform tasks in household scenarios. Despite the great advances in deep learning and computer vision, social robots performing non-trivial tasks usually spend most of their time finding and modeling objects. Working in real scenarios means dealing with constant environment changes and relatively low-quality sensor data due to the distance at which objects are often found. Ambient intelligence systems equipped with different sensors can also benefit from the ability to find objects, enabling them to inform humans about their location. For these applications to succeed, systems need to detect the objects that may potentially contain other objects, working with relatively low-resolution sensor data. A passive learning architecture for sensors has been designed in order to take advantage of multimodal information, obtained using an RGB-D camera and trained semantic language models. The main contribution of the architecture lies in the improvement of the performance of the sensor under conditions of low resolution and high light variations using a combination of image labeling and word semantics. The tests performed on each of the stages of the architecture compare this solution with current research labeling techniques for the application of an autonomous social robot working in an apartment. The results obtained demonstrate that the proposed sensor architecture outperforms state-of-the-art approaches.
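The abstract describes improving low-resolution image labeling by combining visual classifier output with word semantics from trained language models. The sketch below is a minimal, hypothetical illustration of that general idea: re-ranking candidate object labels by blending the classifier's confidence with word-embedding similarity to a scene/context word. The toy embeddings, label names, and the weighting parameter alpha are illustrative assumptions, not the authors' actual architecture or code.

```python
"""Hypothetical sketch: fuse visual label confidences with word-vector
semantics to re-rank candidate object labels (illustrative only)."""
import numpy as np

# Toy word embeddings standing in for a trained semantic language model.
# In practice these would come from a model such as word2vec or GloVe.
EMBEDDINGS = {
    "kitchen":  np.array([0.90, 0.10, 0.00]),
    "mug":      np.array([0.80, 0.20, 0.10]),
    "cup":      np.array([0.85, 0.15, 0.05]),
    "keyboard": np.array([0.10, 0.90, 0.20]),
    "monitor":  np.array([0.15, 0.85, 0.25]),
}

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rerank(visual_scores, context_word, alpha=0.6):
    """Blend visual classifier confidence with semantic similarity to a
    context word (e.g. the room the robot is in). alpha weights the visual cue."""
    ctx = EMBEDDINGS[context_word]
    fused = {}
    for label, score in visual_scores.items():
        sem = cosine(EMBEDDINGS[label], ctx)
        fused[label] = alpha * score + (1.0 - alpha) * sem
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    # Low-resolution image: the classifier is unsure between "mug" and "keyboard".
    visual_scores = {"mug": 0.41, "keyboard": 0.39, "monitor": 0.20}
    # Knowing the scene is a kitchen pushes "mug" to the top of the ranking.
    print(rerank(visual_scores, context_word="kitchen"))
```

Under these assumptions, the semantic term compensates for ambiguous visual evidence: a label that is both visually plausible and semantically consistent with the scene rises in the ranking.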

