Toprak Sibel, Navarro-Guerrero Nicolás, Wermter Stefan
Knowledge Technology, Department of Informatics, Universität Hamburg, Vogt-Kölln-Str. 30, 22527 Hamburg, Germany.
Cognit Comput. 2018;10(3):408-425. doi: 10.1007/s12559-017-9536-7. Epub 2017 Dec 28.
In computational systems for visuo-haptic object recognition, vision and haptics are often modeled as separate processes. This is far from what happens in the human brain, where crossmodal as well as multimodal interactions take place between the two sensory modalities. Three main principles can be identified as underlying the processing of visual and haptic object-related stimuli in the brain: (1) hierarchical processing, (2) the divergence of processing into substreams for object shape and material perception, and (3) the experience-driven self-organization of the integratory neural circuits. The question arises whether an object recognition system can improve its performance by adopting these brain-inspired processing principles for the integration of visual and haptic inputs. To address this, we compare an integration strategy that incorporates all three principles to the two integration strategies commonly used in the literature. We collected data with a NAO robot enhanced with inexpensive contact microphones as tactile sensors. The results of our experiments involving everyday objects indicate that (1) contact microphones are a good alternative for capturing tactile information and that (2) organizing the processing of the visual and haptic inputs hierarchically and in two pre-processing streams improves performance. Nevertheless, further research is needed to quantify the role of each identified principle by itself as well as in combination with the others.