Lessmann Markus, Würtz Rolf P
Institute for Neural Computation, Ruhr-University Bochum, Germany.
Neural Netw. 2014 Jun;54:70-84. doi: 10.1016/j.neunet.2014.02.011. Epub 2014 Mar 12.
Invariant object recognition, the recognition of object categories independent of conditions such as viewing angle, scale and illumination, is a task of great interest that humans still perform much better than artificial systems. In recent years several basic principles have been derived from neurophysiological observations and careful consideration: (1) Developing invariance to possible transformations of an object by learning temporal sequences of the visual features that occur during those transformations. (2) Learning in a hierarchical structure, so that basic-level (visual) knowledge can be reused for different kinds of objects. (3) Using feedback to compare the predicted input with the current one, in order to choose an interpretation when signals are ambiguous. In this paper we propose a network that implements all of these concepts in a computationally efficient manner and achieves very good results on standard object datasets. Dynamically switching off weakly active neurons and pruning weights speeds up computation, which makes it possible to handle large databases with several thousand images and a number of categories of a similar order of magnitude. The parameters involved allow flexible adaptation to the information content of the training data and make tuning to different databases relatively easy. A precondition for successful learning is that training images are presented in an order which ensures that images of the same object under similar viewing conditions follow each other. Because it is implemented with sparse data structures, the system has moderate memory demands while still yielding very good recognition rates.
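The abstract does not specify how weakly active neurons are switched off or how weights are pruned, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes non-negative activities stored as index-to-value dictionaries, a relative activity cutoff, and an absolute weight cutoff; the names `switch_off_weak`, `prune_weights`, `forward`, `rel_threshold` and `min_weight` are hypothetical and chosen for illustration.

```python
from typing import Dict

# Sparse containers (an assumption for this sketch): only nonzero entries are stored.
SparseVec = Dict[int, float]          # unit index -> activity
SparseWeights = Dict[int, SparseVec]  # output unit -> {input unit -> weight}


def switch_off_weak(activity: SparseVec, rel_threshold: float) -> SparseVec:
    """Keep only units whose activity reaches rel_threshold * max activity.

    Assumes non-negative activities; the cutoff rule is hypothetical.
    """
    if not activity:
        return {}
    cutoff = rel_threshold * max(activity.values())
    return {i: a for i, a in activity.items() if a >= cutoff}


def prune_weights(weights: SparseWeights, min_weight: float) -> SparseWeights:
    """Drop stored weights whose magnitude is below min_weight."""
    pruned: SparseWeights = {}
    for out_unit, row in weights.items():
        kept = {i: w for i, w in row.items() if abs(w) >= min_weight}
        if kept:  # rows that lose all weights are removed entirely
            pruned[out_unit] = kept
    return pruned


def forward(weights: SparseWeights, activity: SparseVec) -> SparseVec:
    """Sparse weighted sum over inputs that are both active and connected."""
    out: SparseVec = {}
    for out_unit, row in weights.items():
        total = sum(w * activity[i] for i, w in row.items() if i in activity)
        if total != 0.0:
            out[out_unit] = total
    return out
```

In this sketch the cost of `forward` depends on the number of stored weights rather than on the full layer size, which is one plausible way in which switching off units and pruning weights could reduce both computation and memory.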