Hosoda Kenji, Watanabe Masataka, Wersing Heiko, Körner Edgar, Tsujino Hiroshi, Tamura Hiroshi, Fujita Ichiro
Department of Quantum Engineering and Systems Science, University of Tokyo, Tokyo, Japan.
Neural Comput. 2009 Sep;21(9):2605-33. doi: 10.1162/neco.2009.03-08-722.
Object representation in the inferior temporal cortex (IT), an area of visual cortex critical for object recognition in primates, exhibits two prominent properties: (1) objects are represented by the combined activity of columnar clusters of neurons, with each cluster representing component features or parts of objects, and (2) closely related features are represented continuously along the tangential direction of individual columnar clusters. Here we propose a learning model that captures both properties, parts-based representation and topographic organization, in a unified framework. The model is based on nonnegative matrix factorization (NMF), a basis decomposition method. NMF alone provides a parts-based representation in which nonnegative inputs are approximated by additive combinations of nonnegative basis functions. Our proposed model, topographic NMF (TNMF), incorporates neighborhood connections between NMF basis functions arranged on a topographic map and attains the topographic property without losing the parts-based property of NMF. The TNMF represents an input by multiple activity peaks to describe diverse information, whereas conventional topographic models, such as the self-organizing map (SOM), represent an input by a single activity peak in a topographic map. We demonstrate the parts-based and topographic properties of the TNMF by constructing a hierarchical model for object recognition in which the TNMF is at the top tier and learns high-level object features. On a data set of continuous view changes of an image, the TNMF generalized better than NMF and more robustly preserved the continuity of the view change in its object representation. Comparison of the outputs of our model with actual neural responses recorded in the IT indicates that the TNMF reconstructs the neuronal responses better than the SOM, lending plausibility to the parts-based learning of the model.
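As an illustration of the NMF step described in the abstract (nonnegative inputs approximated by additive combinations of nonnegative basis functions), the following is a minimal sketch of plain NMF using the standard multiplicative update rules for the squared reconstruction error. It is not the authors' TNMF: the neighborhood connections between basis functions are not specified in the abstract and are therefore left out. The function name `nmf`, its parameters, and the toy data are assumptions made for this example.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (features x samples) as W @ H.

    W holds nonnegative basis functions (one per column) and H holds the
    nonnegative activations of each basis function for each sample.
    Illustrative sketch of plain NMF only; the topographic coupling of
    TNMF is NOT implemented here.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    # Random nonnegative initialization (an assumption of this sketch).
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative because every
        # factor in the numerators and denominators is nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


if __name__ == "__main__":
    # Toy data: a nonnegative matrix built from 4 hidden nonnegative parts.
    rng = np.random.default_rng(1)
    V = rng.random((20, 4)) @ rng.random((4, 30))
    W, H = nmf(V, rank=4)
    rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
    print(f"relative reconstruction error: {rel_err:.4f}")
```

Because each sample is reconstructed as a sum of several columns of `W`, several entries of the corresponding column of `H` can be active at once. This is the property the abstract contrasts with the SOM, which represents an input by a single activity peak.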