Hosoya Haruo, Hyvärinen Aapo
Computational Neuroscience Laboratories, ATR International, Kyoto 619-0288, Japan, and Presto, Japan Science and Technology Agency, Saitama 332-0012, Japan
Department of Computer Science and HIIT, University of Helsinki, Helsinki 00560, Finland
Neural Comput. 2016 Jul;28(7):1249-64. doi: 10.1162/NECO_a_00843. Epub 2016 May 12.
In visual modeling, invariance properties of visual cells are often explained by a pooling mechanism, in which the outputs of neurons with similar selectivities to some stimulus parameters are integrated so as to gain a degree of invariance to other parameters. For example, the classical energy model of phase-invariant V1 complex cells pools model simple cells preferring similar orientations but different phases. Prior studies, such as independent subspace analysis, have shown that the phase-invariance properties of V1 complex cells can be learned from the spatial statistics of natural inputs. However, those previous approaches assumed a squaring nonlinearity on the neural outputs to capture energy correlation; such a nonlinearity is arguably unnatural from a neurobiological viewpoint, yet hard to change because it is tightly integrated into those formalisms. Moreover, they used somewhat complicated objective functions that require expensive computations to optimize. In this study, we show that visual spatial pooling can be learned in a much simpler way using strong dimension reduction based on principal component analysis. This approach learns to ignore a large part of the detailed spatial structure of the input and thereby estimates a linear pooling matrix. Using this framework, we demonstrate that pooling of model V1 simple cells learned in this way, even with nonlinearities other than squaring, can reproduce standard tuning properties of V1 complex cells. For further understanding, we analyze several variants of the pooling model and argue that a reasonable pooling can generally be obtained from any linear transformation that retains the first several principal components and suppresses the remaining ones. In particular, we show how classic Wiener filtering theory leads to one such variant.
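The core idea of the abstract — apply a (not necessarily squaring) nonlinearity to model simple-cell outputs, then use PCA-based strong dimension reduction to obtain a linear pooling matrix — can be sketched numerically. This is a minimal toy illustration, not the paper's implementation: the simulated responses, the choice of an absolute-value nonlinearity, and the number of retained components `k` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for model simple-cell responses to many stimuli:
# n_samples stimuli, n_cells cells, with some linear mixing to induce
# correlations (real inputs would be natural-image-driven responses).
n_samples, n_cells = 5000, 40
mixing = rng.standard_normal((n_cells, n_cells))
responses = rng.standard_normal((n_samples, n_cells)) @ mixing

# A non-squaring output nonlinearity (the abstract's point is that
# squaring is not required); center the outputs before PCA.
z = np.abs(responses)
z -= z.mean(axis=0)

# Strong dimension reduction: keep only the first k principal
# components. The retained components define a linear pooling matrix.
k = 5
_, _, Vt = np.linalg.svd(z, full_matrices=False)
pooling_matrix = Vt[:k]            # shape (k, n_cells): pooling weights

# Pooled "complex-cell-like" outputs: each unit linearly integrates
# many simple-cell outputs, discarding fine spatial detail carried by
# the suppressed components.
pooled = z @ pooling_matrix.T      # shape (n_samples, k)
```

The abstract's broader claim — that any linear transformation retaining the first few principal components while suppressing the rest yields a reasonable pooling — corresponds here to replacing the hard cutoff `Vt[:k]` with any matrix whose row space is dominated by those leading components.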