Nagano Station, Japan Broadcasting Corporation, 210-2, Inaba, Nagano-City, 380-8502, Japan.
Neural Netw. 2011 Mar;24(2):148-58. doi: 10.1016/j.neunet.2010.10.004. Epub 2010 Oct 27.
We propose a two-stage learning method that implements occluded visual scene analysis in a generative model, a type of hierarchical neural network with bi-directional synaptic connections. Here, top-down connections simulate forward optics to generate predictions for the sensory-driven low-level representation, whereas bottom-up connections send the prediction error, the difference between the sensory-based and the predicted low-level representation, to higher areas. The prediction error is then used to update the high-level representation to obtain better agreement with the visual scene. Although actual forward optics is highly nonlinear, and the accuracy of the simulated forward optics is crucial for these types of models, the majority of previous studies have investigated only linear, simplified cases of forward optics. Here we take occluded vision as an example of nonlinear forward optics, in which an object in front completely masks out the object behind it. Our two-stage learning method is inspired by the staged development of infant visual capacity. In the primary learning stage, a minimal set of object bases is acquired within a linear generative model using a conventional unsupervised learning scheme. In the secondary learning stage, an auxiliary multi-layer neural network is trained by supervised learning to acquire the nonlinear forward optics. The important point is that the high-level representation of the linear generative model serves as the input, and the sensory-driven low-level representation provides the desired output. Numerical simulations show that occluded visual scene analysis can indeed be implemented by the proposed method.
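The error-driven inference described above can be sketched minimally in NumPy, assuming a linear generative model image ≈ W @ r. The basis matrix W, the dimensions, and the learning rate here are illustrative assumptions, not the paper's actual parameters; in the paper the bases come from unsupervised learning.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 16, 4                       # pixels, number of object bases (assumed)
W = rng.normal(size=(D, K))        # stand-in for the learned object basis

def infer(image, W, steps=500, lr=0.02):
    """Top-down connections predict the low-level representation (W @ r);
    bottom-up connections carry the prediction error, which is used to
    update the high-level representation r by gradient descent."""
    r = np.zeros(W.shape[1])
    for _ in range(steps):
        error = image - W @ r      # sensory input minus top-down prediction
        r += lr * (W.T @ error)    # error-driven update of the high-level r
    return r

r_true = rng.normal(size=K)
image = W @ r_true                 # a scene the linear model can generate
r_hat = infer(image, W)            # recovered high-level representation
```

For a scene generated by the model itself, the inferred representation reproduces the input, i.e. W @ r_hat closely matches image.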
Furthermore, consideration of the input format to the multi-layer network and analysis of the hidden-layer units lead to the prediction that whole-object representations of partially occluded objects, together with complex intermediate representations arising from the nonlinear transformation from non-occluded to occluded representations, may exist in the low-level visual system of the brain.
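The nonlinearity of the occlusion forward optics can be illustrated with a toy one-dimensional scene, where the front object completely masks the object behind it. The function name and the example arrays are hypothetical, chosen only to make the masking rule concrete.

```python
import numpy as np

def occlude(front, back):
    """Nonlinear forward optics for occlusion: pixels are taken from
    the front object wherever it is present (nonzero); elsewhere the
    back object shows through."""
    return np.where(front != 0, front, back)

front = np.array([0, 0, 5, 5, 0])   # front object occupies two pixels
back  = np.array([3, 3, 3, 3, 3])   # back object spans the whole scene
scene = occlude(front, back)
```

Here scene differs from the linear superposition front + back, which is precisely why a linear generative model cannot capture occlusion and an auxiliary nonlinear network is needed.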