Isbister James B, Eguchi Akihiro, Ahmad Nasir, Galeazzi Juan M, Buckley Mark J, Stringer Simon
Oxford Centre for Theoretical Neuroscience and Artificial Intelligence, University of Oxford, Oxford OX2 6GG, UK.
Oxford Brain and Behaviour Group, Department of Experimental Psychology, University of Oxford, Oxford OX2 6GG, UK.
Interface Focus. 2018 Aug 6;8(4):20180021. doi: 10.1098/rsfs.2018.0021. Epub 2018 Jun 15.
We discuss a recently proposed approach to solving the classic feature-binding problem in primate vision that uses neural dynamics known to be present within the visual cortex. Broadly, the feature-binding problem in the visual context concerns not only how a hierarchy of features such as edges and objects within a scene is represented, but also how the hierarchical relationships between these features are represented at every spatial scale across the visual field. This is necessary for the visual brain to be able to make sense of its visuospatial world. Solving this problem is an important step towards the development of artificial general intelligence. In neural network simulation studies, it has been found that neurons encoding the binding relations between visual features, known as binding neurons, emerge during visual training when key properties of the visual cortex are incorporated into the models. These biological network properties include (i) bottom-up, lateral and top-down synaptic connections, (ii) spiking neuronal dynamics, (iii) spike timing-dependent plasticity, and (iv) a random distribution of axonal transmission delays (of the order of several milliseconds) in the propagation of spikes between neurons. After training the network on a set of visual stimuli, modelling studies have reported observing the gradual emergence of polychronization through successive layers of the network, in which subpopulations of neurons have learned to emit their spikes in regularly repeating spatio-temporal patterns in response to specific visual stimuli. Such a subpopulation of neurons is known as a polychronous neuronal group (PNG). Some neurons embedded within these PNGs receive convergent inputs from neurons representing lower- and higher-level visual features, and thus appear to encode the hierarchical binding relationship between features.
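As an illustration of properties (iii) and (iv), the following minimal Python sketch shows how fixed axonal delays can turn a staggered firing pattern into coincident input at a target neuron, and how a pair-based spike timing-dependent plasticity rule would then adjust a synaptic weight. It is not the model used in the studies discussed here: the function names, the 1 ms coincidence window, and the learning-rate and time-constant values are all assumptions chosen for illustration.

```python
import math


def arrival_times(spike_times_ms, delays_ms):
    """Arrival time at the target = emission time + axonal delay (ms)."""
    return [t + d for t, d in zip(spike_times_ms, delays_ms)]


def coincident(arrivals_ms, window_ms=1.0):
    """True if all arrivals fall within one coincidence window.

    The 1 ms window is an illustrative assumption.
    """
    return max(arrivals_ms) - min(arrivals_ms) <= window_ms


def stdp_dw(t_pre_arrival_ms, t_post_ms,
            a_plus=0.01, a_minus=0.012, tau_ms=20.0):
    """Pair-based STDP weight change (all parameter values are assumed).

    Potentiate when the presynaptic spike arrives at or before the
    postsynaptic spike, depress when it arrives afterwards.
    """
    dt = t_post_ms - t_pre_arrival_ms
    if dt >= 0:
        return a_plus * math.exp(-dt / tau_ms)
    return -a_minus * math.exp(dt / tau_ms)


# Three presynaptic neurons with different axonal delays to one target.
delays = [6.0, 4.0, 1.0]

# A staggered firing pattern that matches the delays arrives together...
staggered = arrival_times([0.0, 2.0, 5.0], delays)
# ...whereas synchronous firing arrives spread out in time.
synchronous = arrival_times([0.0, 0.0, 0.0], delays)

print(staggered, coincident(staggered))
print(synchronous, coincident(synchronous))
```

In this sketch only the time-locked, non-synchronous pattern drives the target neuron coincidentally, which is the basic reason a distribution of delays allows a network to become selective for specific spatio-temporal spike patterns.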
Neural activity with this kind of spatio-temporal structure robustly emerges in the higher network layers even when neurons in the input layer represent visual stimuli with spike timings that are randomized according to a Poisson distribution. The resulting hierarchical representation of visual scenes in such models, including the representation of hierarchical binding relations between lower- and higher-level visual features, is consistent with the hierarchical phenomenology or subjective experience of primate vision and is distinct from approaches that aim to segment a visual scene into a finite set of objects.
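The randomized input described above can be sketched as follows, assuming each input neuron is modelled as a homogeneous Poisson process whose firing rate encodes the stimulus. The function name, the 50 Hz rate and the 10 s duration are illustrative assumptions and are not taken from the studies discussed.

```python
import random


def poisson_spike_train(rate_hz, duration_ms, rng):
    """Spike times (ms) drawn from a homogeneous Poisson process.

    Inter-spike intervals of a Poisson process are exponentially
    distributed, so spike times are built by summing such intervals.
    """
    if rate_hz <= 0:
        return []
    rate_per_ms = rate_hz / 1000.0
    spikes = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_per_ms)
        if t >= duration_ms:
            return spikes
        spikes.append(t)


# Seeded generator so the example is reproducible.
rng = random.Random(0)
train = poisson_spike_train(rate_hz=50.0, duration_ms=10000.0, rng=rng)
print(len(train), "spikes in 10 s at a nominal 50 Hz")
```

Spike trains generated this way carry the stimulus only in their rate, with no precise timing structure, so any regular spatio-temporal patterns in higher layers would have to arise from the network dynamics rather than from the input.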