Neural Netw. 2013 Oct;46:62-74. doi: 10.1016/j.neunet.2013.04.010. Epub 2013 May 6.
This paper proposes a neuronal circuitry layout and synaptic plasticity principles that allow a (pyramidal) neuron to act as a "combinatorial switch". Namely, the neuron learns to spike more readily in response to those combinations of firing input neurons for which its previous spiking was followed by a positive global reward signal. The reward signal may be mediated by certain modulatory hormones or neurotransmitters, e.g., dopamine. More generally, a trial-and-error learning paradigm is suggested in which a global reward signal triggers long-term enhancement or weakening of a neuron's spiking response to the preceding neuronal input firing pattern. Thus, rewards provide a feedback pathway that informs neurons whether their spiking was beneficial or detrimental for a particular input combination. The neuron's ability to discern specific combinations of firing input neurons is achieved through a random or predetermined spatial distribution of input synapses on dendrites, which creates synaptic clusters that represent various permutations of input neurons. The corresponding dendritic segments, or the individual spines they enclose, can be strongly excited, owing to local sigmoidal thresholding involving voltage-gated channel conductances, when the segment's excitatory inputs coincide in time with an absence of inhibitory inputs. Such nonlinear excitation corresponds to a particular firing combination of input neurons, and it is posited that the excitation strength encodes the combinatorial memory and is regulated by long-term plasticity mechanisms. It is also suggested that the spine calcium influx that may result from spatiotemporal coincidence of synaptic inputs may cause the actin filaments in the spine head to undergo mechanical (muscle-like) contraction, with the ensuing cytoskeletal deformation transmitted to the axon initial segment, where it may modulate the neuron's global firing threshold. The tasks of pattern classification and generalization are discussed within the presented framework.
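The learning rule described above can be illustrated with a toy simulation. The following is only a minimal sketch under strong simplifying assumptions and is not the model from the paper: the names `Cluster`, `SwitchNeuron`, and `train`, the pairwise cluster layout, the sigmoid gain and threshold, the learning rate, and the linear summation of cluster outputs into a spike probability are all hypothetical choices made for illustration. Each cluster responds strongly only when all of its excitatory inputs fire and none of its inhibitory inputs do, and a reward that follows a spike strengthens or weakens the clusters that were active.

```python
import itertools
import math
import random


def sigmoid(x, gain, threshold):
    """Smooth threshold standing in for the local dendritic nonlinearity."""
    return 1.0 / (1.0 + math.exp(-gain * (x - threshold)))


class Cluster:
    """A synaptic cluster tuned to one combination of input neurons (illustrative)."""

    def __init__(self, excitatory, inhibitory=()):
        self.excitatory = frozenset(excitatory)
        self.inhibitory = frozenset(inhibitory)
        self.strength = 0.5  # plastic excitation strength, kept in [0, 1]

    def activation(self, firing):
        """Close to 1 only if all excitatory inputs fire and no inhibitory one does."""
        firing = frozenset(firing)
        if self.inhibitory & firing:
            return 0.0  # any active inhibitory input vetoes the cluster
        fraction = len(self.excitatory & firing) / len(self.excitatory)
        # Assumed gain and threshold: partial coincidence yields almost no response.
        return sigmoid(fraction, gain=20.0, threshold=0.9)


class SwitchNeuron:
    """Neuron with one cluster per combination of `cluster_size` inputs (assumed layout)."""

    def __init__(self, n_inputs, cluster_size=2, seed=0):
        combos = itertools.combinations(range(n_inputs), cluster_size)
        self.clusters = [Cluster(c) for c in combos]
        self.rng = random.Random(seed)

    def spike_probability(self, firing):
        """Strength-weighted sum of cluster activations, capped at 1."""
        drive = sum(c.strength * c.activation(firing) for c in self.clusters)
        return min(1.0, drive)

    def trial(self, firing, reward_if_spike, rate=0.1):
        """Present one input pattern; if the neuron spikes, apply the global reward."""
        spiked = self.rng.random() < self.spike_probability(firing)
        if spiked:
            for c in self.clusters:
                # Only clusters that were active during the spike change noticeably.
                delta = rate * reward_if_spike * c.activation(firing)
                c.strength = min(1.0, max(0.0, c.strength + delta))
        return spiked


def train(neuron, rewarded, punished, trials=200):
    """Alternate a rewarded and a punished input combination (trial and error)."""
    for _ in range(trials):
        neuron.trial(rewarded, reward_if_spike=+1.0)
        neuron.trial(punished, reward_if_spike=-1.0)


if __name__ == "__main__":
    neuron = SwitchNeuron(n_inputs=4)
    train(neuron, rewarded={0, 1}, punished={2, 3})
    print("P(spike | inputs 0 and 1):", neuron.spike_probability({0, 1}))
    print("P(spike | inputs 2 and 3):", neuron.spike_probability({2, 3}))
```

In this sketch, no update occurs on trials where the neuron stays silent, which mirrors the idea that the reward evaluates a spike that has already happened. The cytoskeletal mechanism for modulating the firing threshold is not modeled.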