Loewenstein Yonatan, Seung H Sebastian
Howard Hughes Medical Institute and the Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
Proc Natl Acad Sci U S A. 2006 Oct 10;103(41):15224-9. doi: 10.1073/pnas.0505220103. Epub 2006 Sep 28.
The probability of choosing an alternative in a long sequence of repeated choices is proportional to the total reward derived from that alternative, a phenomenon known as Herrnstein's matching law. This behavior is remarkably conserved across species and experimental conditions, but its underlying neural mechanisms are still unknown. Here, we propose a neural explanation of this empirical law of behavior. We hypothesize that there are forms of synaptic plasticity driven by the covariance between reward and neural activity and prove mathematically that matching is a generic outcome of such plasticity. Two hypothetical types of synaptic plasticity, embedded in decision-making neural network models, are shown to yield matching behavior in numerical simulations, in accord with our general theorem. We show how this class of models can be tested experimentally by making reward not only contingent on the choices of the subject but also directly contingent on fluctuations in neural activity. Maximization is shown to be a generic outcome of synaptic plasticity driven by the sum of the covariances between reward and all past neural activities.
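The link between covariance-driven plasticity and matching can be illustrated with a toy simulation. The sketch below is not one of the network models from the paper. It assumes a single scalar "synaptic" weight w, a logistic readout for the choice probability, a binary activity variable N (1 when alternative A is chosen, 0 otherwise), and a baited two-alternative reward schedule. The update eta * R * (N - p_A) has an expected value proportional to the covariance between reward R and activity N, so the weight stops drifting once that covariance vanishes. The function name, parameter names, and default values are all hypothetical.

```python
import math
import random


def simulate_matching(bait_a=0.3, bait_b=0.1, eta=0.05,
                      n_trials=60000, burn_in=10000, seed=0):
    """Toy covariance-rule learner on a baited two-alternative schedule.

    Returns (choice_fraction_A, income_fraction_A) measured after burn_in.
    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    w = 0.0                      # single plastic weight (assumed readout)
    baited = [False, False]      # whether a reward is waiting at A / B
    bait_prob = (bait_a, bait_b)
    choices_a = 0
    income = [0.0, 0.0]

    for t in range(n_trials):
        # An empty alternative is re-armed with its own probability;
        # an armed reward stays available until that alternative is chosen.
        for i in (0, 1):
            if not baited[i] and rng.random() < bait_prob[i]:
                baited[i] = True

        p_a = 1.0 / (1.0 + math.exp(-w))        # probability of choosing A
        choice = 0 if rng.random() < p_a else 1
        activity = 1.0 if choice == 0 else 0.0  # binary "neural activity" N
        reward = 1.0 if baited[choice] else 0.0
        baited[choice] = False

        # Covariance-style update: E[R * (N - E[N])] = Cov(R, N).
        w += eta * reward * (activity - p_a)

        if t >= burn_in:
            choices_a += 1 if choice == 0 else 0
            income[choice] += reward

    choice_frac = choices_a / (n_trials - burn_in)
    income_frac = income[0] / (income[0] + income[1])
    return choice_frac, income_frac


if __name__ == "__main__":
    choice_frac, income_frac = simulate_matching()
    print(f"fraction of choices to A: {choice_frac:.3f}")
    print(f"fraction of income from A: {income_frac:.3f}")
```

Under these assumptions the two printed fractions should come out close to each other, which is the matching relation: the share of choices allocated to an alternative tracks the share of total reward earned from it. Zero covariance between R and N means the average reward per choice is equal for A and B, and that equality is equivalent to matching.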