Department of Psychology, The Ohio State University, Columbus, OH 43210, USA.
Neural Comput. 2012 May;24(5):1186-229. doi: 10.1162/NECO_a_00270. Epub 2012 Feb 1.
In this letter, we examine the computational mechanisms of reinforcement-based decision making. We bridge the gap across multiple levels of analysis, from neural models of corticostriatal circuits (the basal ganglia (BG) model; Frank, 2005, 2006) to simpler but mathematically tractable diffusion models of two-choice decision making. Specifically, we generated simulated data from the BG model and fit the diffusion model (Ratcliff, 1978) to it. The standard diffusion model fits underestimated response times under conditions of high response and reinforcement conflict. Follow-up fits captured the data well both by increasing nondecision time and by raising decision thresholds as a function of conflict, and by allowing this threshold to collapse with time. This profile captures the role and dynamics of the subthalamic nucleus in BG circuitry, and as such, parametric modulations of projection strengths from this nucleus were associated with parametric increases in decision boundary and its modulation by conflict. We then present data from a human reinforcement learning experiment involving decisions with low- and high-reinforcement conflict. Again, the standard model failed to fit the data, but we found that two variants similar to those that fit the BG model data fit the experimental data, thereby providing a convergence of theoretical accounts of complex interactive decision-making mechanisms consistent with available data. This work also demonstrates how to make modest modifications to diffusion models to summarize core computations of the BG model. The result is a better fit and understanding of reinforcement-based choice data than that which would have occurred with either model alone.
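To make the diffusion-model ingredients named in the abstract concrete (drift, decision threshold, nondecision time, and a threshold that collapses with time), here is a minimal simulation sketch. It is not the authors' fitting procedure. The symmetric bounds at plus and minus b(t), the exponential collapse form b(t) = threshold * exp(-collapse_rate * t), and every parameter name and value are assumptions chosen for illustration.

```python
import math
import random


def simulate_trial(drift, threshold, nondecision, collapse_rate=0.0,
                   dt=0.001, noise_sd=1.0, max_time=5.0, rng=random):
    """Simulate one two-choice diffusion trial (illustrative sketch).

    Evidence starts at 0, midway between symmetric bounds at +/- b(t),
    with b(t) = threshold * exp(-collapse_rate * t). The exponential
    form is an assumption; collapse_rate = 0 gives fixed bounds.

    Returns (choice, response_time). choice is 1 for the upper bound,
    0 for the lower bound, or None if no bound is hit by max_time.
    """
    x = 0.0
    t = 0.0
    step_sd = noise_sd * math.sqrt(dt)  # Euler-Maruyama noise scaling
    while t < max_time:
        x += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
        bound = threshold * math.exp(-collapse_rate * t)
        if x >= bound:
            return 1, t + nondecision
        if x <= -bound:
            return 0, t + nondecision
    return None, max_time + nondecision


def mean_rt(n_trials, seed, **params):
    """Mean response time over n_trials with a seeded generator."""
    rng = random.Random(seed)
    rts = [simulate_trial(rng=rng, **params)[1] for _ in range(n_trials)]
    return sum(rts) / len(rts)
```

For example, `mean_rt(200, seed=1, drift=1.0, threshold=1.5, nondecision=0.3)` can be compared against the same call with `threshold=0.5` or with `collapse_rate=2.0`. In this sketch a higher threshold lengthens response times, while a collapsing threshold shortens the slow tail of the distribution. That interaction is the kind of effect the abstract attributes to conflict-dependent thresholds.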