University of Amsterdam, Department of Psychology, Amsterdam, Netherlands.
Leiden University, Department of Psychology, Leiden, Netherlands.
eLife. 2021 Jan 27;10:e63055. doi: 10.7554/eLife.63055.
Learning and decision-making are interactive processes, yet cognitive models of error-driven learning and of decision-making have largely evolved separately. Recently, evidence accumulation models (EAMs) of decision-making and reinforcement learning (RL) models of error-driven learning have been combined into joint RL-EAMs that can, in principle, address these interactions. However, we show that the most commonly used combination, based on the diffusion decision model (DDM) for binary choice, consistently fails to capture crucial aspects of response times observed during reinforcement learning. We propose a new RL-EAM based on an advantage racing diffusion (ARD) framework for choices among two or more options, which not only addresses this problem but also captures effects of stimulus difficulty, speed-accuracy trade-off, and stimulus-response-mapping reversals. The RL-ARD avoids fundamental limitations that the DDM imposes on modeling the effects of the absolute values of choice options and on extensions beyond binary choice, and it provides a computationally tractable basis for wider applications.
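To make the combination concrete, below is a minimal simulation sketch of a joint RL-EAM in the spirit described here: a delta-rule update of learned values feeds the drift rates of racing diffusion accumulators through an advantage-style mapping of the form v_i = V0 + w_d(Q_i - Q_j) + w_s(Q_i + Q_j). All parameter values, function names, and the exact drift mapping are illustrative assumptions for exposition, not the fitted model from the paper.

import numpy as np

rng = np.random.default_rng(0)

def update_q(q, choice, reward, alpha):
    # Delta-rule (Rescorla-Wagner) update of the learned value of the chosen option.
    q = q.copy()
    q[choice] += alpha * (reward - q[choice])
    return q

def advantage_drifts(q, v0=1.0, w_diff=2.0, w_sum=0.5):
    # Hypothetical advantage mapping: baseline urgency plus weighted
    # difference and sum of the two learned values.
    diff = np.array([q[0] - q[1], q[1] - q[0]])
    return v0 + w_diff * diff + w_sum * q.sum()

def race_trial(drifts, threshold=1.5, t0=0.2, dt=0.001, noise=1.0, max_t=5.0):
    # One racing-diffusion trial: independent accumulators drift toward a
    # shared threshold; the first to cross determines choice and response time.
    x = np.zeros(len(drifts))
    t = 0.0
    while t < max_t:
        x += drifts * dt + noise * np.sqrt(dt) * rng.standard_normal(len(drifts))
        t += dt
        crossed = np.where(x >= threshold)[0]
        if crossed.size:
            winner = int(crossed[np.argmax(x[crossed])])
            return winner, t0 + t
    return int(np.argmax(x)), t0 + max_t

# Simulate a short instrumental-learning block with reward probabilities
# 0.8 vs 0.2 for the two options (illustrative values).
q = np.array([0.5, 0.5])
p_reward = np.array([0.8, 0.2])
alpha = 0.1
for trial in range(100):
    choice, rt = race_trial(advantage_drifts(q))
    reward = float(rng.random() < p_reward[choice])
    q = update_q(q, choice, reward, alpha)
    if trial % 20 == 0:
        print(f"trial {trial:3d}  choice {choice}  rt {rt:.3f}s  Q {q.round(2)}")

As learning progresses, the value difference between options grows, the drift of the better option's accumulator increases, and simulated choices become both more accurate and faster, which is the qualitative pattern a joint RL-EAM is meant to capture.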