Department of Psychology, University of Maryland-College Park, College Park, Maryland, United States of America.
Program in Neuroscience and Cognitive Science, University of Maryland-College Park, College Park, Maryland, United States of America.
PLoS Comput Biol. 2022 May 5;18(5):e1010047. doi: 10.1371/journal.pcbi.1010047. eCollection 2022 May.
A large literature has accumulated suggesting that human and animal decision making is driven by at least two systems, and that important functions of these systems can be captured by reinforcement learning algorithms. The "model-free" system caches and uses stimulus-value or stimulus-response associations, and the "model-based" system implements more flexible planning using a model of the world. However, it is not clear how the two systems interact during deliberation and how a single decision emerges from this process, especially when they disagree. Most previous work has assumed that while the systems operate in parallel, they do so independently, and they combine linearly to influence decisions. Using an integrated reinforcement learning/drift-diffusion model, we tested the hypothesis that the two systems interact in a non-linear fashion similar to other situations with cognitive conflict. We differentiated two forms of conflict: action conflict, a binary state representing whether the systems disagreed on the best action, and value conflict, a continuous measure of the extent to which the two systems disagreed on the difference in value between the available options. We found that decisions with greater value conflict were characterized by reduced model-based control and increased caution both with and without action conflict. Action conflict itself (the binary state) acted in the opposite direction, although its effects were less prominent. We also found that between-system conflict was highly correlated with within-system conflict, and although it is less clear a priori why the latter might influence the strength of each system above its standard linear contribution, we could not rule it out. Our work highlights the importance of non-linear conflict effects, and provides new constraints for more detailed process models of decision making. 
It also presents new avenues to explore in relation to disorders of compulsivity, where an imbalance between the two systems has been implicated.
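The two conflict measures described in the abstract can be illustrated with a minimal sketch for a single two-option trial. The abstract gives no formulas, so everything below is an assumption made for illustration: the function name `conflict_measures`, the representation of each system's estimates as a pair of option values, the sign test used for action conflict, and the absolute difference of value differences used for value conflict. It should not be read as the authors' implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Conflict:
    action_conflict: bool   # binary: do the systems prefer different options?
    value_conflict: float   # continuous: how far apart are their value differences?


def conflict_measures(q_mf: Sequence[float], q_mb: Sequence[float]) -> Conflict:
    """Hypothetical conflict measures for one two-option trial.

    q_mf, q_mb: (value of option A, value of option B) as estimated by the
    model-free and model-based systems. These inputs are assumed here.
    """
    # Each system's value difference between the two options (A minus B).
    d_mf = q_mf[0] - q_mf[1]
    d_mb = q_mb[0] - q_mb[1]

    # Assumed definition: action conflict when the differences have opposite
    # signs, i.e. the systems favor different options.
    action_conflict = d_mf * d_mb < 0

    # Assumed definition: value conflict as the absolute gap between the two
    # systems' value differences. It can be nonzero without action conflict.
    value_conflict = abs(d_mf - d_mb)

    return Conflict(action_conflict, value_conflict)


if __name__ == "__main__":
    # Systems prefer different options: action conflict and large value conflict.
    print(conflict_measures((0.8, 0.2), (0.3, 0.6)))
    # Systems agree on the best option but differ in how strongly they prefer it.
    print(conflict_measures((0.8, 0.2), (0.6, 0.5)))
```

Under these assumed definitions, the second call shows a trial with value conflict but no action conflict, which is the kind of case the abstract refers to when it reports value-conflict effects both with and without action conflict.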