Gruner Patricia, Anticevic Alan, Lee Daeyeol, Pittenger Christopher
Department of Psychiatry, Yale University, New Haven, CT, USA; Learning Based Recovery Center, VA Connecticut Health System, West Haven, CT, USA.
Department of Psychiatry, Yale University, New Haven, CT, USA; Department of Psychology, Yale University, New Haven, CT, USA; Interdepartmental Neuroscience Program, Yale University, New Haven, CT, USA.
Neuroscientist. 2016 Apr;22(2):188-98. doi: 10.1177/1073858414568317. Epub 2015 Jan 20.
Decision making in a complex world, characterized both by predictable regularities and by frequent departures from the norm, requires dynamic switching between rapid habit-like, automatic processes and slower, more flexible evaluative processes. These strategies, formalized as "model-free" and "model-based" reinforcement learning algorithms, respectively, can lead to divergent behavioral outcomes, requiring a mechanism to arbitrate between them in a context-appropriate manner. Recent data suggest that individuals with obsessive-compulsive disorder (OCD) rely excessively on inflexible habit-like decision making during reinforcement-driven learning. We propose that inflexible reliance on habit in OCD may reflect a functional weakness in the mechanism for context-appropriate dynamic arbitration between model-free and model-based decision making. Support for this hypothesis derives from emerging functional imaging findings. A deficit in arbitration in OCD may help reconcile evidence for excessive reliance on habit in rewarded learning tasks with an older literature suggesting inappropriate recruitment of circuitry associated with model-based decision making in unreinforced procedural learning. The hypothesized deficit and corresponding circuitry may be a particularly fruitful target for interventions, including cognitive remediation.