Gillan Claire M, Otto A Ross, Phelps Elizabeth A, Daw Nathaniel D
Department of Psychology, New York University, 6 Washington Place, New York, NY, 10003, USA.
Cogn Affect Behav Neurosci. 2015 Sep;15(3):523-36. doi: 10.3758/s13415-015-0347-6.
Studies in humans and rodents have suggested that behavior can at times be "goal-directed" (that is, planned and purposeful) and at times "habitual" (that is, inflexible and automatically evoked by stimuli). This distinction is central to conceptions of pathological compulsion, as in drug abuse and obsessive-compulsive disorder. Evidence for the distinction has primarily come from outcome devaluation studies, in which the sensitivity of a previously learned behavior to motivational change is used to assay the dominance of habits versus goal-directed actions. However, little is known about how habits and goal-directed control arise. Specifically, in the present study we sought to reveal the trial-by-trial dynamics of instrumental learning that would promote, and protect against, developing habits. In two complementary experiments with independent samples, participants completed a sequential decision task that dissociated two computational-learning mechanisms, model-based and model-free. We then tested for habits by devaluing one of the rewards that had reinforced behavior. In each case, we found that individual differences in model-based learning predicted the participants' subsequent sensitivity to outcome devaluation, suggesting that an associative mechanism underlies a bias toward habit formation in healthy individuals.