Costa Vincent D, Tran Valery L, Turchi Janita, Averbeck Bruno B
Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland 20892-4415.
J Neurosci. 2015 Feb 11;35(6):2407-16. doi: 10.1523/JNEUROSCI.1989-14.2015.
Reversal learning has been studied as the process of learning to inhibit previously rewarded actions. Deficits in reversal learning have been seen after manipulations of dopamine and lesions of the orbitofrontal cortex. However, reversal learning is often studied in animals that have limited experience with reversals. As such, the animals are learning that reversals occur during data collection. We have examined a task regime in which monkeys have extensive experience with reversals and stable behavioral performance on a probabilistic two-arm bandit reversal learning task. We developed a Bayesian analysis approach to examine the effects of manipulations of dopamine on reversal performance in this regime. We find that the analysis can clarify the strategy of the animal. Specifically, at reversal, the monkeys switch quickly from choosing one stimulus to choosing the other, as opposed to gradually transitioning, which might be expected if they were using a naive reinforcement learning (RL) update of value. Furthermore, we found that administration of haloperidol affects the way the animals integrate prior knowledge into their choice behavior. Animals had a stronger prior on where reversals would occur on haloperidol than on levodopa (l-DOPA) or placebo. This strong prior was appropriate, because the animals had extensive experience with reversals occurring in the middle of the block. Overall, we find that Bayesian dissection of the behavior clarifies the strategy of the animals and reveals an effect of haloperidol on integration of prior information with evidence in favor of a choice reversal.
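The idea of combining a prior over the reversal trial with choice evidence can be illustrated with a small sketch. This is not the authors' model. It assumes a step-change choice rule: before a hypothesized reversal trial the animal picks the initially preferred stimulus with probability `p_stay`, and afterwards it picks the other stimulus with that same probability. The names `reversal_posterior`, `gaussian_prior`, and `p_stay`, and the Gaussian shape of the prior, are all illustrative assumptions.

```python
import math


def gaussian_prior(n_trials, center, sd):
    """Discretized Gaussian prior over reversal positions 0..n_trials (assumed shape)."""
    weights = [math.exp(-0.5 * ((r - center) / sd) ** 2) for r in range(n_trials + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def reversal_posterior(choices, p_stay=0.8, prior=None):
    """Posterior over the trial index r at which choice preference switched.

    choices: list of 0/1, where 1 means the initially preferred stimulus was chosen.
    Hypothesis r: trials with index < r favor stimulus 1 with probability p_stay,
    and trials with index >= r favor stimulus 0 with probability p_stay.
    """
    n = len(choices)
    if prior is None:
        prior = [1.0 / (n + 1)] * (n + 1)  # flat prior over r = 0..n

    log_post = []
    for r in range(n + 1):
        log_lik = 0.0
        for t, c in enumerate(choices):
            p_one = p_stay if t < r else 1.0 - p_stay
            log_lik += math.log(p_one if c == 1 else 1.0 - p_one)
        log_prior = math.log(prior[r]) if prior[r] > 0 else float("-inf")
        log_post.append(log_lik + log_prior)

    # Normalize in log space for numerical stability.
    peak = max(log_post)
    weights = [math.exp(x - peak) for x in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical block of 20 trials in which behavior switches after trial 12.
choices = [1] * 12 + [0] * 8
flat = reversal_posterior(choices)
informed = reversal_posterior(choices, prior=gaussian_prior(len(choices), center=10, sd=2))
```

Under these assumptions, a tighter prior centered on the middle of the block pulls the posterior over the reversal trial toward mid-block, while a flat prior leaves the estimate determined by the choices alone. This is one way to picture what a "stronger prior on where reversals would occur" means.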