Cortical substrates for exploratory decisions in humans.
Author information
Daw Nathaniel D, O'Doherty John P, Dayan Peter, Seymour Ben, Dolan Raymond J
Affiliation
Gatsby Computational Neuroscience Unit, University College London (UCL), Alexandra House, 17 Queen Square, London WC1N 3AR, UK.
Publication information
Nature. 2006 Jun 15;441(7095):876-9. doi: 10.1038/nature04766.
Decision making in an uncertain environment poses a conflict between the opposing demands of gathering and exploiting information. In a classic illustration of this 'exploration-exploitation' dilemma, a gambler choosing between multiple slot machines balances the desire to select what seems, on the basis of accumulated experience, the richest option, against the desire to choose a less familiar option that might turn out more advantageous (and thereby provide information for improving future decisions). Far from representing idle curiosity, such exploration is often critical for organisms to discover how best to harvest resources such as food and water. In appetitive choice, substantial experimental evidence, underpinned by computational reinforcement learning (RL) theory, indicates that a dopaminergic, striatal and medial prefrontal network mediates learning to exploit. In contrast, although exploration has been well studied from both theoretical and ethological perspectives, its neural substrates are much less clear. Here we show, in a gambling task, that human subjects' choices can be characterized by a computationally well-regarded strategy for addressing the explore/exploit dilemma. Furthermore, using this characterization to classify decisions as exploratory or exploitative, we employ functional magnetic resonance imaging to show that the frontopolar cortex and intraparietal sulcus are preferentially active during exploratory decisions. In contrast, regions of striatum and ventromedial prefrontal cortex exhibit activity characteristic of an involvement in value-based exploitative decision making. The results suggest a model of action selection under uncertainty that involves switching between exploratory and exploitative behavioural modes, and provide a computationally precise characterization of the contribution of key decision-related brain systems to each of these functions.
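The explore/exploit classification described in the abstract can be illustrated with a minimal sketch. The abstract does not name the choice strategy or the learning rule, so everything here is an assumption made for illustration: a softmax choice rule with a hypothetical inverse temperature `beta`, a simple delta-rule value update with learning rate `alpha`, and fixed Gaussian payoffs for each slot machine. A choice is labelled exploitative when it picks the option with the highest current value estimate and exploratory otherwise.

```python
import math
import random


def softmax_probs(values, beta):
    """Return choice probabilities proportional to exp(beta * value)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def run_bandit(payoff_means, n_trials=200, beta=0.2, alpha=0.3, seed=0):
    """Simulate a softmax learner on a multi-armed bandit (illustrative only).

    payoff_means, beta, alpha and the payoff noise are assumed values,
    not parameters taken from the paper.
    """
    rng = random.Random(seed)
    n_arms = len(payoff_means)
    estimates = [50.0] * n_arms  # assumed initial value estimate for every arm
    log = []
    for _ in range(n_trials):
        probs = softmax_probs(estimates, beta)
        choice = rng.choices(range(n_arms), weights=probs)[0]
        # The greedy arm is the one with the highest current estimate
        # (ties resolve to the lowest index).
        greedy = max(range(n_arms), key=lambda i: estimates[i])
        label = "exploit" if choice == greedy else "explore"
        # Assumed payoff model: Gaussian noise around a fixed mean.
        reward = rng.gauss(payoff_means[choice], 4.0)
        # Delta-rule update of the chosen arm's estimate.
        estimates[choice] += alpha * (reward - estimates[choice])
        log.append((choice, label, reward))
    return estimates, log


if __name__ == "__main__":
    final_estimates, trial_log = run_bandit([30.0, 50.0, 70.0, 40.0])
    n_explore = sum(1 for _, label, _ in trial_log if label == "explore")
    print("exploratory trials:", n_explore, "of", len(trial_log))
    print("final estimates:", [round(v, 1) for v in final_estimates])
```

In this sketch, a larger `beta` makes choices more deterministic and so yields fewer trials labelled exploratory, while `beta = 0` makes every option equally likely regardless of its estimated value.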