
Learning optimal decisions with confidence.

Affiliations

Department of Neurobiology, Harvard Medical School, Boston, MA 02115;

Champalimaud Research, Champalimaud Centre for the Unknown, 1400-038 Lisbon, Portugal.

Publication information

Proc Natl Acad Sci U S A. 2019 Dec 3;116(49):24872-24880. doi: 10.1073/pnas.1906787116. Epub 2019 Nov 15.

Abstract

Diffusion decision models (DDMs) are immensely successful models for decision making under uncertainty and time pressure. In the context of perceptual decision making, these models typically start with two input units, organized in a neuron-antineuron pair. In contrast, in the brain, sensory inputs are encoded through the activity of large neuronal populations. Moreover, while DDMs are wired by hand, the nervous system must learn the weights of the network through trial and error. There is currently no normative theory of learning in DDMs and therefore no theory of how decision makers could learn to make optimal decisions in this context. Here, we derive such a rule for learning a near-optimal linear combination of DDM inputs based on trial-by-trial feedback. The rule is Bayesian in the sense that it learns not only the mean of the weights but also the uncertainty around this mean in the form of a covariance matrix. In this rule, the rate of learning is proportional (respectively, inversely proportional) to confidence for incorrect (respectively, correct) decisions. Furthermore, we show that, in volatile environments, the rule predicts a bias toward repeating the same choice after correct decisions, with a bias strength that is modulated by the previous choice's difficulty. Finally, we extend our learning rule to cases for which one of the choices is more likely a priori, which provides insights into how such biases modulate the mechanisms leading to optimal decisions in diffusion models.
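
As a rough illustration of the mechanism the abstract describes, the sketch below wires a drift-diffusion decision to a learned linear readout of a noisy input population and updates the readout weights with a learning rate that grows with confidence after errors and shrinks with confidence after correct choices. This is a minimal sketch under stated assumptions, not the paper's derived Bayesian rule: the logistic confidence proxy, the feedback-gated Hebbian update, and all names (simulate_trial, w_true, ETA) are hypothetical, and the covariance tracking and volatility effects described in the abstract are not modeled here.

```python
# Minimal illustrative sketch (assumptions labeled in comments), NOT the paper's
# derived rule: a bounded accumulation of a weighted population input, with the
# readout weights updated by a feedback-gated Hebbian step whose rate is
# modulated by a crude confidence proxy.

import numpy as np

rng = np.random.default_rng(0)

N_INPUTS = 50        # size of the input population
DT = 0.01            # integration time step
BOUND = 1.0          # symmetric decision bound on the accumulated evidence
MAX_STEPS = 1000     # hard cap on trial length
ETA = 0.05           # base learning rate (assumed value)

# Hidden "signal" direction in the population; the learner must align with it.
w_true = rng.normal(size=N_INPUTS)
w_true /= np.linalg.norm(w_true)

# Learned readout weights, initialised small and random.
w = rng.normal(scale=0.1, size=N_INPUTS)


def simulate_trial(w, coherence):
    """Integrate the weighted population input to a bound.

    Returns the choice (+1/-1), the correct answer (+1/-1), the decision
    variable at the end of integration, and the mean population activity.
    """
    answer = rng.choice([-1.0, 1.0])
    x = 0.0
    r_sum = np.zeros(N_INPUTS)
    for step in range(1, MAX_STEPS + 1):
        # Population response: signal along w_true plus independent noise.
        r = answer * coherence * w_true + rng.normal(size=N_INPUTS)
        r_sum += r
        x += np.dot(w, r) * DT
        if abs(x) >= BOUND:
            break
    choice = 1.0 if x >= 0 else -1.0
    return choice, answer, x, r_sum / step


for trial in range(500):
    coherence = rng.choice([1.0, 4.0, 16.0])       # stimulus difficulty
    choice, answer, dv, r_mean = simulate_trial(w, coherence)
    correct = choice == answer

    # Crude confidence proxy: a logistic squashing of |decision variable|.
    conf = 1.0 / (1.0 + np.exp(-abs(dv)))

    # Learning rate grows with confidence after errors and shrinks with
    # confidence after correct choices (the qualitative pattern stated in the
    # abstract; this exact functional form is an assumption).
    lr = ETA * conf if not correct else ETA * (1.0 - conf)

    # Feedback-gated Hebbian step: push w toward the input direction that the
    # feedback says carried the signal, then keep the norm bounded.
    w += lr * answer * r_mean
    norm = np.linalg.norm(w)
    if norm > 1.0:
        w /= norm

print("alignment with true weights:", float(np.dot(w, w_true)))
```

Run as-is, the printed alignment between the learned and true weights should rise toward 1 over trials, which is the qualitative behavior the sketch is meant to convey.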


Similar articles

1. Learning optimal decisions with confidence. Proc Natl Acad Sci U S A. 2019 Dec 3;116(49):24872-24880. doi: 10.1073/pnas.1906787116. Epub 2019 Nov 15.
6. Reward-modulated Hebbian learning of decision making. Neural Comput. 2010 Jun;22(6):1399-444. doi: 10.1162/neco.2010.03-09-980.

Cited by

5. A low-dimensional approximation of optimal confidence. PLoS Comput Biol. 2024 Jul 24;20(7):e1012273. doi: 10.1371/journal.pcbi.1012273. eCollection 2024 Jul.
6. Bayesian confidence in optimal decisions. Psychol Rev. 2024 Oct;131(5):1114-1160. doi: 10.1037/rev0000472. Epub 2024 Jul 18.

