Department of Experimental Psychology, University of Oxford, Radcliffe Observatory Quarter, Woodstock Road, Oxford, UK.
Trends Cogn Sci. 2019 Oct;23(10):836-850. doi: 10.1016/j.tics.2019.07.012. Epub 2019 Sep 4.
The computational framework of reinforcement learning (RL) has allowed us to both understand biological brains and build successful artificial agents. However, in this opinion article, we highlight open challenges for RL as a model of animal behaviour in natural environments. We ask how the external reward function is designed for biological systems, and how we can account for the context sensitivity of valuation. We summarise both old and new theories proposing that animals track current and desired internal states and seek to minimise the distance to a goal across multiple value dimensions. We suggest that this framework readily accounts for canonical phenomena observed in the fields of psychology, behavioural ecology, and economics, and for recent findings from brain-imaging studies of value-guided decision-making.
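The abstract's proposal that animals track current and desired internal states, and seek to minimise a multi-dimensional distance to a setpoint, is close in spirit to homeostatic reinforcement learning. The sketch below is only an illustration of that idea, not the authors' model: the `drive` and `reward` functions, the exponents `m` and `n`, and the example state values are all assumptions, chosen to show how reward can be defined as drive reduction and why the same outcome can be valued differently depending on the animal's internal state.

```python
import numpy as np

# Hedged illustration (not the authors' implementation): reward defined as
# drive reduction, where "drive" is the distance between the current internal
# state and a desired setpoint across several value dimensions
# (e.g. energy, hydration).

def drive(internal_state, setpoint, weights=None, m=3.0, n=4.0):
    """Distance-to-setpoint drive, in the general form used by homeostatic RL
    models: D(h) = (sum_i w_i * |h*_i - h_i|**n) ** (1/m)."""
    deviation = np.abs(setpoint - internal_state)
    if weights is None:
        weights = np.ones_like(deviation)
    return np.power(np.sum(weights * deviation ** n), 1.0 / m)

def reward(state_before, state_after, setpoint):
    """Reward is the reduction in drive produced by an outcome."""
    return drive(state_before, setpoint) - drive(state_after, setpoint)

# Example (hypothetical numbers): the same meal is rewarding when hungry but
# not when sated, because it moves the internal state toward or past the
# setpoint, respectively.
setpoint = np.array([1.0, 1.0])               # desired energy, hydration
hungry   = np.array([0.4, 1.0])
sated    = np.array([0.95, 1.0])
meal     = np.array([0.3, 0.0])                # energy gained from the meal

print(reward(hungry, hungry + meal, setpoint))  # clearly positive
print(reward(sated, sated + meal, setpoint))    # near zero or negative
```

Read this way, the context sensitivity of valuation discussed in the abstract falls out naturally: value is not a fixed property of the outcome but of the outcome's effect on the distance to the current goal state.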