Reinforcement learning with Marr.

Authors

Yael Niv, Angela Langdon

Affiliation

Psychology Department & Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, 08540.

Publication information

Curr Opin Behav Sci. 2016 Oct;11:67-73. doi: 10.1016/j.cobeha.2016.04.005.

Abstract

To many, the poster child for David Marr's famous three levels of scientific inquiry is reinforcement learning, a computational theory of reward optimization, which readily prescribes algorithmic solutions that evidence striking resemblance to signals found in the brain, suggesting a straightforward neural implementation. Here we review questions that remain open at each level of analysis, concluding that the path forward to their resolution calls for inspiration across levels, rather than a focus on mutual constraints.
