Active inference and learning.

Authors

Friston Karl, FitzGerald Thomas, Rigoli Francesco, Schwartenbeck Philipp, O'Doherty John, Pezzulo Giovanni

Affiliations

The Wellcome Trust Centre for Neuroimaging, UCL, 12 Queen Square, London, United Kingdom.

The Wellcome Trust Centre for Neuroimaging, UCL, 12 Queen Square, London, United Kingdom; Max-Planck-UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom.

Publication details

Neurosci Biobehav Rev. 2016 Sep;68:862-879. doi: 10.1016/j.neubiorev.2016.06.022. Epub 2016 Jun 29.

DOI:10.1016/j.neubiorev.2016.06.022

PMID:27375276

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5167251/

Abstract

This paper offers an active inference account of choice behaviour and learning. It focuses on the distinction between goal-directed and habitual behaviour and how they contextualise each other. We show that habits emerge naturally (and autodidactically) from sequential policy optimisation when agents are equipped with state-action policies. In active inference, behaviour has explorative (epistemic) and exploitative (pragmatic) aspects that are sensitive to ambiguity and risk respectively, where epistemic (ambiguity-resolving) behaviour enables pragmatic (reward-seeking) behaviour and the subsequent emergence of habits. Although goal-directed and habitual policies are usually associated with model-based and model-free schemes, we find the more important distinction is between belief-free and belief-based schemes. The underlying (variational) belief updating provides a comprehensive (if metaphorical) process theory for several phenomena, including the transfer of dopamine responses, reversal learning, habit formation and devaluation. Finally, we show that active inference reduces to a classical (Bellman) scheme, in the absence of ambiguity.
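The abstract's link between ambiguity, risk and the classical limit can be illustrated with a small numerical sketch. It assumes the decomposition of expected free energy commonly used in the active inference literature, in which a policy is scored by risk (the KL divergence between predicted and preferred outcomes) plus ambiguity (the expected entropy of the outcome likelihood). The abstract does not spell this formula out, so the function names, argument layout and toy numbers below are illustrative assumptions rather than the authors' implementation.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def kl(q, p):
    """KL divergence D_KL[q || p] in nats; assumes p > 0 wherever q > 0."""
    return sum(a * math.log(a / b) for a, b in zip(q, p) if a > 0)


def expected_free_energy(q_states, likelihood, preferred_outcomes):
    """Return (risk, ambiguity) for one policy at one future time step.

    q_states:           predicted state distribution Q(s | policy)
    likelihood:         likelihood[s][o] = P(o | s)
    preferred_outcomes: prior preferences over outcomes P(o)
    All three arguments are hypothetical names chosen for this sketch.
    """
    n_states = len(q_states)
    n_outcomes = len(preferred_outcomes)
    # Predicted outcomes: Q(o | policy) = sum_s P(o | s) Q(s | policy)
    q_outcomes = [
        sum(q_states[s] * likelihood[s][o] for s in range(n_states))
        for o in range(n_outcomes)
    ]
    # Risk: divergence of predicted outcomes from preferred outcomes
    risk = kl(q_outcomes, preferred_outcomes)
    # Ambiguity: expected uncertainty about outcomes given states
    ambiguity = sum(q_states[s] * entropy(likelihood[s]) for s in range(n_states))
    return risk, ambiguity


if __name__ == "__main__":
    preferences = [0.9, 0.1]  # the agent prefers outcome 0
    precise = [[1.0, 0.0], [0.0, 1.0]]    # each state yields one outcome
    ambiguous = [[0.5, 0.5], [0.5, 0.5]]  # outcomes say nothing about states
    print(expected_free_energy([1.0, 0.0], precise, preferences))
    print(expected_free_energy([0.5, 0.5], ambiguous, preferences))
```

With the precise likelihood the ambiguity term is zero, so the score depends only on how far predicted outcomes fall from preferred ones. That is the sense in which, under this assumed decomposition, policy selection comes to resemble a purely preference-driven (classical) scheme when there is no ambiguity to resolve.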


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/473b/5167251/eda558db8298/gr1.jpg
