
The formal foundation of an evolutionary theory of reinforcement.

Affiliations

University of Bamberg, Markusplatz 3, D-96047, Bamberg, Germany.

Technische Universitaet Braunschweig, Spielmannstrasse 19, D-38106, Braunschweig, Germany.

Publication information

Behav Processes. 2021 May;186:104370. doi: 10.1016/j.beproc.2021.104370. Epub 2021 Feb 26.

DOI: 10.1016/j.beproc.2021.104370
PMID: 33647442
Abstract

Reinforcement learning is often described by analogy to natural selection. However, there is no coherent theory relating reinforcement learning to evolution within a single formal model of selection. This paper provides the formal foundation of such a unified theory. The model is based on the most general description of natural selection as given by the Price equation. We extend the Price equation to cover reinforcement learning as the result of a behavioral selection process within individuals and relate it to the principle of natural selection via the concept of statistical fitness predictors by means of a multilevel model of behavioral selection. The main result is the covariance-based law of effect, which describes reinforcement learning on a molar level by means of the covariance between behavioral allocation and a statistical fitness predictor. We further demonstrate how this abstract principle can be applied to derive theoretical explanations of various empirical findings, like conditioned reinforcement, blocking, matching and response deprivation. Our model is the first to apply the abstract principle of selection to derive a unified description of reinforcement learning and natural selection within a single model. It provides a general analytical tool for behavioral psychology in a similar way that the theory of natural selection does for evolutionary biology. We thus lay the formal foundation of a general theory of reinforcement as the result of behavioral selection on multiple levels.
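The abstract names the Price equation as the starting point of the model. As a rough illustration only, the standard textbook form of that equation, w̄·Δz̄ = Cov(w, z) + E(w·Δz), can be checked numerically. The sketch below is not the authors' derivation: the function name `price_decomposition` and the toy numbers are assumptions made for this example, and the paper's covariance-based law of effect replaces fitness with a statistical fitness predictor in a way that is not reproduced here.

```python
def mean(xs):
    """Arithmetic mean of a non-empty sequence."""
    return sum(xs) / len(xs)


def price_decomposition(z, w, z_offspring):
    """Split the change in a mean trait into selection and transmission parts.

    z           : trait value of each parent type i (hypothetical toy data)
    w           : fitness of each parent type i (number of descendants)
    z_offspring : mean trait value among the descendants of type i

    Returns (lhs, selection, transmission), where the Price equation
    states that lhs == selection + transmission.
    """
    w_bar = mean(w)
    z_bar = mean(z)

    # Mean trait in the descendant population: each type counts in
    # proportion to its fitness.
    z_bar_next = sum(wi * zo for wi, zo in zip(w, z_offspring)) / sum(w)

    # Left-hand side: mean fitness times the change in the mean trait.
    lhs = w_bar * (z_bar_next - z_bar)

    # Selection term: population covariance between fitness and trait.
    selection = mean([wi * zi for wi, zi in zip(w, z)]) - w_bar * z_bar

    # Transmission term: fitness-weighted change of the trait within lineages.
    transmission = mean([wi * (zo - zi) for wi, zi, zo in zip(w, z, z_offspring)])

    return lhs, selection, transmission


if __name__ == "__main__":
    # Two types; the trait could be read as the share of time allocated
    # to a behavior, which is only an illustrative interpretation.
    lhs, selection, transmission = price_decomposition(
        z=[0.2, 0.8], w=[3.0, 1.0], z_offspring=[0.3, 0.7]
    )
    print(lhs, selection, transmission)
```

The two terms always add up to the left-hand side because the equation is an algebraic identity rather than an empirical claim; that is why it can serve as a general description of any selection process.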


Similar articles

1. The formal foundation of an evolutionary theory of reinforcement.
Behav Processes. 2021 May;186:104370. doi: 10.1016/j.beproc.2021.104370. Epub 2021 Feb 26.
2. Behavioral selection in structured populations.
Theory Biosci. 2024 Jun;143(2):97-105. doi: 10.1007/s12064-024-00413-8. Epub 2024 Mar 5.
3. An evolutionary model of reinforcer value.
Behav Processes. 2020 Jun;175:104109. doi: 10.1016/j.beproc.2020.104109. Epub 2020 Mar 21.
4. Selection by consequences, behavioral evolution, and the price equation.
J Exp Anal Behav. 2017 May;107(3):321-342. doi: 10.1002/jeab.256.
5. Why Do Individuals Seek Information? A Selectionist Perspective.
Front Psychol. 2021 Nov 19;12:684544. doi: 10.3389/fpsyg.2021.684544. eCollection 2021.
6. Model-based estimates for operant selection.
J Exp Anal Behav. 2024 Jul;122(1):62-71. doi: 10.1002/jeab.924. Epub 2024 Jun 2.
7. The Price equation and the unity of social evolution theory.
Philos Trans R Soc Lond B Biol Sci. 2020 Apr 27;375(1797):20190362. doi: 10.1098/rstb.2019.0362. Epub 2020 Mar 9.
8. A first formal link between the price equation and an optimization program.
J Theor Biol. 2002 Jul 7;217(1):75-91. doi: 10.1006/jtbi.2002.3015.
9. The Price Equation, Gradient Dynamics, and Continuous Trait Game Theory.
Am Nat. 2018 Jan;191(1):146-153. doi: 10.1086/694891. Epub 2017 Nov 3.
10. The evolutionary consequences of learning under competition.
Proc Biol Sci. 2024 Aug;291(2028):20241141. doi: 10.1098/rspb.2024.1141. Epub 2024 Aug 7.

Cited by

1. A minute can only be spent once: Validating a behavioral paradigm to study societally relevant time allocation conflicts.
Behav Res Methods. 2025 Aug 1;57(9):246. doi: 10.3758/s13428-025-02769-1.
2. Explaining Performance on Interval and Ratio Schedules with a Molar View of Behavior.
Perspect Behav Sci. 2025 May 21;48(2):173-202. doi: 10.1007/s40614-025-00455-3. eCollection 2025 Jun.
3. A generalized Price equation for fuzzy set-mappings.
Theory Biosci. 2025 Jun;144(2):167-172. doi: 10.1007/s12064-025-00438-7. Epub 2025 May 30.
4. Grocery Shopping Under Simplified Marginal Value Theorem Predictions.
Hum Nat. 2024 Dec;35(4):451-476. doi: 10.1007/s12110-024-09485-3. Epub 2025 Jan 17.
5. Reconceptualizing the Interaction of Behavior and Environment.
Perspect Behav Sci. 2024 Aug 22;47(3):559-579. doi: 10.1007/s40614-024-00417-1. eCollection 2024 Sep.
6. Behavioral selection in structured populations.
Theory Biosci. 2024 Jun;143(2):97-105. doi: 10.1007/s12064-024-00413-8. Epub 2024 Mar 5.
7. Potential Contributions of Behavior Analysis to Research on Pro-environmental Behavior.
Front Psychol. 2022 May 17;13:685621. doi: 10.3389/fpsyg.2022.685621. eCollection 2022.
8. Quantifying the instrumental and noninstrumental underpinnings of Pavlovian responding with the Price equation.
Psychon Bull Rev. 2022 Aug;29(4):1295-1306. doi: 10.3758/s13423-021-02047-z. Epub 2021 Dec 16.
9. Why Do Individuals Seek Information? A Selectionist Perspective.
Front Psychol. 2021 Nov 19;12:684544. doi: 10.3389/fpsyg.2021.684544. eCollection 2021.