
Modeling long-term nutritional behaviors using deep homeostatic reinforcement learning.

Authors

Yoshida Naoto, Arikawa Etsushi, Kanazawa Hoshinori, Kuniyoshi Yasuo

Affiliations

Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 113-8656, Japan.

Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan.

Publication

PNAS Nexus. 2024 Nov 28;3(12):pgae540. doi: 10.1093/pnasnexus/pgae540. eCollection 2024 Dec.

DOI: 10.1093/pnasnexus/pgae540
PMID: 39670260
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11635831/
Abstract

The continual generation of behaviors that satisfy multiple conflicting demands which cannot all be satisfied simultaneously is a situation seen naturally in autonomous agents, such as household robots operating over long periods, and in animals in the natural world. Homeostatic reinforcement learning (homeostatic RL) is a bio-inspired framework that achieves such multiobjective control through behavioral optimization. Homeostatic RL achieves autonomous behavior optimization using only internal body information in complex environmental systems, including continuous motor control. However, it is still unknown whether the resulting behaviors actually have long-term properties similar to those of real animals. To clarify this issue, this study focuses on the balancing of multiple nutrients in animal foraging as a situation in which such multiobjective control is achieved by animals in the natural world. We then draw on the nutritional geometry framework, which quantitatively captures the long-term characteristics of foraging strategies for multiple nutrients in nutritional biology, and construct a comparable verification environment to show experimentally that homeostatic RL agents exhibit the long-term foraging characteristics seen in animals in nature. Furthermore, numerical simulation results show that the long-term foraging characteristics of the agent can be controlled by changing the weighting of the agent's multiobjective motivation. These results show that the long-term behavioral characteristics of homeostatic RL agents, which achieve behavioral emergence at the motor-control level, can be predicted and designed based on the internal dynamics of the body and the weighting of motivation, both of which change in real time.
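
For readers unfamiliar with homeostatic RL, the minimal sketch below illustrates the kind of drive-reduction reward the framework is built on (in the style of the standard homeostatic RL formulation of Keramati and Gutkin): the reward is the decrease in a weighted distance between the internal state and its setpoints, and the weight vector plays the role of the "multiobjective motivation weighting" mentioned in the abstract. The variable names, setpoints, weights, and exponents here are illustrative assumptions, not the parameters used in this paper.

```python
import numpy as np

def drive(h, setpoint, w, n=4, m=2):
    """Homeostatic drive: weighted distance of internal state h from its setpoint.

    h, setpoint, and w are vectors over internal variables (e.g. protein and
    carbohydrate stores). A larger w_i makes deviations in variable i more costly.
    Exponents n=4, m=2 follow a common choice in the homeostatic RL literature.
    """
    return np.sum(w * np.abs(setpoint - h) ** n) ** (1.0 / m)

def reward(h_t, h_next, setpoint, w):
    """Drive-reduction reward: positive when an action moves the internal state
    toward the setpoint, negative when it moves it away."""
    return drive(h_t, setpoint, w) - drive(h_next, setpoint, w)

# Illustrative example with two nutrients (protein, carbohydrate); all values assumed.
setpoint = np.array([1.0, 1.0])    # homeostatic targets
w        = np.array([0.7, 0.3])    # motivation weights (protein weighted more heavily)
h_t      = np.array([0.6, 0.9])    # internal state before a meal
h_next   = np.array([0.8, 0.95])   # internal state after the meal

print(reward(h_t, h_next, setpoint, w))  # > 0: the meal reduced the drive
```

In the deep homeostatic RL setting described in the abstract, a scalar reward of this kind would be maximized by a deep RL algorithm over continuous motor commands, so changing the weight vector shifts where the agent settles in nutrient space over the long run.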

Figures (PMC full text):
Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7daf/11635831/5565c1ff5a49/pgae540f1.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7daf/11635831/55682092ef01/pgae540f2.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7daf/11635831/2dbf7298e631/pgae540f3.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7daf/11635831/b72f26267a4c/pgae540f4.jpg

Similar articles

1. Modeling long-term nutritional behaviors using deep homeostatic reinforcement learning.
   PNAS Nexus. 2024 Nov 28;3(12):pgae540. doi: 10.1093/pnasnexus/pgae540. eCollection 2024 Dec.
2. Emergence of integrated behaviors through direct optimization for homeostasis.
   Neural Netw. 2024 Sep;177:106379. doi: 10.1016/j.neunet.2024.106379. Epub 2024 May 8.
3. Continuous action deep reinforcement learning for propofol dosing during general anesthesia.
   Artif Intell Med. 2022 Jan;123:102227. doi: 10.1016/j.artmed.2021.102227. Epub 2021 Dec 2.
4. Collective foraging of active particles trained by reinforcement learning.
   Sci Rep. 2023 Oct 10;13(1):17055. doi: 10.1038/s41598-023-44268-3.
5. MOSAIC for multiple-reward environments.
   Neural Comput. 2012 Mar;24(3):577-606. doi: 10.1162/NECO_a_00246. Epub 2011 Dec 14.
6. Computational Mechanisms of Osmoregulation: A Reinforcement Learning Model for Sodium Appetite.
   Front Neurosci. 2022 May 19;16:857009. doi: 10.3389/fnins.2022.857009. eCollection 2022.
7. "Notice of Violation of IEEE Publication Principles" Multiobjective Reinforcement Learning: A Comprehensive Overview.
   IEEE Trans Cybern. 2013 Apr 29. doi: 10.1109/TSMCC.2013.2249512.
8. Having multiple selves helps learning agents explore and adapt in complex changing worlds.
   Proc Natl Acad Sci U S A. 2023 Jul 11;120(28):e2221180120. doi: 10.1073/pnas.2221180120. Epub 2023 Jul 3.
9. Quantifying Reinforcement-Learning Agent's Autonomy, Reliance on Memory and Internalisation of the Environment.
   Entropy (Basel). 2022 Mar 13;24(3):401. doi: 10.3390/e24030401.
10. A Hybrid Online Off-Policy Reinforcement Learning Agent Framework Supported by Transformers.
    Int J Neural Syst. 2023 Dec;33(12):2350065. doi: 10.1142/S012906572350065X. Epub 2023 Oct 20.

References cited in this article

1. Emergence of integrated behaviors through direct optimization for homeostasis.
   Neural Netw. 2024 Sep;177:106379. doi: 10.1016/j.neunet.2024.106379. Epub 2024 May 8.
2. Towards an integrated understanding of dietary phenotypes.
   Philos Trans R Soc Lond B Biol Sci. 2023 Dec 4;378(1891):20220545. doi: 10.1098/rstb.2022.0545. Epub 2023 Oct 16.
3. Having multiple selves helps learning agents explore and adapt in complex changing worlds.
   Proc Natl Acad Sci U S A. 2023 Jul 11;120(28):e2221180120. doi: 10.1073/pnas.2221180120. Epub 2023 Jul 3.
4. 31st Annual Computational Neuroscience Meeting: CNS*2022.
   J Comput Neurosci. 2023 Jan;51(Suppl 1):3-101. doi: 10.1007/s10827-022-00841-9.
5. Hypothalamic Interactions with Large-Scale Neural Circuits Underlying Reinforcement Learning and Motivated Behavior.
   Trends Neurosci. 2020 Sep;43(9):681-694. doi: 10.1016/j.tins.2020.06.006. Epub 2020 Aug 3.
6. A distributional code for value in dopamine-based reinforcement learning.
   Nature. 2020 Jan;577(7792):671-675. doi: 10.1038/s41586-019-1924-6. Epub 2020 Jan 15.
7. Grandmaster level in StarCraft II using multi-agent reinforcement learning.
   Nature. 2019 Nov;575(7782):350-354. doi: 10.1038/s41586-019-1724-z. Epub 2019 Oct 30.
8. Where Does Value Come From?
   Trends Cogn Sci. 2019 Oct;23(10):836-850. doi: 10.1016/j.tics.2019.07.012. Epub 2019 Sep 4.
9. Neurocomputational theories of homeostatic control.
   Phys Life Rev. 2019 Dec;31:214-232. doi: 10.1016/j.plrev.2019.07.005. Epub 2019 Jul 19.
10. Protein Leverage: Theoretical Foundations and Ten Points of Clarification.
    Obesity (Silver Spring). 2019 Aug;27(8):1225-1238. doi: 10.1002/oby.22531.