
Modeling long-term nutritional behaviors using deep homeostatic reinforcement learning.

Author Information

Yoshida Naoto, Arikawa Etsushi, Kanazawa Hoshinori, Kuniyoshi Yasuo

Affiliations

Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 113-8656, Japan.

Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan.

Publication Information

PNAS Nexus. 2024 Nov 28;3(12):pgae540. doi: 10.1093/pnasnexus/pgae540. eCollection 2024 Dec.

Abstract

Continually generating behaviors that balance conflicting demands which cannot all be satisfied simultaneously is a situation that arises naturally both in autonomous agents, such as household robots operating over long periods, and in animals in the wild. Homeostatic reinforcement learning (homeostatic RL) is a bio-inspired framework that achieves such multiobjective control through behavioral optimization. It achieves autonomous behavioral optimization using only internal body information, even in complex environments that require continuous motor control. However, it remains unknown whether the resulting behaviors actually share the long-term properties of real animals. To clarify this issue, this study focuses on the balancing of multiple nutrients during animal foraging, a natural setting in which animals achieve such multiobjective control. We then draw on the nutritional geometry framework from nutritional biology, which quantitatively characterizes the long-term properties of multinutrient foraging strategies, and construct an analogous verification environment to show experimentally that homeostatic RL agents exhibit the long-term foraging characteristics seen in animals in nature. Furthermore, numerical simulations show that the agent's long-term foraging characteristics can be controlled by changing the weighting of its multiobjective motivation. These results show that the long-term behavioral characteristics of homeostatic RL agents, whose behavior emerges at the motor-control level, can be predicted and designed from the internal dynamics of the body and the weighting of motivation, both of which change in real time.
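For readers unfamiliar with the framework, the sketch below illustrates the drive-reduction reward that homeostatic RL is typically built on: a scalar drive measures the weighted deviation of internal states (e.g., nutrient stores) from their setpoints, and the reward is the reduction of that drive between consecutive steps. This is a minimal sketch assuming the standard formulation of homeostatic RL (Keramati and Gutkin); the class name, exponents, setpoints, and weight values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

class HomeostaticDrive:
    """Drive-reduction reward sketch for homeostatic RL.

    Assumes the standard formulation: D(h) = (sum_i w_i * |h*_i - h_i|**n) ** (1/m),
    with reward r_t = D(h_t) - D(h_{t+1}), so moving internal states toward
    their setpoints yields positive reward.
    """

    def __init__(self, setpoints, weights, n=4, m=2):
        self.setpoints = np.asarray(setpoints, dtype=float)  # h*: target internal states
        self.weights = np.asarray(weights, dtype=float)      # w_i: motivation weight per variable
        self.n, self.m = n, m

    def drive(self, h):
        """Scalar drive: weighted deviation of internal state h from the setpoints."""
        dev = np.abs(self.setpoints - np.asarray(h, dtype=float))
        return np.sum(self.weights * dev ** self.n) ** (1.0 / self.m)

    def reward(self, h_t, h_next):
        """Drive reduction between consecutive internal states."""
        return self.drive(h_t) - self.drive(h_next)

# Two hypothetical internal variables (e.g., protein and carbohydrate stores).
# Skewing the weights shifts which nutrient the agent prioritizes, which is
# the kind of motivation-weighting knob the abstract says steers long-term foraging.
d = HomeostaticDrive(setpoints=[1.0, 1.0], weights=[0.8, 0.2])
print(d.reward(h_t=[0.4, 0.4], h_next=[0.6, 0.5]))  # positive: state moved toward setpoints
```

Under this formulation, the long-term intake balance the agent converges to in nutrient space depends on the relative weights w_i, which is consistent with the abstract's claim that foraging characteristics can be designed by adjusting the multiobjective motivation weighting.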


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7daf/11635831/5565c1ff5a49/pgae540f1.jpg
