

Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures

Affiliations

Manning College of Information and Computer Science, University of Massachusetts, Amherst, MA 01003, U.S.A.

Computational Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, U.S.A.

Publication

Neural Comput. 2024 Nov 19;36(12):2734-2763. doi: 10.1162/neco_a_01718.

DOI: 10.1162/neco_a_01718
PMID: 39383029
Abstract

The current reinforcement learning framework focuses exclusively on performance, often at the expense of efficiency. In contrast, biological control achieves remarkable performance while also optimizing computational energy expenditure and decision frequency. We propose a decision-bounded Markov decision process (DB-MDP) that constrains the number of decisions and computational energy available to agents in reinforcement learning environments. Our experiments demonstrate that existing reinforcement learning algorithms struggle within this framework, leading to either failure or suboptimal performance. To address this, we introduce a biologically inspired, temporally layered architecture (TLA), enabling agents to manage computational costs through two layers with distinct timescales and energy requirements. TLA achieves optimal performance in decision-bounded environments and in continuous control environments, matching state-of-the-art performance while using a fraction of the computing cost. Compared to current reinforcement learning algorithms that solely prioritize performance, our approach significantly lowers computational energy expenditure while maintaining performance. These findings establish a benchmark and pave the way for future research on energy and time-aware control.


Similar Articles

1. Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures. Neural Comput. 2024 Nov 19;36(12):2734-2763. doi: 10.1162/neco_a_01718.
2. Prescription of Controlled Substances: Benefits and Risks.
3. Dynamic Regulation of the Serotonin-Dopamine Interaction Within a Meta-reinforcement Learning Framework Encompassing the Prefrontal Cortex and Basal Ganglia. Int J Neural Syst. 2025 Aug;35(8):2550040. doi: 10.1142/S0129065725500406.
4. Privacy-Preserving Glycemic Management in Type 1 Diabetes: Development and Validation of a Multiobjective Federated Reinforcement Learning Framework. JMIR Diabetes. 2025 Jul 4;10:e72874. doi: 10.2196/72874.
5. Data prioritization aware resource allocation in internet of vehicles using multi-agent deep reinforcement learning. Neural Netw. 2025 Oct;190:107671. doi: 10.1016/j.neunet.2025.107671. Epub 2025 Jun 6.
6. Q-learning with temporal memory to navigate turbulence. Elife. 2025 Jul 21;13:RP102906. doi: 10.7554/eLife.102906.
7. Exploration versus exploitation decisions in the human brain: A systematic review of functional neuroimaging and neuropsychological studies. Neuropsychologia. 2024 Jan 10;192:108740. doi: 10.1016/j.neuropsychologia.2023.108740. Epub 2023 Nov 29.
8. Predictive modeling of complications arising from early-onset preeclampsia in pregnant women. Womens Health (Lond). 2025 Jan-Dec;21:17455057251348978. doi: 10.1177/17455057251348978. Epub 2025 Jul 21.
9. Does the Presence of Missing Data Affect the Performance of the SORG Machine-learning Algorithm for Patients With Spinal Metastasis? Development of an Internet Application Algorithm. Clin Orthop Relat Res. 2024 Jan 1;482(1):143-157. doi: 10.1097/CORR.0000000000002706. Epub 2023 Jun 12.
10. Data-driven equation discovery reveals nonlinear reinforcement learning in humans. Proc Natl Acad Sci U S A. 2025 Aug 5;122(31):e2413441122. doi: 10.1073/pnas.2413441122. Epub 2025 Jul 31.