

Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures

Affiliations

Manning College of Information and Computer Science, University of Massachusetts, Amherst, MA 01003, U.S.A.

Computational Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, U.S.A.

Publication

Neural Comput. 2024 Nov 19;36(12):2734-2763. doi: 10.1162/neco_a_01718.

DOI: 10.1162/neco_a_01718
PMID: 39383029
Abstract

The current reinforcement learning framework focuses exclusively on performance, often at the expense of efficiency. In contrast, biological control achieves remarkable performance while also optimizing computational energy expenditure and decision frequency. We propose a decision-bounded Markov decision process (DB-MDP) that constrains the number of decisions and computational energy available to agents in reinforcement learning environments. Our experiments demonstrate that existing reinforcement learning algorithms struggle within this framework, leading to either failure or suboptimal performance. To address this, we introduce a biologically inspired, temporally layered architecture (TLA), enabling agents to manage computational costs through two layers with distinct timescales and energy requirements. TLA achieves optimal performance in decision-bounded environments and in continuous control environments, matching state-of-the-art performance while using a fraction of the computing cost. Compared to current reinforcement learning algorithms that solely prioritize performance, our approach significantly lowers computational energy expenditure while maintaining performance. These findings establish a benchmark and pave the way for future research on energy and time-aware control.


Similar Articles

1. Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures. Neural Comput. 2024 Nov 19;36(12):2734-2763. doi: 10.1162/neco_a_01718.
2. Prescription of Controlled Substances: Benefits and Risks.
3. Dynamic Regulation of the Serotonin-Dopamine Interaction Within a Meta-reinforcement Learning Framework Encompassing the Prefrontal Cortex and Basal Ganglia. Int J Neural Syst. 2025 Aug;35(8):2550040. doi: 10.1142/S0129065725500406.
4. Privacy-Preserving Glycemic Management in Type 1 Diabetes: Development and Validation of a Multiobjective Federated Reinforcement Learning Framework. JMIR Diabetes. 2025 Jul 4;10:e72874. doi: 10.2196/72874.
5. Data prioritization aware resource allocation in internet of vehicles using multi-agent deep reinforcement learning. Neural Netw. 2025 Oct;190:107671. doi: 10.1016/j.neunet.2025.107671. Epub 2025 Jun 6.
6. Q-learning with temporal memory to navigate turbulence. Elife. 2025 Jul 21;13:RP102906. doi: 10.7554/eLife.102906.
7. Exploration versus exploitation decisions in the human brain: A systematic review of functional neuroimaging and neuropsychological studies. Neuropsychologia. 2024 Jan 10;192:108740. doi: 10.1016/j.neuropsychologia.2023.108740. Epub 2023 Nov 29.
8. Predictive modeling of complications arising from early-onset preeclampsia in pregnant women. Womens Health (Lond). 2025 Jan-Dec;21:17455057251348978. doi: 10.1177/17455057251348978. Epub 2025 Jul 21.
9. Does the Presence of Missing Data Affect the Performance of the SORG Machine-learning Algorithm for Patients With Spinal Metastasis? Development of an Internet Application Algorithm. Clin Orthop Relat Res. 2024 Jan 1;482(1):143-157. doi: 10.1097/CORR.0000000000002706. Epub 2023 Jun 12.
10. Data-driven equation discovery reveals nonlinear reinforcement learning in humans. Proc Natl Acad Sci U S A. 2025 Aug 5;122(31):e2413441122. doi: 10.1073/pnas.2413441122. Epub 2025 Jul 31.