Hippocampus experience inference for safety critical control of unknown multi-agent linear systems.

Affiliations

School of Aerospace, Transport and Manufacturing, Cranfield University, Bedford, MK43 0AL, UK.


Publication Information

ISA Trans. 2023 Jun;137:646-655. doi: 10.1016/j.isatra.2022.12.011. Epub 2022 Dec 16.

DOI: 10.1016/j.isatra.2022.12.011
PMID: 36543735
Abstract

Risk mitigation is usually addressed in simulated environments for safety critical control. Migrating the final controller requires further adjustments due to the simulation assumptions and constraints. This paper presents the design of an experience inference algorithm for safety critical control of unknown multi-agent linear systems. The approach is inspired by the close relationship between three main areas of the brain cortex that enable transfer learning and decision making: the hippocampus, the neocortex, and the striatum. The hippocampus is modelled as a stable linear model that communicates to the striatum how the real-world system is expected to behave. The hippocampus model is controlled by an adaptive dynamic programming (ADP) algorithm to achieve an optimal desired performance. The neocortex and the striatum are designed simultaneously by an actor control policy algorithm that ensures experience inference to the real-world system. Experimental and simulation studies are carried out to verify the proposed approach.
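The ADP step described in the abstract, computing an optimal policy for a known stable linear model (the "hippocampus" model), can be illustrated with classical policy iteration on a discrete-time LQR problem. The sketch below is not the paper's algorithm; the system matrices, cost weights, and iteration counts are hypothetical choices for illustration only.

```python
import numpy as np

# Hypothetical stable discrete-time linear model x_{k+1} = A x_k + B u_k
# with quadratic cost sum of x'Qx + u'Ru (matrices are illustrative only).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Initial policy u = -K x; K = 0 is stabilizing because A itself is stable.
K = np.zeros((1, 2))

for _ in range(50):
    Acl = A - B @ K                  # closed-loop dynamics under current policy
    Qk = Q + K.T @ R @ K             # per-step cost under current policy
    # Policy evaluation: fixed-point iteration for P = Qk + Acl' P Acl
    P = Qk.copy()
    for _ in range(500):
        P = Qk + Acl.T @ P @ Acl
    # Policy improvement: minimize the one-step lookahead cost
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```

At convergence K is the optimal LQR gain and P satisfies the discrete algebraic Riccati equation; in the paper's setting an analogous value/policy update is what drives the hippocampus model toward the desired optimal behaviour before experience is inferred to the real system.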


Similar Articles

1
Hippocampus experience inference for safety critical control of unknown multi-agent linear systems.
ISA Trans. 2023 Jun;137:646-655. doi: 10.1016/j.isatra.2022.12.011. Epub 2022 Dec 16.
2
A complementary learning approach for expertise transference of human-optimized controllers.
Neural Netw. 2022 Jan;145:33-41. doi: 10.1016/j.neunet.2021.10.009. Epub 2021 Oct 21.
3
Online adaptive policy learning algorithm for H∞ state feedback control of unknown affine nonlinear discrete-time systems.
IEEE Trans Cybern. 2014 Dec;44(12):2706-18. doi: 10.1109/TCYB.2014.2313915. Epub 2014 Jul 28.
4
IHG-MA: Inductive heterogeneous graph multi-agent reinforcement learning for multi-intersection traffic signal control.
Neural Netw. 2021 Jul;139:265-277. doi: 10.1016/j.neunet.2021.03.015. Epub 2021 Mar 22.
5
Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks.
IEEE Trans Neural Netw Learn Syst. 2013 Oct;24(10):1513-25. doi: 10.1109/TNNLS.2013.2276571.
6
Discrete-Time Nonzero-Sum Games for Multiplayer Using Policy-Iteration-Based Adaptive Dynamic Programming Algorithms.
IEEE Trans Cybern. 2017 Oct;47(10):3331-3340. doi: 10.1109/TCYB.2016.2611613. Epub 2016 Oct 3.
7
Efficient model learning methods for actor-critic control.
IEEE Trans Syst Man Cybern B Cybern. 2012 Jun;42(3):591-602. doi: 10.1109/TSMCB.2011.2170565. Epub 2011 Dec 7.
8
Multi-objective reinforcement learning approach for improving safety at intersections with adaptive traffic signal control.
Accid Anal Prev. 2020 Sep;144:105655. doi: 10.1016/j.aap.2020.105655. Epub 2020 Jul 14.
9
Adaptive dynamic programming for finite-horizon optimal control of discrete-time nonlinear systems with ε-error bound.
IEEE Trans Neural Netw. 2011 Jan;22(1):24-36. doi: 10.1109/TNN.2010.2076370. Epub 2010 Sep 27.
10
Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using measured output data.
IEEE Trans Syst Man Cybern B Cybern. 2011 Feb;41(1):14-25. doi: 10.1109/TSMCB.2010.2043839. Epub 2010 Mar 29.