Moment-Based Reinforcement Learning for Ensemble Control.

Authors

Yu Yao-Chi, Narayanan Vignesh, Li Jr-Shin

Publication information

IEEE Trans Neural Netw Learn Syst. 2024 Sep;35(9):12653-12664. doi: 10.1109/TNNLS.2023.3264151. Epub 2024 Sep 4.

DOI: 10.1109/TNNLS.2023.3264151
PMID: 37043324
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10676148/
Abstract

Problems involving controlling the collective behavior of a population of structurally similar dynamical systems, the so-called ensemble control, arise in diverse emerging applications and pose a grand challenge in systems science and control engineering. Owing to the severely under-actuated nature and the difficulty of placing large-scale sensor networks, ensemble systems are limited to being actuated and monitored at the population level. Moreover, mathematical models describing the dynamics of ensemble systems are often elusive. Therefore, it is essential to design broadcast controls that excite the entire population in such a way that the heterogeneity in system dynamics is robustly compensated. In this article, we propose a reinforcement learning (RL)-based data-driven control framework incorporating population-level aggregated measurement data to learn a global control signal for steering a dynamic population in the desired manner. In particular, we introduce the notion of ensemble moments induced by aggregated measurements and derive the associated moment system to the original ensemble system. Then, using the moment system, we learn an approximation of optimal value functions and the associated policies in terms of ensemble moments through RL. We illustrate the feasibility and scalability of the proposed moment-based approach via numerical experiments using a population of linear, bilinear, and nonlinear dynamic ensemble systems. We report that the proposed method achieves the desired control objectives of various ensemble control tasks and obtains significantly better averaged-reward when compared with three existing methods.
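The idea of replacing an ensemble by its moments can be illustrated with a small sketch. The abstract does not define the moments or give the system equations, so everything below is an assumption made for illustration: a scalar linear ensemble dx_i/dt = beta_i * x_i + u driven by one broadcast input u, and monomial moments m_k = mean(beta_i^k * x_i). The function names are hypothetical and are not taken from the article. Under these assumptions the moments satisfy dm_k/dt = m_{k+1} + u * mean(beta_i^k), which is the kind of "moment system" a learner could operate on instead of the individual states.

```python
# Illustrative sketch only: the ensemble model and the monomial moment
# definition are assumptions, not the article's formulation.

def ensemble_moments(betas, states, order):
    """Return [m_0, ..., m_order] with m_k = mean(beta_i**k * x_i)."""
    n = len(betas)
    return [
        sum(b ** k * x for b, x in zip(betas, states)) / n
        for k in range(order + 1)
    ]


def ensemble_rhs(betas, states, u):
    """Right-hand side of the assumed ensemble: dx_i/dt = beta_i * x_i + u."""
    return [b * x + u for b, x in zip(betas, states)]


def moment_rhs(betas, states, u, order):
    """Right-hand side of the induced moment system:
    dm_k/dt = m_{k+1} + u * mean(beta_i**k), for k = 0..order."""
    n = len(betas)
    m = ensemble_moments(betas, states, order + 1)  # needs one extra moment
    c = [sum(b ** k for b in betas) / n for k in range(order + 1)]
    return [m[k + 1] + u * c[k] for k in range(order + 1)]


if __name__ == "__main__":
    betas = [-1.0, -0.5, 0.0, 0.5, 1.0]   # heterogeneous system parameters
    states = [1.0, 2.0, -1.0, 0.5, 3.0]   # current state of each system
    u = 0.5                               # one input broadcast to all systems

    # Moments are linear in the states, so differentiating the moments equals
    # taking moments of the state derivatives. Both routes should agree.
    via_ensemble = ensemble_moments(betas, ensemble_rhs(betas, states, u), 2)
    via_moments = moment_rhs(betas, states, u, 2)
    print(via_ensemble)
    print(via_moments)
```

Note that the equation for m_k involves m_{k+1}, so any finite set of moments is a truncation of an infinite chain. How the article handles that truncation, and which moment basis it uses, is not stated in the abstract.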
