
Reinforcement Learning with Side Information for the Uncertainties.

Affiliation

Department of A.I. Software Engineering, Seoul Media Institute of Technology, Seoul 07590, Republic of Korea.

Publication Information

Sensors (Basel). 2022 Dec 14;22(24):9811. doi: 10.3390/s22249811.

DOI: 10.3390/s22249811
PMID: 36560180
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9786629/
Abstract

Recently, there has been a growing interest in the consensus of a multi-agent system (MAS) with advances in artificial intelligence and distributed computing. Sliding mode control (SMC) is a well-known method that provides robust control in the presence of uncertainties. While our previous study introduced SMC to the reinforcement learning (RL) based on approximate dynamic programming in the context of optimal control, SMC is introduced to a conventional RL framework in this work. As a specific realization, the modified twin delayed deep deterministic policy gradient (DDPG) for consensus was exploited to develop sliding mode RL. Numerical experiments show that the sliding mode RL outperforms existing state-of-the-art RL methods and model-based methods in terms of the mean square error (MSE) performance.
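The MSE metric used in the abstract measures how far the agents' states are from agreement. As a toy illustration of that idea (not code from the paper), the sketch below runs the standard linear consensus protocol x_i' = -Σ_j a_ij (x_i - x_j) for four single-integrator agents on an undirected ring graph and tracks the mean square disagreement around the network average; the graph, gains, and step size are all assumptions chosen for the example.

```python
import numpy as np

def consensus_step(x, A, dt):
    """One Euler step of the linear consensus protocol x' = -L x,
    where L is the graph Laplacian built from adjacency matrix A."""
    L = np.diag(A.sum(axis=1)) - A
    return x - dt * (L @ x)

def consensus_mse(x):
    """Mean square error of agent states around their average --
    the kind of disagreement metric the paper reports."""
    return float(np.mean((x - x.mean()) ** 2))

# Undirected ring graph over four agents (illustrative choice).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

x = np.array([1.0, -2.0, 3.0, 0.5])   # arbitrary initial states
mse_initial = consensus_mse(x)
for _ in range(200):
    x = consensus_step(x, A, dt=0.05)
mse_final = consensus_mse(x)
print(mse_initial, mse_final)  # disagreement decays toward zero
```

Because the graph is undirected (symmetric A), the protocol preserves the state average, so the agents converge to the mean of the initial conditions; the paper's contribution is to keep this kind of convergence robust under uncertainties by combining an RL policy (a modified TD3) with a sliding-mode term.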


Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/561e/9786629/bd873d010851/sensors-22-09811-g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/561e/9786629/a67655b42dbe/sensors-22-09811-g002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/561e/9786629/976834986f6e/sensors-22-09811-g003.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/561e/9786629/b4bdb0f8f882/sensors-22-09811-g004.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/561e/9786629/7e75fbeac38d/sensors-22-09811-g005.jpg

Similar Articles

1. Reinforcement Learning with Side Information for the Uncertainties.
   Sensors (Basel). 2022 Dec 14;22(24):9811. doi: 10.3390/s22249811.
2. Accelerating reinforcement learning with case-based model-assisted experience augmentation for process control.
   Neural Netw. 2023 Jan;158:197-215. doi: 10.1016/j.neunet.2022.10.016. Epub 2022 Oct 29.
3. Human locomotion with reinforcement learning using bioinspired reward reshaping strategies.
   Med Biol Eng Comput. 2021 Jan;59(1):243-256. doi: 10.1007/s11517-020-02309-3. Epub 2021 Jan 8.
4. Predictive hierarchical reinforcement learning for path-efficient mapless navigation with moving target.
   Neural Netw. 2023 Aug;165:677-688. doi: 10.1016/j.neunet.2023.06.007. Epub 2023 Jun 10.
5. Reinforcement learning based temperature control of a fermentation bioreactor for ethanol production.
   Biotechnol Bioeng. 2024 Oct;121(10):3114-3127. doi: 10.1002/bit.28784. Epub 2024 Jun 27.
6. Optimization of news dissemination push mode by intelligent edge computing technology for deep learning.
   Sci Rep. 2024 Mar 20;14(1):6671. doi: 10.1038/s41598-024-53859-7.
7. Reinforcement learning for closed-loop regulation of cardiovascular system with vagus nerve stimulation: a computational study.
   J Neural Eng. 2024 Jun 3;21(3):036027. doi: 10.1088/1741-2552/ad48bb.
8. Application of reinforcement learning in cognitive radio networks: models and algorithms.
   ScientificWorldJournal. 2014;2014:209810. doi: 10.1155/2014/209810. Epub 2014 Jun 5.
9. Deep learning, reinforcement learning, and world models.
   Neural Netw. 2022 Aug;152:267-275. doi: 10.1016/j.neunet.2022.03.037. Epub 2022 Apr 19.
10. A reinforcement learning model to inform optimal decision paths for HIV elimination.
   Math Biosci Eng. 2021 Sep 6;18(6):7666-7684. doi: 10.3934/mbe.2021380.

References Cited in This Article

1. Predictor-Based Extended-State-Observer Design for Consensus of MASs With Delays and Disturbances.
   IEEE Trans Cybern. 2019 Apr;49(4):1259-1269. doi: 10.1109/TCYB.2018.2799798. Epub 2018 Feb 14.
2. Asynchronous Periodic Edge-Event Triggered Control for Double-Integrator Networks With Communication Time Delays.
   IEEE Trans Cybern. 2018 Feb;48(2):675-688. doi: 10.1109/TCYB.2017.2651026. Epub 2017 Jan 23.