Adaptive PI Controller Based on a Reinforcement Learning Algorithm for Speed Control of a DC Motor.

Authors

Alejandro-Sanjines Ulbio, Maisincho-Jivaja Anthony, Asanza Victor, Lorente-Leyva Leandro L, Peluffo-Ordóñez Diego H

Affiliations

Escuela Superior Politécnica del Litoral, Guayaquil 090903, Ecuador.

SDAS Research Group, Ben Guerir 43150, Morocco.

Publication information

Biomimetics (Basel). 2023 Sep 19;8(5):434. doi: 10.3390/biomimetics8050434.

DOI: 10.3390/biomimetics8050434
PMID: 37754185
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10527306/
Abstract

Automated industrial processes require a controller to obtain an output signal similar to the reference indicated by the user. There are controllers such as PIDs, which are efficient if the system does not change its initial conditions. However, if this is not the case, the controller must be retuned, affecting production times. In this work, an adaptive PID controller is developed for a DC motor speed plant using an artificial intelligence algorithm based on reinforcement learning. This algorithm uses an actor-critic agent, where its objective is to optimize the actor's policy and train a critic for rewards. This will generate the appropriate gains without the need to know the system. The Deep Deterministic Policy Gradient with Twin Delayed (DDPG TD3) was used, with a network composed of 300 neurons for the agent's learning. Finally, the performance of the obtained controller is compared with a classical control one using a cost function.
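The comparison described in the abstract's last sentence can be illustrated with a minimal sketch. It is not the authors' implementation: the first-order motor model, its parameter values, the fixed PI gains, and the integral-of-squared-error (ISE) cost are all assumptions chosen for illustration, and the TD3 agent that would supply the gains in the paper is left out entirely. The sketch only shows how a cost function scores a PI speed loop, and how a change in the plant degrades a controller whose gains stay fixed.

```python
from dataclasses import dataclass


@dataclass
class MotorParams:
    """Hypothetical first-order speed model: time_const * dw/dt = -w + gain * u."""
    gain: float = 2.0        # steady-state speed per unit input (assumed value)
    time_const: float = 0.5  # seconds (assumed value)


def simulate_pi(kp, ki, motor, reference=1.0, dt=0.001, t_end=5.0):
    """Simulate a closed-loop PI speed controller for a step reference.

    Returns (final_speed, ise_cost), where ise_cost is the integral of the
    squared tracking error, used here as an example cost function.
    """
    speed = 0.0
    integral = 0.0
    cost = 0.0
    for _ in range(round(t_end / dt)):
        error = reference - speed
        integral += error * dt
        u = kp * error + ki * integral  # PI control law
        # Forward-Euler step of the assumed motor model.
        speed += dt * (-speed + motor.gain * u) / motor.time_const
        cost += error * error * dt
    return speed, cost


# Fixed gains stand in for what an adaptive agent would output (assumption).
nominal = MotorParams()
changed = MotorParams(gain=0.5)  # plant after a hypothetical change

_, cost_nominal = simulate_pi(1.0, 2.0, nominal)
_, cost_changed = simulate_pi(1.0, 2.0, changed)
_, cost_detuned = simulate_pi(0.2, 0.2, nominal)

print(f"ISE, nominal plant, gains (1.0, 2.0): {cost_nominal:.3f}")
print(f"ISE, changed plant, gains (1.0, 2.0): {cost_changed:.3f}")
print(f"ISE, nominal plant, gains (0.2, 0.2): {cost_detuned:.3f}")
```

With the same gains, the cost rises once the plant gain drops, which is the situation in which a fixed controller would need retuning. An adaptive scheme would instead call something like `simulate_pi` with gains produced by the trained actor for the current operating condition.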



Similar articles

1
Adaptive PI Controller Based on a Reinforcement Learning Algorithm for Speed Control of a DC Motor.
Biomimetics (Basel). 2023 Sep 19;8(5):434. doi: 10.3390/biomimetics8050434.
2
Reinforcement learning based temperature control of a fermentation bioreactor for ethanol production.
Biotechnol Bioeng. 2024 Oct;121(10):3114-3127. doi: 10.1002/bit.28784. Epub 2024 Jun 27.
3
Stochastic Integrated Actor-Critic for Deep Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2024 May;35(5):6654-6666. doi: 10.1109/TNNLS.2022.3212273. Epub 2024 May 2.
4
Fuzzy-based collective pitch control for wind turbine via deep reinforcement learning.
ISA Trans. 2024 May;148:307-325. doi: 10.1016/j.isatra.2024.03.023. Epub 2024 Mar 26.
5
Plug-and-Play Model-Agnostic Counterfactual Policy Synthesis for Deep Reinforcement Learning-Based Recommendation.
IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):1044-1055. doi: 10.1109/TNNLS.2023.3329808. Epub 2025 Jan 7.
6
Meta attention for Off-Policy Actor-Critic.
Neural Netw. 2023 Jun;163:86-96. doi: 10.1016/j.neunet.2023.03.024. Epub 2023 Mar 28.
7
Adaptive control for circulating cooling water system using deep reinforcement learning.
PLoS One. 2024 Jul 24;19(7):e0307767. doi: 10.1371/journal.pone.0307767. eCollection 2024.
8
Improved Performance for PMSM Sensorless Control Based on Robust-Type Controller, ESO-Type Observer, Multiple Neural Networks, and RL-TD3 Agent.
Sensors (Basel). 2023 Jun 21;23(13):5799. doi: 10.3390/s23135799.
9
Deep Reinforcement Learning-Based Accurate Control of Planetary Soft Landing.
Sensors (Basel). 2021 Dec 6;21(23):8161. doi: 10.3390/s21238161.
10
An adaptive deep reinforcement learning approach for MIMO PID control of mobile robots.
ISA Trans. 2020 Jul;102:280-294. doi: 10.1016/j.isatra.2020.02.017. Epub 2020 Feb 19.

Cited by

1
Reinforcement learning algorithm for improving speed response of a five-phase permanent magnet synchronous motor based model predictive control.
PLoS One. 2025 Jan 3;20(1):e0316326. doi: 10.1371/journal.pone.0316326. eCollection 2025.
