
Symmetry reduction for deep reinforcement learning active control of chaotic spatiotemporal dynamics.

Author Information

Kevin Zeng, Michael D. Graham

Affiliations

Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA.

Publication Information

Phys Rev E. 2021 Jul;104(1-1):014210. doi: 10.1103/PhysRevE.104.014210.

Abstract

Deep reinforcement learning (RL) is a data-driven, model-free method capable of discovering complex control strategies for macroscopic objectives in high-dimensional systems, making its application to flow control promising. Many systems of flow control interest possess symmetries that, when neglected, can significantly inhibit the learning and performance of a naive deep RL approach. Using a test-bed consisting of the Kuramoto-Sivashinsky equation (KSE), equally spaced actuators, and a goal of minimizing dissipation and power cost, we demonstrate that by moving the deep RL problem to a symmetry-reduced space, we can alleviate limitations inherent in the naive application of deep RL. We demonstrate that symmetry-reduced deep RL yields improved data efficiency as well as improved control policy efficacy compared to policies found by naive deep RL. Interestingly, the policy learned by the symmetry-aware control agent drives the system toward an equilibrium state of the forced KSE that is connected by continuation to an equilibrium of the unforced KSE, despite having been given no explicit information regarding its existence. That is, to achieve its goal, the RL algorithm discovers and stabilizes an equilibrium state of the system. Finally, we demonstrate that the symmetry-reduced control policy is robust to observation and actuation signal noise, as well as to system parameters it has not observed before.
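To make the idea of moving the RL problem to a symmetry-reduced space concrete, below is a minimal sketch of one standard way to reduce the continuous translation symmetry of a periodic 1D field such as a KSE state: phase-aligning the first Fourier mode (a method-of-slices style template alignment) so that all translated copies of a state map to the same reduced observation. The function name `symmetry_reduce`, the NumPy implementation, and the choice of the first Fourier mode as the template are illustrative assumptions, not the paper's exact construction, which also involves the discrete symmetries associated with the equally spaced actuators.

```python
import numpy as np

def symmetry_reduce(u):
    """Map a periodic 1D field u(x) to a translation-invariant representative.

    States related by a spatial shift collapse to the same reduced state,
    obtained by rotating all Fourier phases so the first mode is real and
    non-negative. The applied shift (as a fraction of the domain) is also
    returned, so actions computed in the reduced frame can be shifted back.

    Illustrative sketch: assumes the first Fourier mode does not vanish;
    a practical implementation would fall back to another template mode.
    """
    uk = np.fft.rfft(u)
    phi = np.angle(uk[1])                    # phase of the first Fourier mode
    k = np.arange(uk.size)
    uk_aligned = uk * np.exp(-1j * k * phi)  # rotate all modes consistently
    u_aligned = np.fft.irfft(uk_aligned, n=u.size)
    return u_aligned, phi / (2 * np.pi)      # reduced state, shift fraction

if __name__ == "__main__":
    x = np.linspace(0, 2 * np.pi, 64, endpoint=False)
    u1 = np.sin(x) + 0.3 * np.cos(2 * x)
    u2 = np.roll(u1, 17)                     # same state, translated
    r1, _ = symmetry_reduce(u1)
    r2, _ = symmetry_reduce(u2)
    print(np.allclose(r1, r2))               # True: shifted copies coincide
```

In an RL training loop, a wrapper of this sort would sit on both sides of the agent: observations are reduced before being passed to the policy, and the actuation pattern the policy emits in the reduced frame is translated back by the recorded shift before being applied, so the agent never has to relearn the same behavior at every spatial phase.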
