Role of reinforcement learning for risk-based robust control of cyber-physical energy systems.

Author information

Du Yan, Chatterjee Samrat, Bhattacharya Arnab, Dutta Ashutosh, Halappanavar Mahantesh

Affiliations

Optimization and Control Group, Pacific Northwest National Laboratory, Richland, WA, USA.

Data Sciences and Machine Intelligence Group, Pacific Northwest National Laboratory, Richland, WA, USA.

Publication information

Risk Anal. 2023 Nov;43(11):2280-2297. doi: 10.1111/risa.14104. Epub 2023 Feb 6.

DOI:10.1111/risa.14104

PMID:36746175

Abstract

Critical infrastructures such as cyber-physical energy systems (CPS-E) integrate information flow and physical operations that are vulnerable to natural and targeted failures. Safe, secure, and reliable operation and control of CPS-E is critical to ensure societal well-being and economic prosperity. Automated control is key for real-time operations and may be mathematically cast as a sequential decision-making problem under uncertainty. The emergence of data-driven techniques for decision making under uncertainty, such as reinforcement learning (RL), has led to promising advances in addressing sequential decision-making problems for risk-based robust CPS-E control. However, existing research challenges include understanding the applicability of RL methods across diverse CPS-E applications, addressing the effect of risk preferences across multiple RL methods, and developing open-source domain-aware simulation environments for RL experimentation within a CPS-E context. This article systematically analyzes the applicability of four types of RL methods (model-free, model-based, hybrid model-free and model-based, and hierarchical) for risk-based robust CPS-E control. Problem features and solution stability for the RL methods are also discussed. We demonstrate and compare the performance of multiple RL methods under different risk specifications (risk-averse, risk-neutral, and risk-seeking) through the development and application of an open-source simulation environment. Motivating numerical simulation examples include representative single-zone and multizone building control use cases. Finally, six key insights for future research and broader adoption of RL methods are identified, with specific emphasis on problem features, algorithmic explainability, and solution stability.
