
Deep Reinforcement Learning Sensor Scheduling for Effective Monitoring of Dynamical Systems

Author Information

Alali Mohammad, Kazeminajafabadi Armita, Imani Mahdi

Affiliation

Northeastern University, 360 Huntington Ave, Boston, MA 02115, USA

Publication Information

Syst Sci Control Eng. 2024;12(1). doi: 10.1080/21642583.2024.2329260. Epub 2024 Apr 23.

Abstract

Advances in technology have enabled the use of sensors with varied modalities to monitor different parts of systems, each providing diverse levels of information about the underlying system. However, limitations in resources and computational power restrict the number of sensors/data streams that can be processed in real time in most complex systems. These challenges necessitate selecting/scheduling a subset of sensors to obtain measurements that best serve the monitoring objectives. This paper focuses on sensor scheduling for systems modeled by hidden Markov models. Despite the development of several sensor selection and scheduling methods, existing methods tend to be greedy and do not take into account the long-term impact of selected sensors on monitoring objectives. This paper formulates optimal sensor scheduling as a reinforcement learning problem defined over the posterior distribution of system states. Further, the paper derives a deep reinforcement learning approach for offline learning of the sensor scheduling policy, which can then be executed in real time as new information unfolds. The proposed method applies to any monitoring objective that can be expressed in terms of the posterior distribution of the states (e.g., state estimation, information gain, etc.). The performance of the proposed method in terms of accuracy and robustness is investigated for security monitoring of networked systems and health monitoring of gene regulatory networks.
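The formulation above treats the posterior (belief) over the hidden states as the state of a sequential decision problem: at each step a policy chooses which sensor to query, the resulting measurement updates the belief via the HMM filter, and the reward is any functional of the posterior (state-estimation accuracy, information gain, etc.). The sketch below is a minimal, hypothetical illustration of this belief-MDP view in Python, not the authors' implementation: the transition matrix, the per-sensor observation models, and the information-gain reward are invented for demonstration, and the random sensor choice stands in for the deep reinforcement learning policy that would be trained offline on such a simulator.

```python
# Illustrative sketch (not the authors' code): sensor scheduling for an HMM
# viewed as a belief MDP. All matrices and numbers below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

n_states = 4                                  # number of hidden HMM states
A = np.array([[0.90, 0.05, 0.03, 0.02],       # transition matrix P(x_{t+1} | x_t)
              [0.05, 0.85, 0.05, 0.05],
              [0.02, 0.08, 0.85, 0.05],
              [0.03, 0.02, 0.05, 0.90]])

# Each candidate sensor has its own (made-up) observation model P(y | x),
# rows indexed by hidden state, columns by the binary measurement y.
sensors = [
    np.array([[0.8, 0.2], [0.6, 0.4], [0.4, 0.6], [0.2, 0.8]]),
    np.array([[0.7, 0.3], [0.3, 0.7], [0.7, 0.3], [0.3, 0.7]]),
]


def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))


def belief_update(belief, sensor, y):
    """One HMM filtering step: predict through A, then condition on measurement y."""
    predicted = belief @ A
    posterior = predicted * sensors[sensor][:, y]
    return posterior / posterior.sum()


def step(belief, true_state, sensor):
    """Environment step: advance the hidden state, query one sensor, and
    return the new belief plus an information-gain reward (entropy reduction)."""
    next_state = rng.choice(n_states, p=A[true_state])
    y = rng.choice(2, p=sensors[sensor][next_state])
    new_belief = belief_update(belief, sensor, y)
    reward = entropy(belief @ A) - entropy(new_belief)
    return new_belief, next_state, reward


# Short rollout with a random scheduling policy; a deep RL policy trained
# offline on this simulator would replace the random choice below.
belief, state = np.ones(n_states) / n_states, 0
for t in range(5):
    action = int(rng.integers(len(sensors)))
    belief, state, r = step(belief, state, action)
    print(f"t={t} sensor={action} reward={r:+.3f} belief={np.round(belief, 2)}")
```

In such a setup, a deep Q-network or actor-critic model would take the belief vector as input and output a value per candidate sensor; because transitions can be simulated from the HMM, training can happen entirely offline, and the learned policy is then queried in real time as new measurements arrive, consistent with the offline-learning/online-execution scheme described in the abstract.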



