IEEE Trans Cybern. 2020 Jun;50(6):2861-2871. doi: 10.1109/TCYB.2019.2901897. Epub 2019 Mar 18.
A reinforcement learning-based method is proposed for optimal sensor placement in the spatial domain for modeling distributed parameter systems (DPSs). First, a low-dimensional subspace, derived by Karhunen-Loève decomposition, is identified to capture the dominant dynamic features of the DPS. Second, a spatial objective function is proposed for the sensor placement. This function is defined in the obtained low-dimensional subspace by exploiting the time-space separation property of distributed processes, and thereby aims to minimize the modeling error over the entire time and space domain. Third, the sensor placement configuration is mathematically formulated as a Markov decision process (MDP) with specified elements. Finally, the sensor locations are optimized by learning the optimal policies of the MDP according to the spatial objective function. Experimental results on a simulated catalytic rod and a real snap curing oven system demonstrate the feasibility and efficiency of the proposed method in solving combinatorial optimization problems such as optimal sensor placement.
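The four steps in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the objective function, the MDP elements, or the learning rule, so everything below is an assumption made for illustration. The Karhunen-Loève modes are taken from an SVD of a synthetic snapshot matrix, the objective is a relative reconstruction error over the full space-time field, the MDP state is the set of locations chosen so far, an action adds one location, and the policy is learned with plain tabular Q-learning. The function names (`kl_modes`, `placement_error`, `q_learning_placement`) and the test field are hypothetical.

```python
import numpy as np


def kl_modes(snapshots, n_modes):
    """Leading spatial modes of a (n_space, n_time) snapshot matrix via SVD."""
    u, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return u[:, :n_modes]


def placement_error(snapshots, modes, sensors):
    """Assumed objective: relative error of the field rebuilt from sensor rows only."""
    idx = sorted(sensors)
    # Least-squares estimate of the temporal coefficients from the sensor readings.
    coeffs, *_ = np.linalg.lstsq(modes[idx, :], snapshots[idx, :], rcond=None)
    recon = modes @ coeffs
    return np.linalg.norm(snapshots - recon) / np.linalg.norm(snapshots)


def q_learning_placement(snapshots, modes, n_sensors, episodes=500,
                         alpha=0.5, gamma=1.0, eps=0.2, seed=0):
    """Tabular Q-learning over sequential placements (an assumed MDP design)."""
    rng = np.random.default_rng(seed)
    n_space = modes.shape[0]
    q = {}  # (state, action) -> value; state is a frozenset of chosen locations
    best, best_err = None, np.inf
    for _ in range(episodes):
        state = frozenset()
        while len(state) < n_sensors:
            actions = [a for a in range(n_space) if a not in state]
            if rng.random() < eps:
                # Explore: pick a random free location.
                action = actions[rng.integers(len(actions))]
            else:
                # Exploit: pick the free location with the highest Q-value.
                action = max(actions, key=lambda a: q.get((state, a), 0.0))
            nxt = state | {action}
            if len(nxt) == n_sensors:
                # Terminal step: reward is the negative modeling error.
                err = placement_error(snapshots, modes, nxt)
                reward, future = -err, 0.0
                if err < best_err:
                    best, best_err = sorted(nxt), err
            else:
                reward = 0.0
                future = max(q.get((nxt, b), 0.0)
                             for b in range(n_space) if b not in nxt)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
            state = nxt
    return best, best_err


# Synthetic 1-D field with two separable space-time components plus noise.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 22)[1:-1]  # 20 interior grid points
t = np.linspace(0.0, 5.0, 200)
field = (np.outer(np.sin(np.pi * x), np.exp(-0.3 * t))
         + np.outer(np.sin(2 * np.pi * x), np.cos(2.0 * t)))
snapshots = field + 0.01 * rng.standard_normal(field.shape)

modes = kl_modes(snapshots, n_modes=2)
best, best_err = q_learning_placement(snapshots, modes, n_sensors=2)
print("sensor indices:", best, "relative error:", best_err)
```

The function returns the best placement visited during training rather than the greedy rollout of the final Q-table, which keeps the sketch simple. For a grid this small, exhaustive search over all pairs would be cheaper; the learning loop is shown only to make the MDP view of the combinatorial problem concrete.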