Yang Lixin, Tao Jie, Liu Yong-Hua, Xu Yong, Su Chun-Yi
Guangdong Provincial Key Laboratory of Intelligent Decision and Cooperative Control, School of Automation, Guangdong University of Technology, Guangzhou 510006, China.
Department of Mechanical and Industrial Engineering, Concordia University, Montreal, QC H3G 1M8, Canada.
Neural Netw. 2023 Apr;161:735-745. doi: 10.1016/j.neunet.2023.02.028. Epub 2023 Feb 18.
This paper studies energy scheduling for Denial-of-Service (DoS) attacks against remote state estimation over multi-hop networks. A smart sensor observes a dynamic system and transmits its local state estimate to a remote estimator. Because the sensor's communication range is limited, relay nodes are employed to deliver data packets from the sensor to the remote estimator, forming a multi-hop network. To maximize the estimation error covariance under an energy constraint, the DoS attacker must determine the energy level applied to each channel. This problem is formulated as a Markov decision process (MDP), and the existence of an optimal deterministic and stationary policy (DSP) for the attacker is proved. Moreover, the optimal policy is shown to have a simple threshold structure, which significantly reduces the computational complexity. Furthermore, a state-of-the-art deep reinforcement learning (DRL) algorithm, the dueling double deep Q-network (D3QN), is introduced to approximate the optimal policy. Finally, a simulation example illustrates the developed results and verifies the effectiveness of D3QN for optimal DoS attack energy scheduling.
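D3QN combines two refinements of the standard deep Q-network: a dueling architecture that splits the Q-function into a state-value stream and an advantage stream, and double Q-learning, which decouples greedy action selection (online network) from action evaluation (target network) to reduce overestimation bias. The following is a minimal sketch of that update in PyTorch, not the authors' implementation; the network sizes, the attacker-state encoding, and the helper names (`DuelingQNet`, `d3qn_loss`) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DuelingQNet(nn.Module):
    """Dueling architecture: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # state-value stream V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # advantage stream A(s, a)

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        h = self.feature(s)
        v, a = self.value(h), self.advantage(h)
        # Subtracting the mean advantage keeps the V/A decomposition identifiable.
        return v + a - a.mean(dim=1, keepdim=True)


def d3qn_loss(online: DuelingQNet, target: DuelingQNet,
              s, a, r, s2, done, gamma: float = 0.95) -> torch.Tensor:
    """One double-DQN temporal-difference loss on a batch of transitions."""
    # Double DQN: the online network selects the greedy next action ...
    next_a = online(s2).argmax(dim=1, keepdim=True)
    # ... and the target network evaluates it, curbing overestimation.
    next_q = target(s2).gather(1, next_a).squeeze(1)
    y = r + gamma * (1.0 - done) * next_q.detach()
    q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    return F.mse_loss(q, y)


# Smoke test with an illustrative 3-dimensional attacker state and 4 discrete
# energy levels per decision step (placeholder sizes, not the paper's setup).
online, target = DuelingQNet(3, 4), DuelingQNet(3, 4)
target.load_state_dict(online.state_dict())  # target net is a periodic copy
s, s2 = torch.randn(8, 3), torch.randn(8, 3)
a = torch.randint(0, 4, (8,))
r, done = torch.randn(8), torch.zeros(8)
loss = d3qn_loss(online, target, s, a, r, s2, done)
loss.backward()
```

In a full training loop the online network would be trained from a replay buffer of attacker transitions, with the target network's weights refreshed by periodic copying; those details are omitted here for brevity.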