Li Fushuai, Bao Jiawang, Wang Jun, Liu Da, Chen Wencheng, Lin Ruiquan
College of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350108, China.
Sensors (Basel). 2024 Aug 14;24(16):5273. doi: 10.3390/s24165273.
In Energy-Harvesting (EH) Cognitive Internet of Things (EH-CIoT) networks, the broadcast nature of wireless communication makes the network susceptible to jamming attacks, which severely degrade throughput. This paper therefore investigates an anti-jamming resource-allocation method that aims to maximize the Long-Term Throughput (LTT) of the EH-CIoT network. Specifically, the resource-allocation problem is modeled as a Markov Decision Process (MDP) without prior knowledge. On this basis, a two-dimensional reward function comprising a throughput reward and an energy reward is carefully designed. On the one hand, the Agent Base Station (ABS) directly evaluates the effectiveness of its actions through the throughput reward to maximize the LTT. On the other hand, considering the EH characteristics and battery-capacity limitations, the energy reward guides the ABS to allocate channels to Secondary Users (SUs) with insufficient power so that they can harvest more energy for transmission, which indirectly improves the LTT. For the case where the activity states of Primary Users (PUs), the channel information, and the jammer's strategies are not available in advance, a Linearly Weighted Deep Deterministic Policy Gradient (LWDDPG) algorithm is proposed to maximize the LTT. LWDDPG extends DDPG to accommodate the two-dimensional reward function, enabling the ABS to reasonably allocate transmission channels, continuous power levels, and work modes to the SUs, so that the SUs not only transmit on unjammed channels but also harvest more RF energy to replenish their batteries. Finally, simulation results demonstrate the validity and superiority of the proposed method over traditional methods under multiple jamming attacks.
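The linear weighting of the two reward dimensions described above can be illustrated with a minimal sketch. All function names, weight values, and reward formulas below are illustrative assumptions for exposition, not the paper's actual definitions; the sketch only shows how a throughput reward and an energy reward might be combined into the scalar signal a DDPG-style agent would optimize.

```python
# Hypothetical sketch of a linearly weighted two-dimensional reward,
# in the spirit of the LWDDPG design: one component rewards achieved
# throughput, the other rewards energy harvesting by low-battery SUs.
# Weights and formulas here are assumptions, not the paper's values.

def throughput_reward(rate_bps: float) -> float:
    """Throughput component: proportional to the SU's achieved rate."""
    return rate_bps

def energy_reward(battery: float, capacity: float, harvested: float) -> float:
    """Energy component: harvested RF energy, weighted more heavily
    the emptier the SU's battery is (battery-capacity limitation)."""
    deficit = 1.0 - battery / capacity  # 0 when full, 1 when empty
    return deficit * harvested

def weighted_reward(rate_bps: float, battery: float, capacity: float,
                    harvested: float, w_tp: float = 0.7,
                    w_eh: float = 0.3) -> float:
    """Linear scalarization of the two reward dimensions, as a single
    scalar an actor-critic agent can maximize."""
    return (w_tp * throughput_reward(rate_bps)
            + w_eh * energy_reward(battery, capacity, harvested))

# An SU in harvesting mode (no transmission) with a nearly empty
# battery receives a larger reward than one with a nearly full battery,
# steering channel allocation toward power-starved SUs.
r_low  = weighted_reward(rate_bps=0.0, battery=0.1, capacity=1.0, harvested=2.0)
r_full = weighted_reward(rate_bps=0.0, battery=0.9, capacity=1.0, harvested=2.0)
```

The design choice sketched here is that scalarizing the two objectives with fixed linear weights lets a standard deterministic-policy-gradient update be reused unchanged, while the energy term still shapes behavior toward recharging SUs whose batteries are low.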