Liu JainShing, Lin Chun-Hung Richard, Hu Yu-Chen, Donta Praveen Kumar
Department of Computer Science and Information Engineering, Providence University, Taichung 43301, Taiwan.
Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung 80424, Taiwan.
Sensors (Basel). 2022 Mar 17;22(6):2328. doi: 10.3390/s22062328.
Future wireless networks promise immense increases in data rate and energy efficiency while overcoming the difficulty of charging the wireless stations or devices in the Internet of Things (IoT) through simultaneous wireless information and power transfer (SWIPT). For such networks, jointly optimizing beamforming, power control, and energy harvesting to enhance the communication performance from the base stations (BSs), or access points (APs), to the mobile nodes (MNs) they serve is a real challenge. In this work, we formulate the joint optimization as a mixed-integer nonlinear programming (MINLP) problem, which can also be viewed as a complex multiple resource allocation (MRA) optimization problem subject to different allocation constraints. Using deep reinforcement learning to estimate the future rewards of actions from the information reported by the users the networks serve, we introduce single-layer MRA algorithms based on deep Q-learning (DQN) and deep deterministic policy gradient (DDPG), respectively, as the basis for downlink wireless transmission. Moreover, by combining the capability of the data-driven DQN technique with the strength of a noncooperative game-theoretic model, we propose a two-layer iterative approach to the NP-hard MRA problem that further improves communication performance in terms of data rate, energy harvesting, and power consumption. For the two-layer approach, we also introduce a pricing strategy by which BSs or APs determine their power costs on the basis of social utility maximization, thereby controlling the transmit power. Finally, in a simulated environment based on realistic wireless networks, our numerical results show that the proposed two-layer MRA algorithm achieves up to 2.3 times higher utility than its single-layer counterparts, i.e., the data-driven deep reinforcement learning-based algorithms extended to this problem, where the utilities are designed to reflect the trade-off among the performance metrics considered.
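The abstract describes the single-layer algorithms only at a high level. As a rough, hedged illustration of what a DQN-based MRA agent could look like, the sketch below (PyTorch) has the agent pick a discrete joint allocation action from reported user information and regress Q-values toward a bootstrapped target. The state layout, action encoding (a joint choice of beam, power level, and power-splitting ratio), reward, and all dimensions are hypothetical stand-ins, not the authors' exact design.

```python
# Minimal DQN sketch for discrete multiple resource allocation (MRA).
# STATE_DIM, N_ACTIONS, and the reward semantics are assumptions for
# illustration only.
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM = 16   # e.g., reported CSI and battery levels of the served MNs
N_ACTIONS = 24   # e.g., |beams| x |power levels| x |power-splitting ratios|
GAMMA = 0.95     # discount factor for estimated future rewards

class QNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, N_ACTIONS),
        )

    def forward(self, s):
        return self.net(s)

q_net, target_net = QNet(), QNet()
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)  # experience replay buffer of (s, a, r, s')

def select_action(state, eps=0.1):
    """Epsilon-greedy choice over the joint allocation action set."""
    if random.random() < eps:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(torch.tensor(state, dtype=torch.float32)).argmax())

def train_step(batch_size=32):
    """One DQN update: fit Q(s,a) to r + gamma * max_a' Q_target(s',a')."""
    if len(replay) < batch_size:
        return
    s, a, r, s2 = zip(*random.sample(replay, batch_size))
    s = torch.tensor(s, dtype=torch.float32)
    a = torch.tensor(a, dtype=torch.int64)
    r = torch.tensor(r, dtype=torch.float32)
    s2 = torch.tensor(s2, dtype=torch.float32)
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + GAMMA * target_net(s2).max(1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The DDPG variant would replace the discrete argmax with an actor network emitting continuous power and power-splitting values; the replay and bootstrapped-target structure is the same.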
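The two-layer approach is likewise only outlined in the abstract. The following sketch shows one plausible reading of its structure: an outer data-driven layer fixes the discrete allocation, while an inner noncooperative game lets each BS/AP best-respond in transmit power given a price intended to steer the equilibrium toward social utility maximization. The per-BS utility, the price update rule, and the convergence test here are illustrative assumptions, not the paper's exact model.

```python
# Hedged sketch of the two-layer iteration: inner best-response power game
# under a price, outer price adjustment toward a social objective. All
# constants and the utility form are hypothetical.
import numpy as np

N_BS = 4
P_MAX = 10.0  # per-BS transmit power budget (hypothetical)

def utility(p, i, gains, price):
    """Assumed per-BS utility: rate-like term minus priced power cost."""
    interference = gains[i] @ p - gains[i, i] * p[i]
    sinr = gains[i, i] * p[i] / (1.0 + interference)
    return np.log2(1.0 + sinr) - price * p[i]

def best_response(p, i, gains, price, grid=100):
    """BS i maximizes its own utility over a discretized power grid."""
    candidates = np.linspace(0.0, P_MAX, grid)
    scores = [utility(np.append(p[:i], np.append(c, p[i + 1:])), i, gains, price)
              for c in candidates]
    return candidates[int(np.argmax(scores))]

def inner_game(gains, price, iters=50, tol=1e-4):
    """Iterate best responses until the power vector stops moving."""
    p = np.full(N_BS, P_MAX / 2)
    for _ in range(iters):
        p_old = p.copy()
        for i in range(N_BS):
            p[i] = best_response(p, i, gains, price)
        if np.max(np.abs(p - p_old)) < tol:
            break
    return p

# Outer loop (structure only): in the paper's scheme the DQN layer would
# update the discrete allocation from the observed reward; here we only
# show a crude price adjustment around the inner equilibrium.
rng = np.random.default_rng(0)
gains = rng.uniform(0.05, 1.0, size=(N_BS, N_BS))
price = 0.1
for step in range(5):
    p_eq = inner_game(gains, price)
    social = sum(utility(p_eq, i, gains, price) for i in range(N_BS))
    print(f"step {step}: price={price:.3f}, social utility={social:.3f}")
    price *= 1.1 if p_eq.sum() > 0.8 * N_BS * P_MAX else 0.9
```

Raising the price when aggregate transmit power runs high is one simple way a pricing strategy can discourage selfish power escalation; the paper's actual rule is derived from social utility maximization rather than this heuristic.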