Qi Hang, Huang Hao, Hu Zhiqun, Wen Xiangming, Lu Zhaoming
School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China.
Beijing Key Laboratory of Network System Architecture and Convergence, Beijing University of Posts and Telecommunications, Beijing 100876, China.
Sensors (Basel). 2020 May 14;20(10):2789. doi: 10.3390/s20102789.
To meet the ever-increasing traffic demand of Wireless Local Area Networks (WLANs), channel bonding was introduced in the IEEE 802.11 standards. Although channel bonding effectively increases the transmission rate, a wider channel reduces the number of non-overlapping channels and is more susceptible to interference. Meanwhile, the traffic load differs from one access point (AP) to another and changes significantly with the time of day. Therefore, the primary channel and the channel bonding bandwidth should be selected carefully to meet traffic demand and guarantee the performance gain. In this paper, we propose an On-Demand Channel Bonding (O-DCB) algorithm based on Deep Reinforcement Learning (DRL) to reduce transmission delay in heterogeneous WLANs, where the APs have different channel bonding capabilities. In this problem, the state space is continuous and the action space is discrete. With single-agent DRL, however, the size of the action space grows exponentially with the number of APs, which severely slows learning. To accelerate learning, Multi-Agent Deep Deterministic Policy Gradient (MADDPG) is used to train O-DCB. Real traffic traces collected from a campus WLAN are used to train and test O-DCB. Simulation results show that the proposed algorithm converges well and achieves lower delay than the other algorithms compared.
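The exponential growth of the single-agent action space can be illustrated with a small counting sketch. The numbers here are assumptions for illustration only and are not taken from the paper: each AP is assumed to choose one of 8 primary channels and one of 4 bonding bandwidths, every (channel, bandwidth) pair is treated as valid, and all APs are given the same capabilities. The function names are hypothetical as well.

```python
from itertools import product

# Hypothetical per-AP choices (assumed for illustration, not from the paper).
PRIMARY_CHANNELS = range(8)          # 8 candidate primary channels
BANDWIDTHS_MHZ = (20, 40, 80, 160)   # 4 candidate bonding bandwidths


def per_ap_actions():
    """All (primary channel, bandwidth) pairs one AP can pick.

    Simplification: every pair is treated as valid, which ignores the
    802.11 channelization rules and differing AP capabilities.
    """
    return list(product(PRIMARY_CHANNELS, BANDWIDTHS_MHZ))


def joint_action_count(num_aps):
    """Discrete actions seen by one agent that controls all APs jointly."""
    return len(per_ap_actions()) ** num_aps


def per_agent_output_count(num_aps):
    """Total actions when each AP is its own agent with its own action set."""
    return len(per_ap_actions()) * num_aps


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, joint_action_count(n), per_agent_output_count(n))
```

Under these assumptions a single agent controlling three APs faces 32^3 = 32768 joint actions, whereas three separate agents each choose among 32 actions. That gap is the motivation the abstract gives for a multi-agent formulation.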