College of Data Science and Application, Inner Mongolia University of Technology, Inner Mongolia Autonomous Region Engineering and Technology Research Center of Big Data Based Software Service, Huhhot, 10080, Inner Mongolia, China.
Sci Rep. 2023 Jun 9;13(1):9396. doi: 10.1038/s41598-023-36606-2.
Intelligent traffic light control (ITLC) algorithms are very efficient for relieving traffic congestion. Recently, many decentralized multi-agent traffic light control algorithms have been proposed. This research mainly focuses on improving the reinforcement learning method and the coordination method. However, because all agents need to communicate while coordinating with each other, the communication details should be improved as well. To guarantee effective communication, two aspects should be considered. First, a traffic condition description method needs to be designed so that traffic conditions can be described simply and clearly. Second, synchronization should be considered. Because different intersections have different cycle lengths and a message is sent at the end of each traffic signal cycle, every agent receives messages from other agents at different times. It is therefore hard for an agent to decide which message is the latest and most valuable. Apart from communication details, the reinforcement learning algorithm used for traffic signal timing should also be improved. Traditional reinforcement learning based ITLC algorithms consider either the queue length of congested cars or the waiting time of these cars when calculating the reward value. However, both are very important, so a new reward calculation method is needed. To solve these problems, this paper proposes a new ITLC algorithm. To improve communication efficiency, the algorithm adopts a new message sending and processing method. In addition, to measure traffic congestion more reasonably, a new reward calculation method is proposed that takes both waiting time and queue length into consideration.
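The abstract does not give the reward formula or the message-handling rules, so the following is only an illustrative sketch of the two ideas it names: a reward that accounts for both queue length and waiting time, and a way for an agent to keep only the most recent message from each neighbour when messages arrive at different times. Everything concrete here is an assumption and not the paper's method: the weighted-sum form, the weights and normalisation caps, the `Message` fields, the shared clock, and the `max_age_s` staleness cutoff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical congestion report sent at the end of a neighbour's signal cycle."""
    sender: str
    sent_at: float     # seconds, on a clock assumed to be shared by all agents
    congestion: float  # assumed scalar summary of the sender's traffic condition


def reward(queue_length, wait_s, w_queue=0.5, w_wait=0.5,
           max_queue=40, max_wait_s=600.0):
    """Assumed reward: negative weighted sum of normalised queue length and waiting time."""
    q = min(queue_length / max_queue, 1.0)  # cap so one term cannot dominate
    w = min(wait_s / max_wait_s, 1.0)
    return -(w_queue * q + w_wait * w)


def latest_per_sender(messages, now, max_age_s=120.0):
    """Keep the newest message from each sender and drop messages older than max_age_s."""
    newest = {}
    for m in messages:
        if now - m.sent_at > max_age_s:
            continue  # too old to reflect current traffic (assumed rule)
        if m.sender not in newest or m.sent_at > newest[m.sender].sent_at:
            newest[m.sender] = m
    return newest
```

With this form, two intersections with the same queue length but different waiting times receive different rewards, which a queue-only or wait-only reward could not distinguish.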