Smart Transport Key Laboratory of Hunan Province, School of Traffic and Transportation Engineering, Central South University, Changsha, 410075, China.
Accid Anal Prev. 2022 Sep;174:106729. doi: 10.1016/j.aap.2022.106729. Epub 2022 Jun 11.
Car-following is a common driving behavior. In a vehicle-to-vehicle transportation system, the car-following model of an autonomous vehicle (AV) should account for the following vehicle as well as the leading one. This study proposes a reinforcement-learning-based safe velocity control method for AVs that considers the following vehicle. First, a mixed driving environment of AVs and human-driven vehicles is constructed, and the trajectories of the leading and following vehicles are extracted from the naturalistic highD driving dataset. Next, the soft actor-critic (SAC) algorithm is used as the velocity control algorithm: the agent is the AV, the action is its acceleration, and the state consists of the relative distance and relative speed between the AV and each of the leading and following vehicles. A reward function based on the state and the corresponding action is then designed to guide the AV toward accelerations that avoid collisions with both the leading and the following vehicle. Over the course of training, the AV gradually learns to avoid collisions with both vehicles. Tests of the trained model show that the SAC agent achieves complete collision avoidance, with zero collisions. Finally, the driving performance of the SAC agent is compared with that of human drivers in terms of safety and efficiency. The results of this study are expected to improve the safety of the car-following process.
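The state, action, and reward formulation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the reward terms, vehicle dimensions, or time step, so everything below is an assumption made for illustration: the `Vehicle` class, the helper names `build_state`, `reward`, and `step`, the 4.5 m vehicle length, the 2 m minimum gap, the penalty weights, and the simple kinematic update. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    position: float       # front-bumper position along the lane, in m
    speed: float          # longitudinal speed, in m/s
    length: float = 4.5   # ASSUMED vehicle length, in m


def build_state(leader: Vehicle, av: Vehicle, follower: Vehicle):
    """State as described in the abstract: relative distance and relative
    speed between the AV and the leading and following vehicles."""
    gap_front = leader.position - leader.length - av.position
    gap_rear = av.position - av.length - follower.position
    dv_front = leader.speed - av.speed
    dv_rear = av.speed - follower.speed
    return (gap_front, dv_front, gap_rear, dv_rear)


def reward(state, acceleration, collision_penalty=-100.0, min_gap=2.0):
    """HYPOTHETICAL reward: large penalty on collision with either
    neighbour, a graded penalty for gaps below min_gap, and a small
    penalty on the magnitude of the chosen acceleration."""
    gap_front, _, gap_rear, _ = state
    if gap_front <= 0.0 or gap_rear <= 0.0:
        return collision_penalty
    r = 0.0
    for gap in (gap_front, gap_rear):
        if gap < min_gap:
            r -= (min_gap - gap) / min_gap
    r -= 0.01 * acceleration ** 2
    return r


def step(av: Vehicle, acceleration: float, dt: float = 0.04) -> Vehicle:
    """Apply the action (an acceleration) for one ASSUMED time step dt,
    using a constant-acceleration update with speed clamped at zero."""
    new_speed = max(0.0, av.speed + acceleration * dt)
    new_position = av.position + 0.5 * (av.speed + new_speed) * dt
    return Vehicle(new_position, new_speed, av.length)


leader = Vehicle(position=50.0, speed=20.0)
av = Vehicle(position=20.0, speed=20.0)
follower = Vehicle(position=0.0, speed=20.0)
state = build_state(leader, av, follower)
print(state, reward(state, acceleration=0.0))
```

In a full setup, the leader and follower states would be replayed from recorded trajectories while an SAC agent chooses the AV's acceleration at each step; the sketch covers only the per-step bookkeeping.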