A Control Method with Reinforcement Learning for Urban Un-Signalized Intersection in Hybrid Traffic Environment.

Affiliations

School of Mechanical Engineering, Dalian University of Technology, Dalian 116024, China.

Department of Electrical and Computer Engineering, Western University, London, ON N6A 5B9, Canada.

Publication information

Sensors (Basel). 2022 Jan 20;22(3):779. doi: 10.3390/s22030779.

DOI:10.3390/s22030779

PMID:35161523

Original link: https://pmc.ncbi.nlm.nih.gov/articles/PMC8840198/

Abstract

Controlling autonomous vehicles (AVs) at urban unsignalized intersections is a challenging problem, especially in a hybrid traffic environment where self-driving vehicles coexist with human-driven vehicles. In this study, a coordinated control method based on proximal policy optimization (PPO) in a Vehicle-Road-Cloud Integration System (VRCIS) is proposed, in which the control problem is formulated as a reinforcement learning (RL) problem. In this system, vehicle-to-everything (V2X) communication is used to maintain communication between vehicles, and vehicle wireless technology can detect vehicles that use vehicle-to-infrastructure (V2I) wireless communication, thereby achieving a cost-efficient method. The connected and autonomous vehicle (CAV) defined in the VRCIS then learns, through RL, a policy for adapting to human-driven vehicles (HDVs) and crossing the intersection safely. We have developed a valid, scalable RL framework that can handle communication topologies that may change with dynamic traffic. The state, action, and reward of the RL problem are designed according to the urban unsignalized intersection problem. Finally, deployment within the RL framework is described, and several experiments with this framework were undertaken to verify the effectiveness of the proposed method.
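For readers unfamiliar with PPO, the clipped surrogate objective at its core can be sketched as follows. This is a minimal illustration of the standard PPO objective, not the authors' implementation: the abstract does not give the exact loss, network, or hyperparameters. The function name `ppo_clip_objective`, its parameter names, and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    All names and the default clip_eps are illustrative assumptions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio r_t = pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Taking the minimum of the two terms removes any incentive to move the policy far from the one that collected the data, which is why PPO is often chosen for control tasks where unstable updates are costly.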
