Department of Civil and Environmental Engineering, University of Michigan, Ann Arbor, MI, USA.
University of Michigan Transportation Research Institute, Ann Arbor, MI, USA.
Nature. 2023 Mar;615(7953):620-627. doi: 10.1038/s41586-023-05732-2. Epub 2023 Mar 22.
One critical bottleneck that impedes the development and deployment of autonomous vehicles is the prohibitively high economic and time costs required to validate their safety in a naturalistic driving environment, owing to the rarity of safety-critical events. Here we report the development of an intelligent testing environment, where artificial-intelligence-based background agents are trained to validate the safety performance of autonomous vehicles in an accelerated mode, without loss of unbiasedness. From naturalistic driving data, the background agents learn what adversarial manoeuvre to execute through a dense deep-reinforcement-learning (D2RL) approach, in which Markov decision processes are edited by removing non-safety-critical states and reconnecting critical ones so that the information in the training data is densified. D2RL enables neural networks to learn from densified information with safety-critical events and achieves tasks that are intractable for traditional deep-reinforcement-learning approaches. We demonstrate the effectiveness of our approach by testing a highly automated vehicle in both highway and urban test tracks with an augmented-reality environment, combining simulated background vehicles with physical road infrastructure and a real autonomous test vehicle. Our results show that the D2RL-trained agents can accelerate the evaluation process by multiple orders of magnitude (10^3 to 10^5 times faster). In addition, D2RL will enable accelerated testing and training with other safety-critical autonomous systems.
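The editing step described in the abstract (removing non-safety-critical states and reconnecting the critical ones) can be illustrated with a toy sketch. This is not the authors' implementation: the `Transition` record, the `densify` function, and the `is_critical` predicate are hypothetical names introduced here, and the sketch assumes that rewards collected at removed states are simply discarded, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Transition:
    """One step of an episode (hypothetical record, for illustration only)."""
    state: int
    action: int
    reward: float
    next_state: int


def densify(episode: List[Transition],
            is_critical: Callable[[int], bool]) -> List[Transition]:
    """Drop transitions from non-critical states and reconnect the rest.

    Assumption: rewards at removed states are discarded; the abstract does
    not say how they are handled.
    """
    # Keep only the transitions that start in a safety-critical state.
    kept = [t for t in episode if is_critical(t.state)]
    dense = []
    for i, t in enumerate(kept):
        if i + 1 < len(kept):
            # Reconnect to the next critical state, skipping what lies between.
            nxt = kept[i + 1].state
        else:
            # The last critical state leads to the episode's final state.
            nxt = episode[-1].next_state
        dense.append(Transition(t.state, t.action, t.reward, nxt))
    return dense


# Toy episode over states 0..5, where only states 2 and 4 are critical.
episode = [
    Transition(0, 0, 0.0, 1),
    Transition(1, 0, 0.0, 2),
    Transition(2, 1, 0.0, 3),
    Transition(3, 0, 0.0, 4),
    Transition(4, 1, -1.0, 5),
]
dense_episode = densify(episode, lambda s: s in {2, 4})
```

In this toy episode, five transitions shrink to two, so a learner would see only the steps where an adversarial manoeuvre matters. How criticality is actually determined is outside the scope of the abstract and is left here as a user-supplied predicate.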