Knowledge Distillation-Enhanced Behavior Transformer for Decision-Making of Autonomous Driving.

Authors

Zhao Rui, Fan Yuze, Li Yun, Zhang Dong, Gao Fei, Gao Zhenhai, Yang Zhengcai

Affiliations

College of Automotive Engineering, Jilin University, Changchun 130025, China.

Graduate School of Information and Science Technology, The University of Tokyo, Tokyo 113-8654, Japan.

Publication

Sensors (Basel). 2025 Jan 1;25(1):191. doi: 10.3390/s25010191.

DOI: 10.3390/s25010191
PMID: 39796987
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11723080/
Abstract

Autonomous driving has demonstrated impressive driving capabilities, with behavior decision-making playing a crucial role as a bridge between perception and control. Imitation Learning (IL) and Reinforcement Learning (RL) have introduced innovative approaches to behavior decision-making in autonomous driving, but challenges remain. On one hand, RL's policy networks often lack sufficient reasoning ability to make optimal decisions in highly complex and stochastic environments. On the other hand, the complexity of these environments leads to low sample efficiency in RL, making it difficult to efficiently learn driving policies. To address these challenges, we propose an innovative Knowledge Distillation-Enhanced Behavior Transformer (KD-BeT) framework. Building on the successful application of Transformers in large language models, we introduce the Behavior Transformer as the policy network in RL, using observation-action history as input for sequential decision-making, thereby leveraging the Transformer's contextual reasoning capabilities. Using a teacher-student paradigm, we first train a small-capacity teacher model quickly and accurately through IL, then apply knowledge distillation to accelerate RL's training efficiency and performance. Simulation results demonstrate that KD-BeT maintains fast convergence and high asymptotic performance during training. In the CARLA NoCrash benchmark tests, KD-BeT outperforms other state-of-the-art methods in terms of traffic efficiency and driving safety, offering a novel solution for addressing real-world autonomous driving tasks.
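
The abstract names two mechanisms without giving formulas: a policy that conditions on a window of observation-action history, and a distillation term that pulls the RL student toward an IL-trained teacher. The sketch below illustrates one common way to write such a term, assuming a discrete action set, a temperature-softened KL divergence, and a simple additive weighting. The names `distillation_loss`, `combined_loss`, `kd_weight`, and `temperature` are hypothetical and are not taken from the paper, whose actual loss may differ.

```python
import math
from collections import deque


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of action logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL from the teacher's softened action distribution to the student's."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # The T^2 factor is the usual rescaling applied to softened targets.
    return temperature ** 2 * kl_divergence(p_teacher, p_student)


def combined_loss(rl_loss, teacher_logits, student_logits, kd_weight=0.5):
    """RL objective plus a weighted distillation term (kd_weight is an assumption)."""
    return rl_loss + kd_weight * distillation_loss(teacher_logits, student_logits)


# Observation-action history: a fixed-length window the policy would attend over.
# The window length of 3 is arbitrary and chosen only for illustration.
history = deque(maxlen=3)
for step in range(5):
    history.append((f"obs_{step}", f"act_{step}"))
```

When the student's logits match the teacher's, the distillation term vanishes and `combined_loss` reduces to the RL loss alone. The `deque` keeps only the most recent pairs, which is one simple way to bound the context passed to a sequence model.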

Similar articles

1. Knowledge Distillation-Enhanced Behavior Transformer for Decision-Making of Autonomous Driving.
   Sensors (Basel). 2025 Jan 1;25(1):191. doi: 10.3390/s25010191.
2. Constraint-Guided Behavior Transformer for Centralized Coordination of Connected and Automated Vehicles at Intersections.
   Sensors (Basel). 2024 Aug 11;24(16):5187. doi: 10.3390/s24165187.
3. Bidirectional Planning for Autonomous Driving Framework with Large Language Model.
   Sensors (Basel). 2024 Oct 19;24(20):6723. doi: 10.3390/s24206723.
4. Towards Robust Decision-Making for Autonomous Highway Driving Based on Safe Reinforcement Learning.
   Sensors (Basel). 2024 Jun 26;24(13):4140. doi: 10.3390/s24134140.
5. Achieving efficient interpretability of reinforcement learning via policy distillation and selective input gradient regularization.
   Neural Netw. 2023 Apr;161:228-241. doi: 10.1016/j.neunet.2023.01.025. Epub 2023 Jan 24.
6. Safe Autonomous Driving with Latent Dynamics and State-Wise Constraints.
   Sensors (Basel). 2024 May 15;24(10):3139. doi: 10.3390/s24103139.
7. Deep reinforcement learning navigation via decision transformer in autonomous driving.
   Front Neurorobot. 2024 Mar 19;18:1338189. doi: 10.3389/fnbot.2024.1338189. eCollection 2024.
8. Human-Guided Reinforcement Learning With Sim-to-Real Transfer for Autonomous Navigation.
   IEEE Trans Pattern Anal Mach Intell. 2023 Dec;45(12):14745-14759. doi: 10.1109/TPAMI.2023.3314762. Epub 2023 Nov 3.
9. A Hybrid Online Off-Policy Reinforcement Learning Agent Framework Supported by Transformers.
   Int J Neural Syst. 2023 Dec;33(12):2350065. doi: 10.1142/S012906572350065X. Epub 2023 Oct 20.
10. On Transforming Reinforcement Learning With Transformers: The Development Trajectory.
    IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8580-8599. doi: 10.1109/TPAMI.2024.3408271. Epub 2024 Nov 6.
