Suppr 超能文献

Core technology patent: CN118964589B. Infringement will be prosecuted.
粤ICP备2023148730号-1 · Suppr © 2026



Generalized Single-Vehicle-Based Graph Reinforcement Learning for Decision-Making in Autonomous Driving.

Affiliations

School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China.

Department of Transport and Planning, Faculty of Civil Engineering and Geosciences, Delft University of Technology, Stevinweg 1, 2628 CN Delft, The Netherlands.

Publication

Sensors (Basel). 2022 Jun 29;22(13):4935. doi: 10.3390/s22134935.

DOI: 10.3390/s22134935
PMID: 35808428
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9269790/
Abstract

In the autonomous driving process, the decision-making system is mainly used to provide macro-control instructions based on the information captured by the sensing system. Learning-based algorithms have clear advantages in processing and understanding information from an increasingly complex driving environment. To incorporate the interactive information between agents in the environment into the decision-making process, this paper proposes a generalized single-vehicle-based graph neural network reinforcement learning algorithm (the SGRL algorithm). The SGRL algorithm introduces graph convolution into the traditional deep Q-network (DQN) algorithm, adopts a training method for a single agent, designs a more explicit incentive reward function, and substantially enlarges the action space. The SGRL algorithm is compared with the traditional DQN algorithm (NGRL) and a multi-agent training algorithm (MGRL) in a highway ramp scenario. Results show that the SGRL algorithm has outstanding advantages in network convergence, decision-making effect, and training efficiency.
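The abstract gives no implementation details, so the following is only a rough illustration of the core idea it names: a graph-convolution layer that aggregates neighboring vehicles' states into the ego vehicle's embedding, followed by a linear head producing Q-values over discrete driving actions. The feature layout, layer sizes, action set, and all names below are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def gcn_layer(A, X, W):
    # One symmetrically normalized graph convolution:
    # H = ReLU( D^{-1/2} (A + I) D^{-1/2} X W )
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W, 0.0)

def ego_q_values(A, X, W_gcn, W_q, ego=0):
    # Aggregate neighbor information over the interaction graph,
    # then map the ego vehicle's embedding to Q-values per action.
    H = gcn_layer(A, X, W_gcn)
    return H[ego] @ W_q

# Toy scene: 3 vehicles, 4 features each (e.g. x, y, speed, lane).
rng = np.random.default_rng(0)
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)   # ego (node 0) interacts with both neighbors
X = rng.normal(size=(3, 4))              # per-vehicle state features
W_gcn = rng.normal(size=(4, 8))          # random (untrained) weights
W_q = rng.normal(size=(8, 5))            # 5 hypothetical actions: keep, left, right, accel, brake

q = ego_q_values(A, X, W_gcn, W_q)
action = int(np.argmax(q))               # greedy action for the ego vehicle
```

In a DQN-style setup these weights would be trained from the reward signal; the sketch only shows how graph convolution lets interaction information from surrounding vehicles flow into the single (ego) agent's Q-values.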


[Article figures 1–24 (sensors-22-04935-g001 through g024) are hosted on the NCBI CDN and available via the full-text link; the duplicated image-URL listing is omitted here.]

Similar Articles

1. Generalized Single-Vehicle-Based Graph Reinforcement Learning for Decision-Making in Autonomous Driving.
Sensors (Basel). 2022 Jun 29;22(13):4935. doi: 10.3390/s22134935.
2. Multi-Agent Decision-Making Modes in Uncertain Interactive Traffic Scenarios via Graph Convolution-Based Deep Reinforcement Learning.
Sensors (Basel). 2022 Jun 17;22(12):4586. doi: 10.3390/s22124586.
3. Optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm.
PLoS One. 2021 Jun 10;16(6):e0252754. doi: 10.1371/journal.pone.0252754. eCollection 2021.
4. Deep Reinforcement Learning on Autonomous Driving Policy With Auxiliary Critic Network.
IEEE Trans Neural Netw Learn Syst. 2023 Jul;34(7):3680-3690. doi: 10.1109/TNNLS.2021.3116063. Epub 2023 Jul 6.
5. PORF-DDPG: Learning Personalized Autonomous Driving Behavior with Progressively Optimized Reward Function.
Sensors (Basel). 2020 Oct 1;20(19):5626. doi: 10.3390/s20195626.
6. Deep reinforcement learning for automated radiation adaptation in lung cancer.
Med Phys. 2017 Dec;44(12):6690-6705. doi: 10.1002/mp.12625. Epub 2017 Nov 14.
7. Graph Reinforcement Learning-Based Decision-Making Technology for Connected and Autonomous Vehicles: Framework, Review, and Future Trends.
Sensors (Basel). 2023 Oct 3;23(19):8229. doi: 10.3390/s23198229.
8. Deep Reinforcement Learning With Modulated Hebbian Plus Q-Network Architecture.
IEEE Trans Neural Netw Learn Syst. 2022 May;33(5):2045-2056. doi: 10.1109/TNNLS.2021.3110281. Epub 2022 May 2.
9. Lane Following Method Based on Improved DDPG Algorithm.
Sensors (Basel). 2021 Jul 15;21(14):4827. doi: 10.3390/s21144827.
10. Interactive Lane Keeping System for Autonomous Vehicles Using LSTM-RNN Considering Driving Environments.
Sensors (Basel). 2022 Dec 15;22(24):9889. doi: 10.3390/s22249889.

Cited By

1. Learning-Based Hierarchical Decision-Making Framework for Automatic Driving in Incompletely Connected Traffic Scenarios.
Sensors (Basel). 2024 Apr 18;24(8):2592. doi: 10.3390/s24082592.
2. Graph Reinforcement Learning-Based Decision-Making Technology for Connected and Autonomous Vehicles: Framework, Review, and Future Trends.
Sensors (Basel). 2023 Oct 3;23(19):8229. doi: 10.3390/s23198229.
3. Advanced Sensing and Safety Control for Connected and Automated Vehicles.
Sensors (Basel). 2023 Jan 16;23(2):1037. doi: 10.3390/s23021037.

References

1. Review on Vehicle Detection Technology for Unmanned Ground Vehicles.
Sensors (Basel). 2021 Feb 14;21(4):1354. doi: 10.3390/s21041354.
4. A Decision-Making Strategy for Car Following Based on Naturalist Driving Data via Deep Reinforcement Learning.
Sensors (Basel). 2022 Oct 21;22(20):8055. doi: 10.3390/s22208055.