Adaptive Decision-Making for Automated Vehicles Under Roundabout Scenarios Using Optimization Embedded Reinforcement Learning.

Publication information

IEEE Trans Neural Netw Learn Syst. 2021 Dec;32(12):5526-5538. doi: 10.1109/TNNLS.2020.3042981. Epub 2021 Nov 30.

DOI:10.1109/TNNLS.2020.3042981
PMID:33378264
Abstract

The roundabout is a typical changeable, interactive scenario in which automated vehicles should make adaptive and safe decisions. In this article, an optimization embedded reinforcement learning (OERL) method is proposed to achieve adaptive decision-making at roundabouts. The improvement lies in the modified actor of the Actor-Critic framework, which embeds a model-based optimization method in reinforcement learning to explore continuous behaviors in the action space directly. The proposed method can therefore determine the macroscale behavior (change lane or not) and the medium-scale behaviors (desired acceleration and action time) simultaneously with high sample efficiency. When scenarios change, the medium-scale behaviors can be adjusted promptly by the embedded direct search method, improving the adaptability of decision-making. More notably, the modified actor matches human drivers' behaviors: the macroscale behavior captures the jump of the human mind, and the medium-scale behaviors are preferentially adjusted through driving skills. To enable the agent to adapt to different types of roundabouts, a task representation is designed to restructure the policy network. In experiments, the algorithm efficiency and the learned driving strategy are compared with those of a decision-making method that combines the macroscale behavior with constant medium-scale behaviors (fixed desired acceleration and action time). To investigate adaptability, an untrained type of roundabout and two more dangerous situations are simulated to verify that the proposed method changes its decisions as the scenarios change. The results show that the proposed method has high algorithm efficiency and better system performance.
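The two-level decision described in the abstract (a discrete macroscale choice plus continuous medium-scale behaviors tuned by a direct search) can be illustrated with a toy sketch. This is not the authors' implementation: the learned policy and critic are replaced by a hand-written score `toy_value`, the direct search is a plain compass search, both macroscale behaviors are simply enumerated, and every name, bound, and constant below (including the `gap` input) is a hypothetical choice made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def compass_search(
    objective: Callable[[List[float]], float],
    start: Sequence[float],
    bounds: Bounds,
    step: float = 1.0,
    min_step: float = 1e-3,
) -> Tuple[List[float], float]:
    """Derivative-free compass search that maximizes `objective` inside `bounds`."""
    best = list(start)
    best_val = objective(best)
    while step > min_step:
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            for sign in (1.0, -1.0):
                cand = list(best)
                # Move one coordinate by +/- step and clamp it to its bounds.
                cand[i] = min(hi, max(lo, cand[i] + sign * step))
                val = objective(cand)
                if val > best_val:
                    best, best_val, improved = cand, val, True
        if not improved:
            step *= 0.5  # No better neighbor found: refine the search step.
    return best, best_val


def toy_value(macro: str, accel: float, duration: float, gap: float) -> float:
    """Hypothetical stand-in for a learned value estimate (assumption, not from the paper)."""
    if macro == "change_lane":
        # Assumed rule: changing lanes only pays off when the adjacent gap is large.
        bonus = 0.5 if gap >= 20.0 else -5.0
        return bonus - (accel - 0.5) ** 2 - (duration - 3.0) ** 2
    return -((accel - 1.0) ** 2) - (duration - 2.0) ** 2


def decide(gap: float) -> Tuple[str, Tuple[float, float], float]:
    """Pick a macroscale behavior and tune (acceleration, action time) for it."""
    bounds = [(-3.0, 3.0), (0.5, 5.0)]  # Assumed ranges: m/s^2 and seconds.
    results = []
    for macro in ("keep_lane", "change_lane"):
        params, value = compass_search(
            lambda p, m=macro: toy_value(m, p[0], p[1], gap),
            start=[0.0, 1.0],
            bounds=bounds,
        )
        results.append((value, macro, (params[0], params[1])))
    value, macro, params = max(results)
    return macro, params, value
```

Calling `decide(gap=30.0)` and `decide(gap=5.0)` shows how the same search re-tunes the continuous behaviors when the scenario input changes. In a learned system the macroscale choice would come from the policy network rather than from enumeration.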

Similar articles

1. Adaptive Decision-Making for Automated Vehicles Under Roundabout Scenarios Using Optimization Embedded Reinforcement Learning.
   IEEE Trans Neural Netw Learn Syst. 2021 Dec;32(12):5526-5538. doi: 10.1109/TNNLS.2020.3042981. Epub 2021 Nov 30.
2. Reinforcement Learning-Based Autonomous Driving at Intersections in CARLA Simulator.
   Sensors (Basel). 2022 Nov 1;22(21):8373. doi: 10.3390/s22218373.
3. Deep Reinforcement Learning on Autonomous Driving Policy With Auxiliary Critic Network.
   IEEE Trans Neural Netw Learn Syst. 2023 Jul;34(7):3680-3690. doi: 10.1109/TNNLS.2021.3116063. Epub 2023 Jul 6.
4. Intelligent control of self-driving vehicles based on adaptive sampling supervised actor-critic and human driving experience.
   Math Biosci Eng. 2024 May 24;21(5):6077-6096. doi: 10.3934/mbe.2024267.
5. An integrated architecture for intelligence evaluation of automated vehicles.
   Accid Anal Prev. 2020 Sep;145:105681. doi: 10.1016/j.aap.2020.105681. Epub 2020 Jul 24.
6. Risk Assessment of Roundabout Scenarios in Virtual Testing Based on an Improved Driving Safety Field.
   Sensors (Basel). 2024 Aug 27;24(17):5539. doi: 10.3390/s24175539.
7. Efficient Deep Reinforcement Learning With Imitative Expert Priors for Autonomous Driving.
   IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):7391-7403. doi: 10.1109/TNNLS.2022.3142822. Epub 2023 Oct 5.
8. End-to-End Automated Lane-Change Maneuvering Considering Driving Style Using a Deep Deterministic Policy Gradient Algorithm.
   Sensors (Basel). 2020 Sep 22;20(18):5443. doi: 10.3390/s20185443.
9. A Multi-Task Fusion Strategy-Based Decision-Making and Planning Method for Autonomous Driving Vehicles.
   Sensors (Basel). 2023 Aug 8;23(16):7021. doi: 10.3390/s23167021.
10. Bi-Level Coordinated Merging of Connected and Automated Vehicles at Roundabouts.
    Sensors (Basel). 2021 Sep 30;21(19):6533. doi: 10.3390/s21196533.

Cited by

1. How Do Autonomous Vehicles Decide?
   Sensors (Basel). 2022 Dec 28;23(1):317. doi: 10.3390/s23010317.
2. Multivariable Optimisation for Waiting-Time Minimisation at Roundabout Intersections in a Cyber-Physical Framework.
   Sensors (Basel). 2021 Jun 9;21(12):3968. doi: 10.3390/s21123968.