

Intelligent Land-Vehicle Model Transfer Trajectory Planning Method Based on Deep Reinforcement Learning.

Affiliations

School of Information Science and Engineering, Central South University, Changsha 410083, China.

State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin 150001, China.

Publication

Sensors (Basel). 2018 Sep 1;18(9):2905. doi: 10.3390/s18092905.

DOI: 10.3390/s18092905
PMID: 30200499
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6164024/
Abstract

To address the problem of model error and tracking dependence in the process of intelligent vehicle motion planning, an intelligent vehicle model transfer trajectory planning method based on deep reinforcement learning is proposed, which is able to obtain an effective control action sequence directly. Firstly, an abstract model of the real environment is extracted. On this basis, a deep deterministic policy gradient (DDPG) and a vehicle dynamic model are adopted to jointly train a reinforcement learning model, and to decide the optimal intelligent driving maneuver. Secondly, the actual scene is transferred to an equivalent virtual abstract scene using a transfer model. Furthermore, the control action and trajectory sequences are calculated according to the trained deep reinforcement learning model. Thirdly, the optimal trajectory sequence is selected according to an evaluation function in the real environment. Finally, the results demonstrate that the proposed method can deal with the problem of intelligent vehicle trajectory planning for continuous input and continuous output. The model transfer method improves the model's generalization performance. Compared with traditional trajectory planning, the proposed method outputs continuous rotation-angle control sequences. Moreover, the lateral control errors are also reduced.
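The abstract's final planning step — rolling out candidate control sequences through a vehicle model and selecting the best trajectory with an evaluation function — can be illustrated with a minimal sketch. This is not the authors' implementation: it substitutes a standard kinematic bicycle model for their dynamic model, hand-written steering candidates for the DDPG policy's output, and a simple lateral-error cost for their evaluation function; all parameter values (wheelbase, speed, step size, target lane offset) are illustrative assumptions.

```python
import math

# Assumed, illustrative constants (not from the paper).
WHEELBASE = 2.7   # metres
SPEED = 10.0      # m/s, held constant
DT = 0.1          # integration step, seconds

def rollout(steer_seq, x=0.0, y=0.0, yaw=0.0):
    """Integrate a kinematic bicycle model over a steering-angle sequence,
    returning the (x, y) trajectory it traces."""
    traj = [(x, y)]
    for delta in steer_seq:
        x += SPEED * math.cos(yaw) * DT
        y += SPEED * math.sin(yaw) * DT
        yaw += SPEED / WHEELBASE * math.tan(delta) * DT
        traj.append((x, y))
    return traj

def evaluate(traj, target_y=3.5):
    """Toy evaluation function: final lateral error to the target lane
    centre (lower is better)."""
    return abs(traj[-1][1] - target_y)

def best_trajectory(candidates, target_y=3.5):
    """Roll out every candidate control sequence and keep the trajectory
    with the lowest cost -- the selection step the abstract describes."""
    rollouts = [rollout(seq) for seq in candidates]
    return min(rollouts, key=lambda t: evaluate(t, target_y))

# Hand-written stand-ins for sequences a trained policy would emit.
candidates = [
    [0.0] * 30,                    # keep straight
    [0.05] * 15 + [-0.05] * 15,    # gentle lane change
    [0.1] * 15 + [-0.1] * 15,      # sharper lane change
]
best = best_trajectory(candidates)
```

In the paper's pipeline the candidate sequences come from the DDPG policy acting in the transferred abstract scene; here the point is only the rollout-then-evaluate structure, which selects the gentle lane change as the closest to the 3.5 m lane offset.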


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a93/6164024/a815288483c5/sensors-18-02905-g014a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a93/6164024/0926794ff1f7/sensors-18-02905-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a93/6164024/1cb6f2b9d0d7/sensors-18-02905-g013a.jpg

Similar Articles

1. Intelligent Land-Vehicle Model Transfer Trajectory Planning Method Based on Deep Reinforcement Learning.
   Sensors (Basel). 2018 Sep 1;18(9):2905. doi: 10.3390/s18092905.
2. Model-Based Predictive Control and Reinforcement Learning for Planning Vehicle-Parking Trajectories for Vertical Parking Spaces.
   Sensors (Basel). 2023 Aug 11;23(16):7124. doi: 10.3390/s23167124.
3. Coordinated Decision Control of Lane-Change and Car-Following for Intelligent Vehicle Based on Time Series Prediction and Deep Reinforcement Learning.
   Sensors (Basel). 2024 Jan 9;24(2):403. doi: 10.3390/s24020403.
4. An Autonomous Path Planning Model for Unmanned Ships Based on Deep Reinforcement Learning.
   Sensors (Basel). 2020 Jan 11;20(2):426. doi: 10.3390/s20020426.
5. Intelligent Vehicle Decision-Making and Trajectory Planning Method Based on Deep Reinforcement Learning in the Frenet Space.
   Sensors (Basel). 2023 Dec 14;23(24):9819. doi: 10.3390/s23249819.
6. Lane changing trajectory planning and tracking control for intelligent vehicle on curved road.
   Springerplus. 2016 Jul 22;5(1):1150. doi: 10.1186/s40064-016-2806-0. eCollection 2016.
7. End-to-End Automated Lane-Change Maneuvering Considering Driving Style Using a Deep Deterministic Policy Gradient Algorithm.
   Sensors (Basel). 2020 Sep 22;20(18):5443. doi: 10.3390/s20185443.
8. Deep Reinforcement Learning Based Trajectory Planning Under Uncertain Constraints.
   Front Neurorobot. 2022 May 2;16:883562. doi: 10.3389/fnbot.2022.883562. eCollection 2022.
9. Optimization of news dissemination push mode by intelligent edge computing technology for deep learning.
   Sci Rep. 2024 Mar 20;14(1):6671. doi: 10.1038/s41598-024-53859-7.
10. The use of deep learning algorithm and digital media art in all-media intelligent electronic music system.
   PLoS One. 2020 Oct 19;15(10):e0240492. doi: 10.1371/journal.pone.0240492. eCollection 2020.

Cited By

1. and Classification Based on Visible Capsule Images Using a Modified MobileNetV3-Small Network with Transfer Learning.
   Entropy (Basel). 2023 Mar 3;25(3):447. doi: 10.3390/e25030447.
2. Metalearning-Based Fault-Tolerant Control for Skid Steering Vehicles under Actuator Fault Conditions.
   Sensors (Basel). 2022 Jan 22;22(3):845. doi: 10.3390/s22030845.
3. A Collision Relationship-Based Driving Behavior Decision-Making Method for an Intelligent Land Vehicle at a Disorderly Intersection via DRQN.
   Sensors (Basel). 2022 Jan 14;22(2):636. doi: 10.3390/s22020636.
4. An Autonomous Path Planning Model for Unmanned Ships Based on Deep Reinforcement Learning.
   Sensors (Basel). 2020 Jan 11;20(2):426. doi: 10.3390/s20020426.
5. Learn to Steer through Deep Reinforcement Learning.
   Sensors (Basel). 2018 Oct 27;18(11):3650. doi: 10.3390/s18113650.

References

1. Optimal Polygon Decomposition for UAV Survey Coverage Path Planning in Wind.
   Sensors (Basel). 2018 Jul 3;18(7):2132. doi: 10.3390/s18072132.
2. A Method on Dynamic Path Planning for Robotic Manipulator Autonomous Obstacle Avoidance Based on an Improved RRT Algorithm.
   Sensors (Basel). 2018 Feb 13;18(2):571. doi: 10.3390/s18020571.
3. A simple introduction to Markov Chain Monte-Carlo sampling.
   Psychon Bull Rev. 2018 Feb;25(1):143-154. doi: 10.3758/s13423-016-1015-8.
4. Mastering the game of Go with deep neural networks and tree search.
   Nature. 2016 Jan 28;529(7587):484-9. doi: 10.1038/nature16961.
5. Human-level control through deep reinforcement learning.
   Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.
6. Odometry and laser scanner fusion based on a discrete extended Kalman Filter for robotic platooning guidance.
   Sensors (Basel). 2011;11(9):8339-57. doi: 10.3390/s110908339. Epub 2011 Aug 29.
7. A model of hippocampally dependent navigation, using the temporal difference learning rule.
   Hippocampus. 2000;10(1):1-16. doi: 10.1002/(SICI)1098-1063(2000)10:1<1::AID-HIPO1>3.0.CO;2-1.