

Data Efficient Reinforcement Learning for Integrated Lateral Planning and Control in Automated Parking System.

Affiliation

School of Automotive Studies, Tongji University, Shanghai 201804, China.

Publication

Sensors (Basel). 2020 Dec 18;20(24):7297. doi: 10.3390/s20247297.

DOI: 10.3390/s20247297
PMID: 33353153
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7766926/
Abstract

Reinforcement learning (RL) is a promising direction in automated parking systems (APSs), as integrating planning and tracking control using RL can potentially maximize the overall performance. However, commonly used model-free RL requires many interactions to achieve acceptable performance, and model-based RL in APS cannot continuously learn. In this paper, a data-efficient RL method is constructed to learn from data by use of a model-based method. The proposed method uses a truncated Monte Carlo tree search to evaluate parking states and select moves. Two artificial neural networks are trained to provide the search probability of each tree branch and the final reward for each state using self-trained data. The data efficiency is enhanced by weighting exploration with parking trajectory returns, an adaptive exploration scheme, and experience augmentation with imaginary rollouts. Without human demonstrations, a novel training pipeline is also used to train the initial action guidance network and the state value network. Compared with path planning and path-following methods, the proposed integrated method can flexibly co-ordinate the longitudinal and lateral motion to park a smaller parking space in one maneuver. Its adaptability to changes in the vehicle model is verified by joint Carsim and MATLAB simulation, demonstrating that the algorithm converges within a few iterations. Finally, experiments using a real vehicle platform are used to further verify the effectiveness of the proposed method. Compared with obtaining rewards using simulation, the proposed method achieves a better final parking attitude and success rate.
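The search procedure the abstract describes — a truncated Monte Carlo tree search in which one network supplies the search probability of each branch and another supplies a state value used in place of a full rollout — can be sketched in miniature. Everything below is an illustrative assumption: the one-dimensional "parking" dynamics, the stub policy and value networks, and all hyperparameters stand in for the paper's vehicle model and trained networks.

```python
import math

ACTIONS = [-1.0, 0.0, 1.0]   # toy steering commands (assumed, not the paper's action set)
TARGET = 0.0                 # desired lateral position

def step(state, action):
    """Toy lateral dynamics: nudge the state toward or away from the target."""
    return state + 0.1 * action

def policy_net(state):
    """Stub policy network: softmax preference for actions that close the gap."""
    scores = [math.exp(-abs(step(state, a) - TARGET)) for a in ACTIONS]
    total = sum(scores)
    return [s / total for s in scores]

def value_net(state):
    """Stub value network: reward is higher the closer the final attitude."""
    return -abs(state - TARGET)

class Node:
    def __init__(self, state, prior):
        self.state, self.prior = state, prior
        self.children, self.visits, self.value_sum = {}, 0, 0.0

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    """PUCT rule: trade off mean value against prior-weighted exploration."""
    best, best_score = None, -float("inf")
    for a, child in node.children.items():
        u = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        if child.q() + u > best_score:
            best, best_score = a, child.q() + u
    return best

def mcts(root_state, simulations=50, depth_limit=5):
    root = Node(root_state, 1.0)
    for _ in range(simulations):
        node, path = root, [root]
        # Select and expand down to the truncation depth.
        for _ in range(depth_limit):
            if not node.children:
                for a, p in zip(ACTIONS, policy_net(node.state)):
                    node.children[a] = Node(step(node.state, a), p)
                break
            node = node.children[select_child(node)]
            path.append(node)
        # Truncated rollout: bootstrap with the value network instead of
        # simulating the maneuver to completion.
        v = value_net(node.state)
        for n in path:
            n.visits += 1
            n.value_sum += v
    # Select the move by visit count, as in AlphaZero-style search.
    return max(root.children, key=lambda a: root.children[a].visits)

best = mcts(root_state=0.5)
```

Started at a lateral offset of 0.5 with the target at 0, the search concentrates visits on the branch that steers toward the target, so `best` comes out as `-1.0`. The paper's contribution sits on top of this loop: weighting exploration by trajectory returns, adapting the exploration schedule, and augmenting experience with imaginary rollouts to cut the number of real interactions.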


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/c42958987086/sensors-20-07297-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/52ac34680b3b/sensors-20-07297-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/decc9fcfe9ca/sensors-20-07297-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/6fb6841ae59f/sensors-20-07297-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/99101a239a95/sensors-20-07297-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/1678b1f99b1a/sensors-20-07297-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/6fa31f980c31/sensors-20-07297-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/a4ddaa8f79ba/sensors-20-07297-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/6fa4d94f30f7/sensors-20-07297-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/8bcc89e744fc/sensors-20-07297-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/80bbdad77a4f/sensors-20-07297-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/1268b38eed4f/sensors-20-07297-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/aa2328a17fbe/sensors-20-07297-g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/32ab7fc4efca/sensors-20-07297-g014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/9b71648f6bd6/sensors-20-07297-g015.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b700/7766926/964fe846bdb1/sensors-20-07297-g016.jpg

Similar articles

1
Data Efficient Reinforcement Learning for Integrated Lateral Planning and Control in Automated Parking System.
Sensors (Basel). 2020 Dec 18;20(24):7297. doi: 10.3390/s20247297.
2
Reinforcement Learning-Based End-to-End Parking for Automatic Parking System.
Sensors (Basel). 2019 Sep 16;19(18):3996. doi: 10.3390/s19183996.
3
Model-Based Predictive Control and Reinforcement Learning for Planning Vehicle-Parking Trajectories for Vertical Parking Spaces.
Sensors (Basel). 2023 Aug 11;23(16):7124. doi: 10.3390/s23167124.
4
Hierarchical Trajectory Planning for Narrow-Space Automated Parking with Deep Reinforcement Learning: A Federated Learning Scheme.
Sensors (Basel). 2023 Apr 18;23(8):4087. doi: 10.3390/s23084087.
5
Autonomous maneuver decision-making method based on reinforcement learning and Monte Carlo tree search.
Front Neurorobot. 2022 Oct 25;16:996412. doi: 10.3389/fnbot.2022.996412. eCollection 2022.
6
Hybrid Residual Multiexpert Reinforcement Learning for Spatial Scheduling of High-Density Parking Lots.
IEEE Trans Cybern. 2024 May;54(5):2771-2783. doi: 10.1109/TCYB.2023.3312647. Epub 2024 Apr 16.
7
Design of a reinforcement learning-based intelligent car transfer planning system for parking lots.
Math Biosci Eng. 2024 Jan;21(1):1058-1081. doi: 10.3934/mbe.2024044. Epub 2022 Dec 22.
8
Robust Parking Path Planning with Error-Adaptive Sampling under Perception Uncertainty.
Sensors (Basel). 2020 Jun 23;20(12):3560. doi: 10.3390/s20123560.
9
Reinforcement Learning Tracking Control for Robotic Manipulator With Kernel-Based Dynamic Model.
IEEE Trans Neural Netw Learn Syst. 2020 Sep;31(9):3570-3578. doi: 10.1109/TNNLS.2019.2945019. Epub 2019 Nov 1.
10
EPSDNet: Efficient Campus Parking Space Detection via Convolutional Neural Networks and Vehicle Image Recognition for Intelligent Human-Computer Interactions.
Sensors (Basel). 2022 Dec 14;22(24):9835. doi: 10.3390/s22249835.

Cited by

1
Comparative Analysis of Adaptation Behaviors of Different Types of Drivers to Steer-by-Wire Systems.
Sensors (Basel). 2024 Aug 28;24(17):5562. doi: 10.3390/s24175562.
2
Hierarchical Trajectory Planning for Narrow-Space Automated Parking with Deep Reinforcement Learning: A Federated Learning Scheme.
Sensors (Basel). 2023 Apr 18;23(8):4087. doi: 10.3390/s23084087.

References cited in this article

1
Robust Parking Path Planning with Error-Adaptive Sampling under Perception Uncertainty.
Sensors (Basel). 2020 Jun 23;20(12):3560. doi: 10.3390/s20123560.
2
Reinforcement Learning-Based End-to-End Parking for Automatic Parking System.
Sensors (Basel). 2019 Sep 16;19(18):3996. doi: 10.3390/s19183996.
3
Mastering the game of Go without human knowledge.
Nature. 2017 Oct 18;550(7676):354-359. doi: 10.1038/nature24270.
4
Mastering the game of Go with deep neural networks and tree search.
Nature. 2016 Jan 28;529(7587):484-9. doi: 10.1038/nature16961.