Prioritized Experience-Based Reinforcement Learning With Human Guidance for Autonomous Driving.

Authors

Wu Jingda, Huang Zhiyu, Huang Wenhui, Lv Chen

Publication information

IEEE Trans Neural Netw Learn Syst. 2022 Jun 10;PP. doi: 10.1109/TNNLS.2022.3177685.

DOI:10.1109/TNNLS.2022.3177685
PMID:35687630
Abstract

Reinforcement learning (RL) requires careful problem definition and considerable computational effort to solve optimization and control problems, which could limit its prospects. Introducing human guidance into RL is a promising way to improve learning performance. In this article, a comprehensive human guidance-based RL framework is established. A novel prioritized experience replay mechanism that adapts to human guidance in the RL process is proposed to boost the efficiency and performance of the RL algorithm. To relieve the heavy workload on human participants, a behavior model is established based on an incremental online learning method to mimic human actions. We design two challenging autonomous driving tasks for evaluating the proposed algorithm. Experiments are conducted to assess the training and testing performance and the learning mechanism of the proposed algorithm. Comparative results against state-of-the-art methods suggest the advantages of our algorithm in terms of learning efficiency, performance, and robustness.
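
The abstract does not give the formula behind the human-guidance-aware prioritized replay, so the following is only a minimal sketch of the general idea, not the authors' mechanism. It assumes standard proportional prioritization (priority derived from the absolute TD error) and adds a hypothetical constant bonus, `human_bonus`, to transitions recorded under human guidance. The class name, parameters, and default values are all illustrative assumptions.

```python
import random


class GuidedPrioritizedReplay:
    """Proportional prioritized replay with an assumed bonus for human-guided data."""

    def __init__(self, capacity, alpha=0.6, human_bonus=1.0, eps=1e-3, seed=0):
        self.capacity = capacity
        self.alpha = alpha              # how strongly priorities skew sampling
        self.human_bonus = human_bonus  # hypothetical extra priority for human actions
        self.eps = eps                  # keeps every priority strictly positive
        self.rng = random.Random(seed)
        self.data = []                  # (transition, is_human) pairs
        self.priorities = []
        self.pos = 0                    # next slot to overwrite once full

    def _priority(self, td_error, is_human):
        base = abs(td_error) + self.eps
        if is_human:
            base += self.human_bonus
        return base ** self.alpha

    def add(self, transition, td_error, is_human):
        p = self._priority(td_error, is_human)
        if len(self.data) < self.capacity:
            self.data.append((transition, is_human))
            self.priorities.append(p)
        else:
            # Ring buffer: overwrite the oldest entry.
            self.data[self.pos] = (transition, is_human)
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def probabilities(self):
        total = sum(self.priorities)
        return [p / total for p in self.priorities]

    def sample(self, batch_size, beta=0.4):
        probs = self.probabilities()
        n = len(self.data)
        idxs = self.rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.data[i][0] for i in idxs]
        return idxs, batch, weights

    def update(self, idx, td_error):
        # Refresh a priority after the learner recomputes the TD error.
        _, is_human = self.data[idx]
        self.priorities[idx] = self._priority(td_error, is_human)


buf = GuidedPrioritizedReplay(capacity=4)
buf.add(("s0", "a0", 0.0, "s1"), td_error=0.5, is_human=False)
buf.add(("s1", "a_human", 0.0, "s2"), td_error=0.5, is_human=True)
probs = buf.probabilities()
idxs, batch, weights = buf.sample(batch_size=8)
```

With equal TD errors, the human-guided transition ends up with the larger sampling probability, which is the only behavior this sketch is meant to show. A mechanism that "adapts to human guidance" may well weight such data differently, for example with a learned or decaying term.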

Similar articles

1
Prioritized Experience-Based Reinforcement Learning With Human Guidance for Autonomous Driving.
IEEE Trans Neural Netw Learn Syst. 2022 Jun 10;PP. doi: 10.1109/TNNLS.2022.3177685.
2
Human-Guided Reinforcement Learning With Sim-to-Real Transfer for Autonomous Navigation.
IEEE Trans Pattern Anal Mach Intell. 2023 Dec;45(12):14745-14759. doi: 10.1109/TPAMI.2023.3314762. Epub 2023 Nov 3.
3
Towards Robust Decision-Making for Autonomous Highway Driving Based on Safe Reinforcement Learning.
Sensors (Basel). 2024 Jun 26;24(13):4140. doi: 10.3390/s24134140.
4
RL-DOVS: Reinforcement Learning for Autonomous Robot Navigation in Dynamic Environments.
Sensors (Basel). 2022 May 19;22(10):3847. doi: 10.3390/s22103847.
5
Efficient Deep Reinforcement Learning With Imitative Expert Priors for Autonomous Driving.
IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):7391-7403. doi: 10.1109/TNNLS.2022.3142822. Epub 2023 Oct 5.
6
Accelerating reinforcement learning with case-based model-assisted experience augmentation for process control.
Neural Netw. 2023 Jan;158:197-215. doi: 10.1016/j.neunet.2022.10.016. Epub 2022 Oct 29.
7
ACERAC: Efficient Reinforcement Learning in Fine Time Discretization.
IEEE Trans Neural Netw Learn Syst. 2024 Feb;35(2):2719-2731. doi: 10.1109/TNNLS.2022.3190973. Epub 2024 Feb 5.
8
A Control Method with Reinforcement Learning for Urban Un-Signalized Intersection in Hybrid Traffic Environment.
Sensors (Basel). 2022 Jan 20;22(3):779. doi: 10.3390/s22030779.
9
Safe Autonomous Driving with Latent Dynamics and State-Wise Constraints.
Sensors (Basel). 2024 May 15;24(10):3139. doi: 10.3390/s24103139.
10
Optimization of On-Demand Shared Autonomous Vehicle Deployments Utilizing Reinforcement Learning.
Sensors (Basel). 2022 Oct 29;22(21):8317. doi: 10.3390/s22218317.

Articles citing this article

1
End-to-End Autonomous Driving Decision Method Based on Improved TD3 Algorithm in Complex Scenarios.
Sensors (Basel). 2024 Jul 31;24(15):4962. doi: 10.3390/s24154962.
2
A Survey of Autonomous Vehicle Behaviors: Trajectory Planning Algorithms, Sensed Collision Risks, and User Expectations.
Sensors (Basel). 2024 Jul 24;24(15):4808. doi: 10.3390/s24154808.
3
An Integrated Framework for Multi-State Driver Monitoring Using Heterogeneous Loss and Attention-Based Feature Decoupling.
Sensors (Basel). 2022 Sep 29;22(19):7415. doi: 10.3390/s22197415.