Inverse RL Scene Dynamics Learning for Nonlinear Predictive Control in Autonomous Vehicles.

Authors

Grigorescu Sorin M, Zaha Mihai V

Publication information

IEEE Trans Neural Netw Learn Syst. 2025 Aug;36(8):13754-13768. doi: 10.1109/TNNLS.2025.3549816.

DOI:10.1109/TNNLS.2025.3549816
PMID:40146653
Abstract

This article introduces DL-NMPC-SD, a deep learning-based nonlinear model predictive controller with scene dynamics, for autonomous navigation. DL-NMPC-SD combines an a priori nominal vehicle model with a scene dynamics model learned from temporal range-sensing information. The scene dynamics model is responsible for estimating the desired vehicle trajectory and for adjusting the true system model used by the underlying model predictive controller. We propose encoding the scene dynamics model within the layers of a deep neural network, which acts as a nonlinear approximator for the high-order state space of the operating conditions. The model is learned from temporal sequences of range-sensing observations and system states, both integrated by an Augmented Memory component. We use inverse reinforcement learning (IRL) and the Bellman optimality principle to train our learning controller with a modified version of the deep Q-learning (DQL) algorithm, which lets us estimate the desired state trajectory as an optimal action-value function. We evaluated DL-NMPC-SD against the baseline dynamic window approach (DWA) and against two state-of-the-art methods, one End2End and one RL-based. Performance was measured in three experiments: 1) in our GridSim virtual environment; 2) on indoor and outdoor navigation tasks using our RovisLab autonomous mobile test unit (AMTU) platform; and 3) on a full-scale autonomous test vehicle driving on public roads.
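
The abstract refers to the Bellman optimality principle and to Q-learning without giving details. As a rough illustration only, the sketch below runs plain tabular Q-learning on a toy 1-D corridor. It is not the DL-NMPC-SD method: the paper uses a deep network and a modified DQL algorithm, neither of which is specified in the abstract. The corridor environment, the function names `q_learning_corridor` and `greedy_policy`, and all hyperparameters are hypothetical and chosen just to show the update rule Q(s, a) ← Q(s, a) + α [r + γ max Q(s', ·) − Q(s, a)].

```python
import random


def q_learning_corridor(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                        epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (illustration only).

    States are 0..n_states-1, actions are 0 (left) and 1 (right).
    Reaching the last state ends the episode with reward 1.0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            # Deterministic toy dynamics: the left wall blocks movement.
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # Bellman optimality target; no bootstrap from the terminal state.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [0 if row[0] > row[1] else 1 for row in q[:-1]]


if __name__ == "__main__":
    q_table = q_learning_corridor()
    print(greedy_policy(q_table))
```

In this toy setting the learned action values approach the discounted reward for moving toward the goal, so the greedy policy moves right in every state. A deep Q-learning variant would replace the table `q` with a neural network that approximates the action-value function.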

