
Deep reinforcement learning navigation via decision transformer in autonomous driving.

Authors

Ge Lun, Zhou Xiaoguang, Li Yongqiang, Wang Yongcong

Affiliations

School of Modern Post (School of Automation), Beijing University of Posts and Telecommunications, Beijing, China.

Mogo Auto Intelligence and Telematics Information Technology Co., Ltd, Beijing, China.

Publication information

Front Neurorobot. 2024 Mar 19;18:1338189. doi: 10.3389/fnbot.2024.1338189. eCollection 2024.

Abstract

In real-world scenarios, navigation decision-making for autonomous driving is a sequential process: judgments are made from partial observations of the environment while the underlying environment model remains unknown. A prevalent approach to such problems is reinforcement learning, in which the agent learns from a succession of rewards alongside fragmentary and noisy observations. This study introduces deep reinforcement learning navigation via decision transformer (DRLNDT), an algorithm that addresses the challenge of improving the decision-making of autonomous vehicles operating in partially observable urban environments. The DRLNDT framework is built around the Soft Actor-Critic (SAC) algorithm and uses Transformer neural networks to model the temporal dependencies across observations and actions, which helps mitigate judgment errors caused by sensor noise or occlusion within a given state. Latent vectors are extracted from high-quality images with a variational autoencoder (VAE), which reduces the dimensionality of the state space and improves training efficiency. The multimodal state space combines vector states, such as velocity and position, that the vehicle's intrinsic sensors can readily provide with the image-derived latent vectors, helping the agent assess its current trajectory. Experiments demonstrate that DRLNDT achieves a superior policy without prior knowledge of the environment, detailed maps, or routing assistance, surpassing the baseline technique and other policy methods that lack historical data.
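The multimodal state construction described above can be sketched in a few lines. This is an illustrative NumPy sketch, not the authors' implementation: the latent dimension (32), sensor values, and the encoder outputs `mu`/`logvar` are all assumed for demonstration. It shows the VAE reparameterization step (sampling a latent vector from the encoder's Gaussian) and the concatenation of that latent with the vehicle's intrinsic-sensor vector state.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, logvar, rng):
    """VAE reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

# Hypothetical encoder outputs for one camera image (latent dim 32 assumed).
mu = rng.standard_normal(32)
logvar = 0.1 * rng.standard_normal(32)

z = reparameterize(mu, logvar, rng)       # low-dimensional image latent

# Vector state from the vehicle's intrinsic sensors (values illustrative).
velocity = np.array([12.3])               # m/s
position = np.array([104.2, -3.7])        # local x, y in metres

# Multimodal state: image latent concatenated with the sensor vector state.
state = np.concatenate([z, velocity, position])
print(state.shape)                        # (35,)
```

Because `z` is 32-dimensional rather than a full image, the downstream SAC networks operate on a compact state, which is the training-efficiency gain the abstract attributes to the VAE.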


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0b7a/10985319/31b88f838b25/fnbot-18-1338189-g0001.jpg
