Policy-Gradient and Actor-Critic Based State Representation Learning for Safe Driving of Autonomous Vehicles.

Affiliation

Department of Electrical, Computer and Biomedical Engineering, Ryerson University, Toronto, ON M5B2K3, Canada.

Publication information

Sensors (Basel). 2020 Oct 22;20(21):5991. doi: 10.3390/s20215991.

Abstract

In this paper, we propose an environment perception framework for autonomous driving using state representation learning (SRL). Unlike existing Q-learning based methods for efficient environment perception and object detection, our proposed method takes the learning loss into account under both deterministic and stochastic policy gradients. By combining a variational autoencoder (VAE) with deep deterministic policy gradient (DDPG) and soft actor-critic (SAC), we focus on uninterrupted and reasonably safe autonomous driving that does not steer off the track over a considerable driving distance. The proposed technique allows an autonomous vehicle to learn through complex interactions with the environment, without being explicitly trained on driving datasets. To keep the scheme effective over a sustained period of time, we employ a reward-penalty system in which an unfavourable action receives a negative reward and a favourable action receives a positive reward. Simulations on the DonKey simulator show the effectiveness of the proposed method, assessed through the variations in policy loss, value loss, reward function, and cumulative reward for 'VAE+DDPG' and 'VAE+SAC' over the learning process.
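The reward-penalty idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual reward formula, so everything below is an assumption made for illustration: the `StepInfo` fields, the `max_cte` and `crash_penalty` values, and the centring-times-speed shaping are hypothetical and are not taken from the paper or from the simulator's API. The sketch only shows the general pattern of a negative reward for an unfavourable outcome (leaving the track), a positive reward for a favourable one (staying centred while moving), and a discounted cumulative reward over an episode.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Hypothetical per-step signals a driving simulator might expose."""
    cross_track_error: float  # lateral distance from the lane centre, metres
    speed: float              # forward speed, m/s
    off_track: bool           # True once the car has left the track


def step_reward(info, max_cte=2.0, crash_penalty=-10.0):
    """Return a penalty for leaving the track, else a positive shaped reward.

    max_cte and crash_penalty are illustrative values, not the paper's.
    """
    if info.off_track or abs(info.cross_track_error) > max_cte:
        return crash_penalty
    # Centring term is 1.0 at the lane centre and falls to 0.0 at max_cte.
    centring = 1.0 - abs(info.cross_track_error) / max_cte
    # Scale by non-negative speed so that standing still earns nothing.
    return centring * max(info.speed, 0.0)


def cumulative_reward(rewards, gamma=0.99):
    """Discounted sum of one episode's rewards, accumulated back to front."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total
```

In a setup like the one the abstract describes, a function of this kind would be evaluated once per simulator step, and the resulting scalar would be the reward signal passed to the DDPG or SAC learner alongside the VAE-encoded state.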

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8699/7660054/5af3f4fd905e/sensors-20-05991-g001.jpg
