Multiple Self-Supervised Auxiliary Tasks for Target-Driven Visual Navigation Using Deep Reinforcement Learning.

Authors

Zhang Wenzhi, He Li, Wang Hongwei, Yuan Liang, Xiao Wendong

Affiliations

School of Mechanical Engineering, Xinjiang University, Urumqi 830046, China.

School of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China.

Publication

Entropy (Basel). 2023 Jun 30;25(7):1007. doi: 10.3390/e25071007.

DOI:10.3390/e25071007
PMID:37509957
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10378290/
Abstract

Visual navigation based on deep reinforcement learning requires a large amount of interaction with the environment, and due to the reward sparsity, it requires a large amount of training time and computational resources. In this paper, we focus on sample efficiency and navigation performance and propose a framework for visual navigation based on multiple self-supervised auxiliary tasks. Specifically, we present an LSTM-based dynamics model and an attention-based image-reconstruction model as auxiliary tasks. These self-supervised auxiliary tasks enable agents to learn navigation strategies directly from the original high-dimensional images without relying on ResNet features by constructing latent representation learning. Experimental results show that without manually designed features and prior demonstrations, our method significantly improves the training efficiency and outperforms the baseline algorithms on the simulator and real-world image datasets.
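The idea of training a navigation policy alongside self-supervised auxiliary tasks can be illustrated with a combined objective. The sketch below is only an assumption about how such an objective might be assembled: the abstract does not give the loss formulation, so the mean-squared-error terms, the weighted sum, and every name here (`combined_loss`, `dynamics_weight`, `reconstruction_weight`) are hypothetical and not taken from the paper.

```python
from typing import Sequence, Tuple


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(
    rl_loss: float,
    predicted_next_latent: Sequence[float],
    next_latent: Sequence[float],
    reconstructed_image: Sequence[float],
    image: Sequence[float],
    dynamics_weight: float = 1.0,        # hypothetical weighting, not from the paper
    reconstruction_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> Tuple[float, float, float]:
    """Return (total, dynamics_loss, reconstruction_loss).

    Assumed form: total = rl_loss + w_dyn * dynamics + w_rec * reconstruction.
    The dynamics term compares a predicted next latent state with the encoded
    next observation; the reconstruction term compares a decoded image with
    the (flattened) input image.
    """
    dynamics_loss = mse(predicted_next_latent, next_latent)
    reconstruction_loss = mse(reconstructed_image, image)
    total = (
        rl_loss
        + dynamics_weight * dynamics_loss
        + reconstruction_weight * reconstruction_loss
    )
    return total, dynamics_loss, reconstruction_loss


if __name__ == "__main__":
    total, dyn, rec = combined_loss(
        rl_loss=0.5,
        predicted_next_latent=[1.0, 2.0],
        next_latent=[1.0, 4.0],
        reconstructed_image=[0.0, 0.0],
        image=[0.0, 2.0],
        dynamics_weight=0.5,
        reconstruction_weight=0.25,
    )
    print(total, dyn, rec)
```

In a setup like this, the auxiliary terms supply a learning signal at every step even when the navigation reward is sparse, which is one plausible reading of how the auxiliary tasks help sample efficiency.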

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2cdd/10378290/946223b69881/entropy-25-01007-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2cdd/10378290/20b6530e5f65/entropy-25-01007-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2cdd/10378290/bb4453cfaffb/entropy-25-01007-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2cdd/10378290/932f4df469a4/entropy-25-01007-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2cdd/10378290/61f18479c93a/entropy-25-01007-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2cdd/10378290/90bcd4a9fe44/entropy-25-01007-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2cdd/10378290/da838ebbe0d2/entropy-25-01007-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2cdd/10378290/4a8141185c55/entropy-25-01007-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2cdd/10378290/c7fc9d1dea03/entropy-25-01007-g009.jpg

Similar articles

1. Multiple Self-Supervised Auxiliary Tasks for Target-Driven Visual Navigation Using Deep Reinforcement Learning.
   Entropy (Basel). 2023 Jun 30;25(7):1007. doi: 10.3390/e25071007.
2. Deep imitation learning for 3D navigation tasks.
   Neural Comput Appl. 2018;29(7):389-404. doi: 10.1007/s00521-017-3241-z. Epub 2017 Dec 4.
3. Self-Supervised Learning for Label Sparsity in Computational Drug Repositioning.
   IEEE/ACM Trans Comput Biol Bioinform. 2023 Sep-Oct;20(5):3245-3256. doi: 10.1109/TCBB.2023.3254163. Epub 2023 Oct 9.
4. Visual Navigation With Multiple Goals Based on Deep Reinforcement Learning.
   IEEE Trans Neural Netw Learn Syst. 2021 Dec;32(12):5445-5455. doi: 10.1109/TNNLS.2021.3057424. Epub 2021 Nov 30.
5. Action-driven contrastive representation for reinforcement learning.
   PLoS One. 2022 Mar 18;17(3):e0265456. doi: 10.1371/journal.pone.0265456. eCollection 2022.
6. Investigating navigation strategies in the Morris Water Maze through deep reinforcement learning.
   Neural Netw. 2024 Apr;172:106050. doi: 10.1016/j.neunet.2023.12.004. Epub 2023 Dec 14.
7. Leveraging Predictions of Task-Related Latents for Interactive Visual Navigation.
   IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):704-717. doi: 10.1109/TNNLS.2023.3335416. Epub 2025 Jan 7.
8. Leveraging Expert Demonstration Features for Deep Reinforcement Learning in Floor Cleaning Robot Navigation.
   Sensors (Basel). 2022 Oct 12;22(20):7750. doi: 10.3390/s22207750.
9. C2RL: Convolutional-Contrastive Learning for Reinforcement Learning Based on Self-Pretraining for Strong Augmentation.
   Sensors (Basel). 2023 May 21;23(10):4946. doi: 10.3390/s23104946.
10. Semi-supervised mp-MRI data synthesis with StitchLayer and auxiliary distance maximization.
    Med Image Anal. 2020 Jan;59:101565. doi: 10.1016/j.media.2019.101565. Epub 2019 Oct 1.

Cited by

1. A Novel Obstacle Traversal Method for Multiple Robotic Fish Based on Cross-Modal Variational Autoencoders and Imitation Learning.
   Biomimetics (Basel). 2024 Apr 8;9(4):221. doi: 10.3390/biomimetics9040221.
