Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning.

Authors

Lin Bingqian, Long Yanxin, Zhu Yi, Zhu Fengda, Liang Xiaodan, Ye Qixiang, Lin Liang

Publication information

IEEE Trans Pattern Anal Mach Intell. 2023 Oct;45(10):12535-12549. doi: 10.1109/TPAMI.2023.3273594. Epub 2023 Sep 5.

DOI:10.1109/TPAMI.2023.3273594
PMID:37155380
Abstract

Vision-and-language navigation (VLN) asks an agent to follow a given language instruction to navigate through a real 3D environment. Despite significant advances, conventional VLN agents are typically trained in disturbance-free environments and may easily fail in real-world navigation scenarios, because they do not know how to deal with disturbances such as sudden obstacles or human interruptions, which are common and often cause unexpected route deviations. In this paper, we present a model-agnostic training paradigm, called Progressive Perturbation-aware Contrastive Learning (PROPER), that enhances the generalization of existing VLN agents to the real world by requiring them to learn deviation-robust navigation. Specifically, a simple yet effective path perturbation scheme is introduced to implement the route deviation, under which the agent must still navigate successfully following the original instruction. Since directly forcing the agent to learn perturbed trajectories may lead to insufficient and inefficient training, a progressively perturbed trajectory augmentation strategy is designed, in which the agent self-adaptively learns to navigate under perturbation as its navigation performance on each specific trajectory improves. To encourage the agent to capture the difference brought by perturbation and to adapt to both perturbation-free and perturbation-based environments, a perturbation-aware contrastive learning mechanism is further developed by contrasting perturbation-free trajectory encodings with their perturbation-based counterparts. Extensive experiments on the standard Room-to-Room (R2R) benchmark show that PROPER can benefit multiple state-of-the-art VLN baselines in perturbation-free scenarios. We further collect perturbed path data to construct an introspection subset of R2R, called Path-Perturbed R2R (PP-R2R). The results on PP-R2R show the unsatisfactory robustness of popular VLN agents and the capability of PROPER to improve navigation robustness under deviation.
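
Illustrative sketch

The abstract names two mechanisms without giving their details: a path perturbation that forces a route deviation, and a contrastive objective over trajectory encodings. The Python sketch below is one possible reading, not the authors' implementation. The dictionary-based navigation graph, the one-step detour rule, the function names `perturb_path` and `info_nce`, the cosine similarity, and the temperature value are all assumptions made for illustration. In particular, the abstract does not say which encodings act as positives or negatives, so `info_nce` leaves that choice to the caller.

```python
import math
import random


def perturb_path(graph, path, rng):
    """Insert a one-step detour into a path (hypothetical perturbation rule).

    graph: dict mapping each node to a list of neighbouring nodes; edges are
           assumed to be bidirectional so the agent can step back.
    path:  list of nodes from start to goal.
    The returned path leaves the route at one node, visits an off-route
    neighbour, returns, and then continues to the original goal.
    """
    candidates = []
    for i, node in enumerate(path[:-1]):
        off_route = [n for n in graph[node] if n not in path]
        if off_route:
            candidates.append((i, off_route))
    if not candidates:
        return list(path)  # no detour is possible on this graph
    i, off_route = rng.choice(candidates)
    detour = rng.choice(off_route)
    return path[: i + 1] + [detour, path[i]] + path[i + 1:]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is closer to the positive
    than to any negative. The pairing of encodings is the caller's choice."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    top = max(logits)  # subtract the maximum for numerical stability
    log_sum = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_sum - logits[0]


if __name__ == "__main__":
    # Toy graph: the route a -> b -> c with off-route nodes x and y.
    toy_graph = {
        "a": ["b", "x"],
        "b": ["a", "c", "y"],
        "c": ["b"],
        "x": ["a"],
        "y": ["b"],
    }
    print(perturb_path(toy_graph, ["a", "b", "c"], random.Random(0)))
    print(info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0]]))
```

In a real agent the vectors passed to `info_nce` would come from the navigator's trajectory encoder, and a progressive schedule would decide when each trajectory starts to be perturbed; both are outside the scope of this sketch.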


Similar articles

1
Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning.
IEEE Trans Pattern Anal Mach Intell. 2023 Oct;45(10):12535-12549. doi: 10.1109/TPAMI.2023.3273594. Epub 2023 Sep 5.
2
Correctable Landmark Discovery via Large Models for Vision-Language Navigation.
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8534-8548. doi: 10.1109/TPAMI.2024.3407759. Epub 2024 Nov 6.
3
Vision-Language Navigation With Beam-Constrained Global Normalization.
IEEE Trans Neural Netw Learn Syst. 2024 Jan;35(1):1352-1363. doi: 10.1109/TNNLS.2022.3183287. Epub 2024 Jan 4.
4
ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments.
IEEE Trans Pattern Anal Mach Intell. 2024 Apr 9;PP. doi: 10.1109/TPAMI.2024.3386695.
5
HOP+: History-Enhanced and Order-Aware Pre-Training for Vision-and-Language Navigation.
IEEE Trans Pattern Anal Mach Intell. 2023 Jul;45(7):8524-8537. doi: 10.1109/TPAMI.2023.3234243. Epub 2023 Jun 5.
6
Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation.
IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):7175-7189. doi: 10.1109/TPAMI.2021.3097435. Epub 2022 Sep 14.
7
Visual Perception Generalization for Vision-and-Language Navigation via Meta-Learning.
IEEE Trans Neural Netw Learn Syst. 2023 Aug;34(8):5193-5199. doi: 10.1109/TNNLS.2021.3122579. Epub 2023 Aug 4.
8
Outdoor Vision-and-Language Navigation Needs Object-Level Alignment.
Sensors (Basel). 2023 Jun 29;23(13):6028. doi: 10.3390/s23136028.
9
Discovering Intrinsic Subgoals for Vision-and-Language Navigation via Hierarchical Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2025 Apr;36(4):6516-6528. doi: 10.1109/TNNLS.2024.3398300. Epub 2025 Apr 4.
10
Self-Supervised 3-D Semantic Representation Learning for Vision-and-Language Navigation.
IEEE Trans Neural Netw Learn Syst. 2025 Apr;36(4):6738-6751. doi: 10.1109/TNNLS.2024.3395633. Epub 2025 Apr 4.