Visuomotor Navigation for Embodied Robots With Spatial Memory and Semantic Reasoning Cognition.

Authors

Liu Qiming, Wang Guangzhan, Liu Zhe, Wang Hesheng

Publication information

IEEE Trans Neural Netw Learn Syst. 2025 May;36(5):9512-9523. doi: 10.1109/TNNLS.2024.3418857. Epub 2025 May 2.

DOI: 10.1109/TNNLS.2024.3418857
PMID: 39288036

Abstract

The fundamental prerequisite for embodied agents to make intelligent decisions lies in autonomous cognition. Typically, agents optimize decision-making by leveraging extensive spatiotemporal information from episodic memory. Concurrently, they utilize long-term experience for task reasoning and foster conscious behavioral tendencies. However, due to the significant disparities in the heterogeneous modalities of these two cognitive abilities, existing literature falls short in designing effective coupling mechanisms, thus failing to endow robots with comprehensive intelligence. This article introduces a navigation framework, the hierarchical topology-semantic cognitive navigation (HTSCN), which seamlessly integrates both memory and reasoning abilities within a singular end-to-end system. Specifically, we represent memory and reasoning abilities with a topological map and a semantic relation graph, respectively, within a unified dual-layer graph structure. Additionally, we incorporate a neural-based cognition extraction process to capture cross-modal relationships between hierarchical graphs. HTSCN forges a link between two different cognitive modalities, thus further enhancing decision-making performance and the overall level of intelligence. Experimental results demonstrate that in comparison to existing cognitive structures, HTSCN significantly enhances the performance and path efficiency of image-goal navigation. Visualization and interpretability experiments further corroborate the promoting role of memory, reasoning, as well as their online learned relationships, on intelligent behavioral patterns. Furthermore, we deploy HTSCN in real-world scenarios to further verify its feasibility and adaptability.
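The abstract gives no implementation details, so the following is only a rough illustration of what a "unified dual-layer graph structure" could look like as a data structure, not the authors' HTSCN code. Every name here (`DualLayerGraph`, `add_place`, `add_concept`, `cross_layer_weights`) is hypothetical, and the softmax over dot products is a simple stand-in for the learned, neural-based cognition extraction the abstract mentions.

```python
import math
from dataclasses import dataclass, field


@dataclass
class DualLayerGraph:
    """Hypothetical container: a topological memory layer plus a semantic layer."""

    # Memory layer: place id -> feature vector of the observation stored there.
    places: dict = field(default_factory=dict)
    # Undirected traversability links between places.
    place_edges: set = field(default_factory=set)
    # Reasoning layer: object category -> embedding vector.
    concepts: dict = field(default_factory=dict)
    # Undirected relations between object categories.
    concept_edges: set = field(default_factory=set)

    def add_place(self, place_id, feature, neighbors=()):
        self.places[place_id] = list(feature)
        for other in neighbors:
            self.place_edges.add(frozenset((place_id, other)))

    def add_concept(self, name, embedding, related=()):
        self.concepts[name] = list(embedding)
        for other in related:
            self.concept_edges.add(frozenset((name, other)))

    def cross_layer_weights(self, place_id):
        """Softmax over dot products between one place and every concept.

        Assumption: this fixed similarity replaces a trained network.
        """
        if not self.concepts:
            return {}
        feature = self.places[place_id]
        scores = {
            name: sum(a * b for a, b in zip(feature, emb))
            for name, emb in self.concepts.items()
        }
        peak = max(scores.values())  # subtract the max for numerical stability
        exps = {name: math.exp(s - peak) for name, s in scores.items()}
        total = sum(exps.values())
        return {name: e / total for name, e in exps.items()}


# Toy usage with made-up 2-D features.
graph = DualLayerGraph()
graph.add_place("hallway", [1.0, 0.0])
graph.add_place("kitchen", [0.0, 1.0], neighbors=["hallway"])
graph.add_concept("fridge", [0.0, 2.0])
graph.add_concept("sofa", [2.0, 0.0], related=["fridge"])
weights = graph.cross_layer_weights("kitchen")
```

In this toy setup the "kitchen" node weights "fridge" more heavily than "sofa", which hints at how a link between the two layers could bias a navigation policy toward places semantically related to the goal. A real system would learn both the features and the cross-layer relationships end to end.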

