Visuomotor Navigation for Embodied Robots With Spatial Memory and Semantic Reasoning Cognition

Authors

Liu Qiming, Wang Guangzhan, Liu Zhe, Wang Hesheng

Publication information

IEEE Trans Neural Netw Learn Syst. 2025 May;36(5):9512-9523. doi: 10.1109/TNNLS.2024.3418857. Epub 2025 May 2.

DOI:10.1109/TNNLS.2024.3418857

Abstract

The fundamental prerequisite for embodied agents to make intelligent decisions lies in autonomous cognition. Typically, agents optimize decision-making by leveraging extensive spatiotemporal information from episodic memory. Concurrently, they utilize long-term experience for task reasoning and foster conscious behavioral tendencies. However, due to the significant disparities in the heterogeneous modalities of these two cognitive abilities, existing literature falls short in designing effective coupling mechanisms, thus failing to endow robots with comprehensive intelligence. This article introduces a navigation framework, the hierarchical topology-semantic cognitive navigation (HTSCN), which seamlessly integrates both memory and reasoning abilities within a singular end-to-end system. Specifically, we represent memory and reasoning abilities with a topological map and a semantic relation graph, respectively, within a unified dual-layer graph structure. Additionally, we incorporate a neural-based cognition extraction process to capture cross-modal relationships between hierarchical graphs. HTSCN forges a link between two different cognitive modalities, thus further enhancing decision-making performance and the overall level of intelligence. Experimental results demonstrate that in comparison to existing cognitive structures, HTSCN significantly enhances the performance and path efficiency of image-goal navigation. Visualization and interpretability experiments further corroborate the promoting role of memory, reasoning, as well as their online learned relationships, on intelligent behavioral patterns. Furthermore, we deploy HTSCN in real-world scenarios to further verify its feasibility and adaptability.
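To make the dual-layer idea concrete, here is a minimal Python sketch of one container holding a topological layer (places and their adjacency), a semantic layer (object labels and their relations), and links between the two. This is an illustration only, not the HTSCN implementation: the abstract says the cross-modal relationships are learned by a neural process, whereas this sketch uses hand-written sets. The class `DualLayerGraph`, its methods, and the place and object names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DualLayerGraph:
    """Toy container: a topological layer, a semantic layer, and links between them."""
    # Topological (memory) layer: place id -> set of neighbouring place ids.
    topo_edges: dict = field(default_factory=dict)
    # Semantic (reasoning) layer: object label -> set of related object labels.
    sem_edges: dict = field(default_factory=dict)
    # Cross-layer links: place id -> set of object labels observed at that place.
    observed: dict = field(default_factory=dict)

    def add_place(self, place, neighbours=(), objects=()):
        """Register a place, connect it to its neighbours, and record what was seen there."""
        self.topo_edges.setdefault(place, set())
        for n in neighbours:
            self.topo_edges.setdefault(n, set()).add(place)
            self.topo_edges[place].add(n)
        self.observed.setdefault(place, set()).update(objects)

    def relate(self, a, b):
        """Add an undirected semantic relation between two object labels."""
        self.sem_edges.setdefault(a, set()).add(b)
        self.sem_edges.setdefault(b, set()).add(a)

    def places_related_to(self, label):
        """Places that contain `label` or an object semantically related to it."""
        wanted = {label} | self.sem_edges.get(label, set())
        return sorted(p for p, objs in self.observed.items() if objs & wanted)


# Hypothetical example: three places in a row, and a hand-written relation
# stating that fridges and ovens tend to appear together.
graph = DualLayerGraph()
graph.add_place("p0", objects=["sofa"])
graph.add_place("p1", neighbours=["p0"], objects=["fridge"])
graph.add_place("p2", neighbours=["p1"], objects=["bed"])
graph.relate("fridge", "oven")
candidates = graph.places_related_to("oven")
```

Here no oven has been observed, yet the semantic relation points the query at the place where a fridge was seen. This hand-coded lookup only gestures at the kind of memory and reasoning coupling the abstract describes.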

