Visuomotor Navigation for Embodied Robots With Spatial Memory and Semantic Reasoning Cognition.

Author information

Liu Qiming, Wang Guangzhan, Liu Zhe, Wang Hesheng

Publication information

IEEE Trans Neural Netw Learn Syst. 2025 May;36(5):9512-9523. doi: 10.1109/TNNLS.2024.3418857. Epub 2025 May 2.

Abstract

The fundamental prerequisite for embodied agents to make intelligent decisions lies in autonomous cognition. Typically, agents optimize decision-making by leveraging extensive spatiotemporal information from episodic memory. Concurrently, they utilize long-term experience for task reasoning and foster conscious behavioral tendencies. However, due to the significant disparities in the heterogeneous modalities of these two cognitive abilities, existing literature falls short in designing effective coupling mechanisms, thus failing to endow robots with comprehensive intelligence. This article introduces a navigation framework, the hierarchical topology-semantic cognitive navigation (HTSCN), which integrates both memory and reasoning abilities within a single end-to-end system. Specifically, we represent memory and reasoning abilities with a topological map and a semantic relation graph, respectively, within a unified dual-layer graph structure. Additionally, we incorporate a neural-based cognition extraction process to capture cross-modal relationships between the hierarchical graphs. HTSCN forges a link between the two cognitive modalities, thus further enhancing decision-making performance and the overall level of intelligence. Experimental results demonstrate that, in comparison to existing cognitive structures, HTSCN significantly enhances the performance and path efficiency of image-goal navigation. Visualization and interpretability experiments further corroborate the role of memory, reasoning, and their online-learned relationships in promoting intelligent behavioral patterns. Furthermore, we deploy HTSCN in real-world scenarios to further verify its feasibility and adaptability.
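To make the dual-layer idea concrete, the following is a minimal sketch of how a topological (memory) layer and a semantic (reasoning) layer might sit in one structure, with a cross-layer weighting between them. It is not the authors' implementation: the class `DualLayerGraph`, all method names, the toy feature vectors, and the use of scaled dot-product softmax in place of the learned cognition extraction process are assumptions made purely for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class DualLayerGraph:
    """Hypothetical container: a topological layer plus a semantic layer."""
    place_feats: dict = field(default_factory=dict)    # place id -> feature vector
    place_edges: set = field(default_factory=set)      # undirected links between places
    concept_embs: dict = field(default_factory=dict)   # concept name -> embedding
    concept_edges: dict = field(default_factory=dict)  # (concept, concept) -> relation weight

    def add_place(self, pid, feat, neighbor=None):
        # Store a visited place and optionally link it to an earlier one.
        self.place_feats[pid] = list(feat)
        if neighbor is not None:
            self.place_edges.add(frozenset((pid, neighbor)))

    def add_concept(self, name, emb):
        self.concept_embs[name] = list(emb)

    def relate(self, a, b, weight):
        # Record a semantic relation (for example, co-occurrence strength).
        self.concept_edges[(a, b)] = weight

    def cross_layer_attention(self, pid):
        # Assumed stand-in for a learned module: softmax over scaled
        # dot products between one place feature and every concept.
        query = self.place_feats[pid]
        scale = math.sqrt(len(query))
        scores = {
            name: sum(x * y for x, y in zip(query, emb)) / scale
            for name, emb in self.concept_embs.items()
        }
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {name: math.exp(s - top) for name, s in scores.items()}
        total = sum(exps.values())
        return {name: v / total for name, v in exps.items()}


# Toy usage with made-up two-dimensional features.
graph = DualLayerGraph()
graph.add_place(0, [1.0, 0.0])
graph.add_place(1, [0.0, 1.0], neighbor=0)
graph.add_concept("sofa", [2.0, 0.0])
graph.add_concept("sink", [0.0, 2.0])
graph.relate("sofa", "sink", 0.1)
attention = graph.cross_layer_attention(0)
```

In this toy setup, place 0 attends more strongly to "sofa" than to "sink" because its feature vector is aligned with the "sofa" embedding. In a trained system the features and the weighting would be learned end to end rather than set by hand.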
