Object-Centric Scene Representations Using Active Inference.

Author information

Van de Maele Toon, Verbelen Tim, Mazzaglia Pietro, Ferraro Stefano, Dhoedt Bart

Affiliations

Ghent University, 9000 Ghent, Belgium

VERSES AI Research Lab, Los Angeles, CA 90016, U.S.A.

Publication information

Neural Comput. 2024 Mar 21;36(4):677-704. doi: 10.1162/neco_a_01637.

Abstract

Representing a scene and its constituent objects from raw sensory data is a core ability for enabling robots to interact with their environment. In this letter, we propose a novel approach for scene understanding, leveraging an object-centric generative model that enables an agent to infer object category and pose in an allocentric reference frame using active inference, a neuro-inspired framework for action and perception. For evaluating the behavior of an active vision agent, we also propose a new benchmark where, given a target viewpoint of a particular object, the agent needs to find the best matching viewpoint given a workspace with randomly positioned objects in 3D. We demonstrate that our active inference agent is able to balance epistemic foraging and goal-driven behavior, and quantitatively outperforms both supervised and reinforcement learning baselines by more than a factor of two in terms of success rate.

