Symmetry and complexity in object-centric deep active inference models.

Author information

Ferraro Stefano, Van de Maele Toon, Verbelen Tim, Dhoedt Bart

Affiliations

IDLab, Department of Information Technology, Ghent University-imec, Ghent, Belgium.

Publication information

Interface Focus. 2023 Apr 14;13(3):20220077. doi: 10.1098/rsfs.2022.0077. eCollection 2023 Jun 6.

DOI:10.1098/rsfs.2022.0077

PMID:37065264

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10102726/

Abstract

Humans perceive and interact with hundreds of objects every day. In doing so, they need to employ mental models of these objects, and they often exploit symmetries in an object's shape and appearance to learn generalizable and transferable skills. Active inference is a first-principles approach to understanding and modelling sentient agents. It states that agents entertain a generative model of their environment, and learn and act by minimizing an upper bound on their surprisal, i.e. their free energy. The free energy decomposes into an accuracy term and a complexity term, meaning that agents favour the least complex model that can accurately explain their sensory observations. In this paper, we investigate how inherent symmetries of particular objects also emerge as symmetries in the latent state space of the generative model learnt under deep active inference. In particular, we focus on object-centric representations, which are trained from pixels to predict novel object views as the agent moves its viewpoint. First, we investigate the relation between model complexity and symmetry exploitation in the state space. Second, we perform a principal component analysis to demonstrate how the model encodes the principal axis of symmetry of the object in the latent space. Finally, we demonstrate how more symmetrical representations can be exploited for better generalization in the context of manipulation.
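
The accuracy/complexity decomposition mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes diagonal Gaussian prior and posterior over the latent state, a Gaussian likelihood with fixed standard deviation, and an accuracy term approximated from a single reconstruction. Under these assumptions the free energy is F = KL[q(s) || p(s)] - E_q[log p(o | s)], i.e. complexity minus accuracy. The function and parameter names (`gaussian_kl`, `free_energy`, `obs_sigma`) are hypothetical.

```python
import numpy as np

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL[q || p] between diagonal Gaussians, summed over latent dimensions."""
    var_q, var_p = sigma_q ** 2, sigma_p ** 2
    per_dim = np.log(sigma_p / sigma_q) + (var_q + (mu_q - mu_p) ** 2) / (2.0 * var_p) - 0.5
    return float(np.sum(per_dim))

def free_energy(obs, recon, mu_q, sigma_q, mu_p, sigma_p, obs_sigma=1.0):
    """Return (free_energy, complexity, accuracy) for one observation.

    Assumption: `recon` is the observation predicted from one latent sample
    (or the posterior mean), so accuracy is a single-sample estimate.
    """
    # Complexity: how far the posterior moves away from the prior.
    complexity = gaussian_kl(mu_q, sigma_q, mu_p, sigma_p)
    # Accuracy: Gaussian log-likelihood of the observation given the reconstruction.
    log_lik = -0.5 * ((obs - recon) / obs_sigma) ** 2 - np.log(obs_sigma) - 0.5 * np.log(2.0 * np.pi)
    accuracy = float(np.sum(log_lik))
    return complexity - accuracy, complexity, accuracy
```

In this sketch, a posterior that stays close to the prior keeps the complexity term small, which is the sense in which minimizing free energy favours the least complex model that still reconstructs the observations well.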

Similar articles

Object-Centric Scene Representations Using Active Inference.

Neural Comput. 2024 Mar 21;36(4):677-704. doi: 10.1162/neco_a_01637.

Learning Generative State Space Models for Active Inference.

Front Comput Neurosci. 2020 Nov 16;14:574372. doi: 10.3389/fncom.2020.574372. eCollection 2020.

Embodied Object Representation Learning and Recognition.

Front Neurorobot. 2022 Apr 14;16:840658. doi: 10.3389/fnbot.2022.840658. eCollection 2022.

Unsupervised Object-Centric Learning From Multiple Unspecified Viewpoints.

IEEE Trans Pattern Anal Mach Intell. 2024 May;46(5):3897-3909. doi: 10.1109/TPAMI.2023.3349174. Epub 2024 Apr 3.

Deep active inference.

Biol Cybern. 2018 Dec;112(6):547-573. doi: 10.1007/s00422-018-0785-7. Epub 2018 Oct 22.

Model Reduction Through Progressive Latent Space Pruning in Deep Active Inference.

Front Neurorobot. 2022 Mar 11;16:795846. doi: 10.3389/fnbot.2022.795846. eCollection 2022.

Viewpoint dependency in object representation and recognition.

Spat Vis. 1996;9(4):491-521. doi: 10.1163/156856896x00222.

Visual perception of shape altered by inferred causal history.

Sci Rep. 2016 Nov 8;6:36245. doi: 10.1038/srep36245.

Unsupervised learning reveals interpretable latent representations for translucency perception.

PLoS Comput Biol. 2023 Feb 8;19(2):e1010878. doi: 10.1371/journal.pcbi.1010878. eCollection 2023 Feb.

Cited by

FOCUS: object-centric world models for robotic manipulation.

Front Neurorobot. 2025 Apr 30;19:1585386. doi: 10.3389/fnbot.2025.1585386. eCollection 2025.
