Ahn Seoyoung, Adeli Hossein, Zelinsky Gregory J
Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America.
Zuckerman Mind Brain Behavior Institute, Columbia University, New York City, New York, United States of America.
PLoS Comput Biol. 2024 Jun 13;20(6):e1012159. doi: 10.1371/journal.pcbi.1012159. eCollection 2024 Jun.
Humans are extremely robust in our ability to perceive and recognize objects: we see faces in tea stains and can recognize friends on dark streets. Yet, neurocomputational models of primate object recognition have focused on the initial feedforward pass of processing through the ventral stream and less on the top-down feedback that likely underlies robust object perception and recognition. Aligned with the generative approach, we propose that the visual system actively facilitates recognition by reconstructing the object hypothesized to be in the image. Top-down attention then uses this reconstruction as a template to bias feedforward processing to align with the most plausible object hypothesis. Building on auto-encoder neural networks, our model makes detailed hypotheses about the appearance and location of the candidate objects in the image by reconstructing a complete object representation from visual input that may be incomplete due to noise and occlusion. The model then leverages the best object reconstruction, measured by reconstruction error, to direct the bottom-up process of selectively routing low-level features, a top-down biasing that captures a core function of attention. We evaluated our model using the MNIST-C (handwritten digits under corruptions) and ImageNet-C (real-world objects under corruptions) datasets. Not only did our model achieve superior performance on these challenging tasks designed to approximate real-world noise and occlusion viewing conditions, but it also better accounted for human behavioral reaction times and error patterns than a standard feedforward Convolutional Neural Network. Our model suggests that a complete understanding of object perception and recognition requires integrating top-down attentional feedback, which we propose takes the form of an object reconstruction.
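The recognize-by-reconstruction loop the abstract describes can be sketched roughly as follows. This is a toy illustration, not the authors' model: fixed class prototypes stand in for a trained auto-encoder decoder, and the attention step is simplified to a template-shaped pixel weighting. Each class hypothesis is scored by how well its reconstruction explains the (noisy, occluded) input, and the winning reconstruction then biases which input features are weighted on the next pass.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical class "reconstructions" on an 8x8 grid: a vertical and a
# horizontal bar. In the real model these would come from a generative
# decoder; here they are fixed prototypes for illustration only.
proto = {
    "vertical": np.zeros((8, 8)),
    "horizontal": np.zeros((8, 8)),
}
proto["vertical"][:, 3:5] = 1.0
proto["horizontal"][3:5, :] = 1.0

def reconstruction_error(image, recon, attention=None):
    """Pixelwise squared error, optionally weighted by a top-down attention map."""
    w = np.ones_like(image) if attention is None else attention
    return float(np.sum(w * (image - recon) ** 2))

def recognize(image, n_iters=2):
    """Score each object hypothesis by reconstruction error, then let the
    best reconstruction act as an attention template biasing the next pass."""
    attention = None
    for _ in range(n_iters):
        errors = {k: reconstruction_error(image, r, attention)
                  for k, r in proto.items()}
        best = min(errors, key=errors.get)
        # Top-down bias: upweight pixels the winning reconstruction predicts.
        attention = 0.5 + proto[best]
    return best

# A corrupted input: a noisy vertical bar with its top half occluded.
img = proto["vertical"] + 0.3 * rng.standard_normal((8, 8))
img[:4, :] = 0.0  # occlusion

print(recognize(img))  # the vertical hypothesis best explains the input
```

Despite the occlusion removing half of the object, the vertical-bar hypothesis still yields the lowest reconstruction error, illustrating how a generative template can recover a complete object representation from incomplete bottom-up evidence.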