Deep Networks Can Resemble Human Feed-forward Vision in Invariant Object Recognition.

Affiliations

Department of Computer Science, School of Mathematics, Statistics, and Computer Science, University of Tehran, Tehran, Iran.

CERCO UMR 5549, CNRS - Université de Toulouse, F-31300, France.

Publication information

Sci Rep. 2016 Sep 7;6:32672. doi: 10.1038/srep32672.

DOI: 10.1038/srep32672

PMID: 27601096

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5013454/

Abstract

Deep convolutional neural networks (DCNNs) have attracted much attention recently, and have been shown to be able to recognize thousands of object categories in natural image databases. Their architecture is somewhat similar to that of the human visual system: both use restricted receptive fields and a hierarchy of layers that progressively extract increasingly abstract features. Yet it is unknown whether DCNNs match human performance at the task of view-invariant object recognition, whether they make similar errors and use similar representations for this task, and whether the answers depend on the magnitude of the viewpoint variations. To investigate these issues, we benchmarked eight state-of-the-art DCNNs, the HMAX model, and a baseline shallow model, and compared their results to those of humans tested with backward masking. Unlike in all previous DCNN studies, we carefully controlled the magnitude of the viewpoint variations to demonstrate that shallow nets can outperform deep nets and humans when variations are weak. When facing larger variations, however, more layers were needed to match human performance and error distributions, and to have representations that are consistent with human behavior. A very deep net with 18 layers even outperformed humans at the highest variation level, using the most human-like representations.
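The abstract refers to comparing the error distributions of models and humans but does not say how the comparison is computed. As a minimal sketch of one plausible approach (an assumption, not necessarily the authors' procedure), the off-diagonal cells of two confusion matrices can be normalized and then correlated. The function names and the counts below are hypothetical and chosen only for illustration.

```python
from math import sqrt


def error_distribution(confusion):
    """Return the off-diagonal (error) cells of a square confusion
    matrix, normalized so that they sum to 1."""
    n = len(confusion)
    errors = [confusion[i][j] for i in range(n) for j in range(n) if i != j]
    total = sum(errors)
    if total == 0:
        raise ValueError("confusion matrix contains no errors")
    return [e / total for e in errors]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def error_similarity(human_confusion, model_confusion):
    """Correlate the error patterns of two confusion matrices
    (rows = true category, columns = reported category)."""
    return pearson(error_distribution(human_confusion),
                   error_distribution(model_confusion))


# Hypothetical response counts for three object categories.
human = [[40, 6, 4], [5, 42, 3], [8, 2, 40]]
model = [[38, 8, 4], [6, 40, 4], [9, 3, 38]]
print(round(error_similarity(human, model), 3))
```

A value near 1 would indicate that the model confuses the same category pairs as the human observers, independently of overall accuracy, because the diagonal (correct responses) is excluded before normalizing.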
