Harmonizing the object recognition strategies of deep neural networks with humans.

Author information

Fel Thomas, Felipe Ivan, Linsley Drew, Serre Thomas

Affiliations

Department of Cognitive, Linguistic, & Psychological Sciences, Brown University, Providence, RI.

Artificial and Natural Intelligence Toulouse Institute (ANITI), Toulouse, France.

Publication information

Adv Neural Inf Process Syst. 2022 Dec;35:9432-9446.

PMID:37465369

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10353762/

Abstract

The many successes of deep neural networks (DNNs) over the past decade have largely been driven by computational scale rather than insights from biological intelligence. Here, we explore if these trends have also carried concomitant improvements in explaining the visual strategies humans rely on for object recognition. We do this by comparing two related but distinct properties of visual strategies in humans and DNNs: where they believe important visual features are in images and how they use those features to categorize objects. Across 84 different DNNs trained on ImageNet and three independent datasets measuring the where and the how of human visual strategies for object recognition on those images, we find a systematic trade-off between DNN categorization accuracy and alignment with human visual strategies for object recognition. State-of-the-art DNNs are progressively becoming less aligned with humans as their accuracy improves. We rectify this growing issue with our neural harmonizer: a general-purpose training routine that both aligns DNN and human visual strategies and improves categorization accuracy. Our work represents the first demonstration that the scaling laws [1-3] that are guiding the design of DNNs today have also produced worse models of human vision. We release our code and data at https://serre-lab.github.io/Harmonization to help the field build more human-like DNNs.

Similar articles

The developmental trajectory of object recognition robustness: Children are like small adults but unlike big deep neural networks.

J Vis. 2023 Jul 3;23(7):4. doi: 10.1167/jov.23.7.4.

Deep Neural Networks and Visuo-Semantic Models Explain Complementary Components of Human Ventral-Stream Representational Dynamics.

J Neurosci. 2023 Mar 8;43(10):1731-1741. doi: 10.1523/JNEUROSCI.1424-22.2022. Epub 2023 Feb 9.

Deep Convolutional Neural Networks Outperform Feature-Based But Not Categorical Models in Explaining Object Similarity Judgments.

Front Psychol. 2017 Oct 9;8:1726. doi: 10.3389/fpsyg.2017.01726. eCollection 2017.

Are Deep Neural Networks Adequate Behavioral Models of Human Visual Perception?

Annu Rev Vis Sci. 2023 Sep 15;9:501-524. doi: 10.1146/annurev-vision-120522-031739. Epub 2023 Mar 31.

Deep problems with neural network models of human vision.

Behav Brain Sci. 2022 Dec 1;46:e385. doi: 10.1017/S0140525X22002813.

Noise-trained deep neural networks effectively predict human vision and its neural responses to challenging images.

PLoS Biol. 2021 Dec 9;19(12):e3001418. doi: 10.1371/journal.pbio.3001418. eCollection 2021 Dec.

Symbolic Deep Networks: A Psychologically Inspired Lightweight and Efficient Approach to Deep Learning.

Top Cogn Sci. 2022 Oct;14(4):702-717. doi: 10.1111/tops.12571. Epub 2021 Oct 5.

Multimodal deep neural decoding reveals highly resolved spatiotemporal profile of visual object representation in humans.

Neuroimage. 2023 Jul 15;275:120164. doi: 10.1016/j.neuroimage.2023.120164. Epub 2023 May 9.

Factorized visual representations in the primate visual system and deep neural networks.

Elife. 2024 Jul 5;13:RP91685. doi: 10.7554/eLife.91685.

Cited by

Human-like monocular depth biases in deep neural networks.

PLoS Comput Biol. 2025 Aug 19;21(8):e1013020. doi: 10.1371/journal.pcbi.1013020. eCollection 2025 Aug.

Computational Urban Ecology of New York City Rats.

bioRxiv. 2025 Jul 24:2025.07.21.665423. doi: 10.1101/2025.07.21.665423.

Unraveling the complexity of rat object vision requires a full convolutional network and beyond.

Patterns (N Y). 2025 Jan 17;6(2):101149. doi: 10.1016/j.patter.2024.101149. eCollection 2025 Feb 14.

An image-computable model of speeded decision-making.

Elife. 2025 Feb 28;13:RP98351. doi: 10.7554/eLife.98351.

Advances in Neuroimaging and Deep Learning for Emotion Detection: A Systematic Review of Cognitive Neuroscience and Algorithmic Innovations.

Diagnostics (Basel). 2025 Feb 13;15(4):456. doi: 10.3390/diagnostics15040456.

RTify: Aligning Deep Neural Networks with Human Behavioral Decisions.

ArXiv. 2024 Dec 26:arXiv:2411.03630v2.

Scaling models of visual working memory to natural images.

Commun Psychol. 2024 Jan 3;2(1):3. doi: 10.1038/s44271-023-00048-3.

Bird song comparison using deep learning trained from avian perceptual judgments.

PLoS Comput Biol. 2024 Aug 7;20(8):e1012329. doi: 10.1371/journal.pcbi.1012329. eCollection 2024 Aug.

Layerwise complexity-matched learning yields an improved model of cortical area V2.

ArXiv. 2024 Jul 18:arXiv:2312.11436v3.

What I Cannot Predict, I Do Not Understand: A Human-Centered Evaluation Framework for Explainability Methods.

Adv Neural Inf Process Syst. 2022;35:2832-2845.

References

What I Cannot Predict, I Do Not Understand: A Human-Centered Evaluation Framework for Explainability Methods.

Adv Neural Inf Process Syst. 2022;35:2832-2845.

When Pigs Fly: Contextual Reasoning in Synthetic and Natural Scenes.

IEEE Int Conf Comput Vis Workshops. 2021 Oct;2021:255-264. doi: 10.1109/iccv48922.2021.00032. Epub 2022 Feb 28.

Texture-like representation of objects in human visual cortex.

Proc Natl Acad Sci U S A. 2022 Apr 26;119(17):e2115302119. doi: 10.1073/pnas.2115302119. Epub 2022 Apr 19.

Superhuman cell death detection with biomarker-optimized neural networks.

Sci Adv. 2021 Dec 10;7(50):eabf8142. doi: 10.1126/sciadv.abf8142. Epub 2021 Dec 8.

Oculo-retinal dynamics can explain the perception of minimal recognizable configurations.

Proc Natl Acad Sci U S A. 2021 Aug 24;118(34). doi: 10.1073/pnas.2022792118.

Controversial stimuli: Pitting neural networks against each other as models of human cognition.

Proc Natl Acad Sci U S A. 2020 Nov 24;117(47):29330-29337. doi: 10.1073/pnas.1912334117.

Integrative Benchmarking to Advance Neurally Mechanistic Models of Human Intelligence.

Neuron. 2020 Nov 11;108(3):413-423. doi: 10.1016/j.neuron.2020.07.040. Epub 2020 Sep 11.

GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild.

IEEE Trans Pattern Anal Mach Intell. 2021 May;43(5):1562-1577. doi: 10.1109/TPAMI.2019.2957464. Epub 2021 Apr 1.

Recurrence is required to capture the representational dynamics of the human visual system.

Proc Natl Acad Sci U S A. 2019 Oct 22;116(43):21854-21863. doi: 10.1073/pnas.1905544116. Epub 2019 Oct 7.

Res2Net: A New Multi-Scale Backbone Architecture.

IEEE Trans Pattern Anal Mach Intell. 2021 Feb;43(2):652-662. doi: 10.1109/TPAMI.2019.2938758. Epub 2021 Jan 8.
