


A large-scale examination of inductive biases shaping high-level visual representation in brains and machines.

Affiliations

Department of Psychology, Harvard University, Cambridge, MA, USA.

Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, MN, USA.

Publication

Nat Commun. 2024 Oct 30;15(1):9383. doi: 10.1038/s41467-024-53147-y.

DOI: 10.1038/s41467-024-53147-y
PMID: 39477923
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11526138/
Abstract

The rapid release of high-performing computer vision models offers new potential to study the impact of different inductive biases on the emergent brain alignment of learned representations. Here, we perform controlled comparisons among a curated set of 224 diverse models to test the impact of specific model properties on visual brain predictivity, a process requiring over 1.8 billion regressions and 50.3 thousand representational similarity analyses. We find that models with qualitatively different architectures (e.g. CNNs versus Transformers) and task objectives (e.g. purely visual contrastive learning versus vision-language alignment) achieve near-equivalent brain predictivity, when other factors are held constant. Instead, variation across visual training diets yields the largest, most consistent effect on brain predictivity. Many models achieve similarly high brain predictivity, despite clear variation in their underlying representations, suggesting that standard methods used to link models to brains may be too flexible. Broadly, these findings challenge common assumptions about the factors underlying emergent brain alignment, and outline how we can leverage controlled model comparison to probe the common computational principles underlying biological and artificial visual systems.
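The two model-to-brain linking methods the abstract counts ("regressions" and "representational similarity analyses") can be sketched on synthetic data. Everything below is illustrative: the array sizes, the fixed ridge penalty, and the toy feature-to-voxel mapping are assumptions for the sketch, not details taken from the paper.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Toy stand-ins for model-layer activations and voxel responses
# to the same stimulus set (sizes are arbitrary, for illustration).
n_stimuli, n_features, n_voxels = 200, 64, 32
model_feats = rng.standard_normal((n_stimuli, n_features))
true_map = rng.standard_normal((n_features, n_voxels))
brain_resps = model_feats @ true_map + 2.0 * rng.standard_normal((n_stimuli, n_voxels))

# --- Encoding-model predictivity: ridge regression, features -> voxels ---
n_train = 150
X_tr, X_te = model_feats[:n_train], model_feats[n_train:]
Y_tr, Y_te = brain_resps[:n_train], brain_resps[n_train:]

alpha = 1.0  # ridge penalty (fixed here; in practice chosen by cross-validation)
W = np.linalg.solve(X_tr.T @ X_tr + alpha * np.eye(n_features), X_tr.T @ Y_tr)
Y_hat = X_te @ W

# Predictivity = mean Pearson r between predicted and held-out responses, per voxel.
r_per_voxel = [np.corrcoef(Y_hat[:, v], Y_te[:, v])[0, 1] for v in range(n_voxels)]
mean_r = float(np.mean(r_per_voxel))
print(f"mean encoding r = {mean_r:.3f}")

# --- Representational similarity analysis (RSA) ---
# Build each system's representational dissimilarity matrix (condensed form),
# then compare the two geometries with a Spearman rank correlation.
rdm_model = pdist(model_feats, metric="correlation")
rdm_brain = pdist(brain_resps, metric="correlation")
rho, _ = spearmanr(rdm_model, rdm_brain)
print(f"RSA Spearman rho = {rho:.3f}")
```

Running either analysis across 224 models, many layers, and many voxels is what multiplies into the billions of regressions the authors report.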


Figures (PMC full text):
Fig 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aadc/11526138/32f4222a8a1f/41467_2024_53147_Fig1_HTML.jpg
Fig 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aadc/11526138/90c75e561c15/41467_2024_53147_Fig2_HTML.jpg
Fig 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aadc/11526138/ba0c61d8fee3/41467_2024_53147_Fig3_HTML.jpg
Fig 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aadc/11526138/3708366abf60/41467_2024_53147_Fig4_HTML.jpg
Fig 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aadc/11526138/89556b08d158/41467_2024_53147_Fig5_HTML.jpg
Fig 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aadc/11526138/65fdb4e3ffc1/41467_2024_53147_Fig6_HTML.jpg

Similar Articles

1. A large-scale examination of inductive biases shaping high-level visual representation in brains and machines.
Nat Commun. 2024 Oct 30;15(1):9383. doi: 10.1038/s41467-024-53147-y.
2. Invariant recognition drives neural representations of action sequences.
PLoS Comput Biol. 2017 Dec 18;13(12):e1005859. doi: 10.1371/journal.pcbi.1005859.
3. Brain2GAN: Feature-disentangled neural encoding and decoding of visual perception in the primate brain.
PLoS Comput Biol. 2024 May 6;20(5):e1012058. doi: 10.1371/journal.pcbi.1012058.
4. Factorized visual representations in the primate visual system and deep neural networks.
Elife. 2024 Jul 5;13:RP91685. doi: 10.7554/eLife.91685.
5. Manipulating and measuring variation in deep neural network (DNN) representations of objects.
Cognition. 2024 Nov;252:105920. doi: 10.1016/j.cognition.2024.105920.
6. An ecologically motivated image dataset for deep learning yields better models of human vision.
Proc Natl Acad Sci U S A. 2021 Feb 23;118(8). doi: 10.1073/pnas.2011417118.
7. Limits to visual representational correspondence between convolutional neural networks and the human brain.
Nat Commun. 2021 Apr 6;12(1):2065. doi: 10.1038/s41467-021-22244-7.
8. Probing the link between vision and language in material perception using psychophysics and unsupervised learning.
PLoS Comput Biol. 2024 Oct 3;20(10):e1012481. doi: 10.1371/journal.pcbi.1012481.
9. Understanding transformation tolerant visual object representations in the human brain and convolutional neural networks.
Neuroimage. 2022 Nov;263:119635. doi: 10.1016/j.neuroimage.2022.119635.
10. Enhancing neural encoding models for naturalistic perception with a multi-level integration of deep neural networks and cortical networks.
Sci Bull (Beijing). 2024 Jun 15;69(11):1738-1747. doi: 10.1016/j.scib.2024.02.035.

Cited By

1. High-level visual representations in the human brain are aligned with large language models.
Nat Mach Intell. 2025;7(8):1220-1234. doi: 10.1038/s42256-025-01072-0.
2. Universal dimensions of visual representation.
Sci Adv. 2025 Jul 4;11(27):eadw7697. doi: 10.1126/sciadv.adw7697.
3. A simplified minimodel of visual cortical neurons.
4. Brain encoding models based on multimodal transformers can transfer across language and vision.
Nat Commun. 2025 Jul 1;16(1):5724. doi: 10.1038/s41467-025-61171-9.
5. Dimensions underlying the representational alignment of deep neural networks with humans.
Nat Mach Intell. 2025;7(6):848-859. doi: 10.1038/s42256-025-01041-7.
6. Representation of locomotive action affordances in human behavior, brains, and deep neural networks.
Proc Natl Acad Sci U S A. 2025 Jun 17;122(24):e2414005122. doi: 10.1073/pnas.2414005122.
7. Quantifying the roles of visual, linguistic, and visual-linguistic complexity in noun and verb acquisition.
PLoS One. 2025 May 23;20(5):e0321973. doi: 10.1371/journal.pone.0321973.
8. Net2Brain: a toolbox to compare artificial vision models with human brain responses.
Front Neuroinform. 2025 May 6;19:1515873. doi: 10.3389/fninf.2025.1515873.
9. Unsupervised alignment reveals structural commonalities and differences in neural representations of natural scenes across individuals and brain areas.
iScience. 2025 Apr 15;28(5):112427. doi: 10.1016/j.isci.2025.112427.
10. Contrastive-Equivariant Self-Supervised Learning Improves Alignment with Primate Visual Area IT.
Adv Neural Inf Process Syst. 2024;37:96045-96070.
11. Approximating Human-Level 3D Visual Inferences With Deep Neural Networks.
Open Mind (Camb). 2025 Feb 16;9:305-324. doi: 10.1162/opmi_a_00189.

References

1. Contrastive learning explains the emergence and function of visual category-selective regions.
Sci Adv. 2024 Sep 27;10(39):eadl1776. doi: 10.1126/sciadv.adl1776.
2. Digital Twin Studies for Reverse Engineering the Origins of Visual Intelligence.
Annu Rev Vis Sci. 2024 Sep;10(1):145-170. doi: 10.1146/annurev-vision-101322-103628.
3. Brain encoding models based on multimodal transformers can transfer across language and vision.
Adv Neural Inf Process Syst. 2023 Dec;36:29654-29666.
4. How well do models of visual cortex generalize to out of distribution samples?
PLoS Comput Biol. 2024 May 31;20(5):e1011145. doi: 10.1371/journal.pcbi.1011145.
5. Diverse task-driven modeling of macaque V4 reveals functional specialization towards semantic tasks.
PLoS Comput Biol. 2024 May 23;20(5):e1012056. doi: 10.1371/journal.pcbi.1012056.
6. Grounded language acquisition through the eyes and ears of a single child.
Science. 2024 Feb 2;383(6682):504-511. doi: 10.1126/science.adi1374.
7. High-performing neural network models of visual cortex benefit from high latent dimensionality.
PLoS Comput Biol. 2024 Jan 10;20(1):e1011792. doi: 10.1371/journal.pcbi.1011792.
8. Generalized Shape Metrics on Neural Representations.
Adv Neural Inf Process Syst. 2021 Dec;34:4738-4750.
9. Model metamers reveal divergent invariances between biological and artificial neural networks.
Nat Neurosci. 2023 Nov;26(11):2017-2034. doi: 10.1038/s41593-023-01442-0.
10. Cortical topographic motifs emerge in a self-organized map of object space.
Sci Adv. 2023 Jun 23;9(25):eade8187. doi: 10.1126/sciadv.ade8187.