Chen Zirui, Bonner Michael F
Department of Cognitive Science, Johns Hopkins University, Baltimore, MD 21218, USA.
Sci Adv. 2025 Jul 4;11(27):eadw7697. doi: 10.1126/sciadv.adw7697. Epub 2025 Jul 2.
Do visual neural networks learn brain-aligned representations because they share architectural constraints and task objectives with biological vision or because they share universal features of natural image processing? We characterized the universality of hundreds of thousands of representational dimensions from networks with different architectures, tasks, and training data. We found that diverse networks learn to represent natural images using a shared set of latent dimensions, despite having highly distinct designs. Next, by comparing these networks with human brain representations measured with functional magnetic resonance imaging, we found that the most brain-aligned representations in neural networks are those that are universal and independent of a network's specific characteristics. Each network can be reduced to fewer than 10 of its most universal dimensions with little impact on its representational similarity to the brain. These results suggest that the underlying similarities between artificial and biological vision are primarily governed by a core set of universal representations that are convergently learned by diverse systems.
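As a rough illustration of what scoring "universality" per dimension and then keeping only the most universal dimensions could look like, here is a minimal sketch on synthetic data. It is not the authors' pipeline: the abstract does not specify the metric, so the sketch assumes universality can be approximated by the mean in-sample R² with which a dimension is linearly predicted from each other network's responses to the same images. The function names (`universality_scores`, `top_universal`, `make_toy_networks`) and the toy data are hypothetical.

```python
import numpy as np


def universality_scores(target, others):
    """Score each column of `target` (images x dims) by how well it is
    linearly predicted from each matrix in `others` (images x features).

    Assumption: universality is approximated by mean in-sample R^2 across
    the other networks. No cross-validation is done, and columns of
    `target` are assumed to be non-constant.
    """
    t = target - target.mean(axis=0)
    total = (t ** 2).sum(axis=0)
    per_network = []
    for feats in others:
        f = feats - feats.mean(axis=0)
        # Ordinary least squares fit of every target dimension at once.
        coef, *_ = np.linalg.lstsq(f, t, rcond=None)
        resid = t - f @ coef
        per_network.append(1.0 - (resid ** 2).sum(axis=0) / total)
    return np.mean(per_network, axis=0)


def top_universal(target, others, k=10):
    """Keep the k dimensions of `target` with the highest universality score."""
    scores = universality_scores(target, others)
    keep = np.argsort(scores)[::-1][:k]
    return target[:, keep], keep


def make_toy_networks(seed=0, n_images=500):
    """Synthetic stand-in for network activations: three latent dimensions
    shared by all networks, plus three private noise dimensions in network A."""
    rng = np.random.default_rng(seed)
    shared = rng.normal(size=(n_images, 3))
    net_a = np.hstack([shared, rng.normal(size=(n_images, 3))])
    others = [
        shared @ rng.normal(size=(3, 8)) + 0.1 * rng.normal(size=(n_images, 8))
        for _ in range(2)
    ]
    return net_a, others


if __name__ == "__main__":
    net_a, others = make_toy_networks()
    reduced, keep = top_universal(net_a, others, k=3)
    print("universality scores:", np.round(universality_scores(net_a, others), 3))
    print("kept dimensions:", sorted(keep.tolist()))
```

In this toy setup the shared dimensions should score far above the private ones. A real analysis would need held-out images and a separate comparison of the reduced representation against fMRI responses.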