
Dimensions underlying the representational alignment of deep neural networks with humans.

Author information

Mahner Florian P, Muttenthaler Lukas, Güçlü Umut, Hebart Martin N

Affiliations

Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.

Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands.

Publication information

Nat Mach Intell. 2025;7(6):848-859. doi: 10.1038/s42256-025-01041-7. Epub 2025 Jun 23.

Abstract

Determining the similarities and differences between humans and artificial intelligence (AI) is an important goal in both computational cognitive neuroscience and machine learning, promising a deeper understanding of human cognition and safer, more reliable AI systems. Much previous work comparing representations in humans and AI has relied on global, scalar measures to quantify their alignment. However, without explicit hypotheses, these measures only inform us about the degree of alignment, not the factors that determine it. To address this challenge, we propose a generic framework to compare human and AI representations, based on identifying latent representational dimensions underlying the same behaviour in both domains. Applying this framework to humans and a deep neural network (DNN) model of natural images revealed a low-dimensional DNN embedding of both visual and semantic dimensions. In contrast to humans, DNNs exhibited a clear dominance of visual over semantic properties, indicating divergent strategies for representing images. Although in silico experiments showed seemingly consistent interpretability of DNN dimensions, a direct comparison between human and DNN representations revealed substantial differences in how they process images. By making representations directly comparable, our results reveal important challenges for representational alignment and offer a means for improving their comparability.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9b4b/12185338/ef40032fb5dc/42256_2025_1041_Fig1_HTML.jpg
