Representation of locomotive action affordances in human behavior, brains, and deep neural networks.

Author information

Bartnik Clemens G, Sartzetaki Christina, Sanchez Abel Puigseslloses, Molenkamp Elijah, Bommer Steven, Vukšić Nikolina, Groen Iris I A

Affiliations

Informatics Institute, Video and Image Sense Lab, University of Amsterdam, Amsterdam 1098 XH, The Netherlands.

Department of Brain and Cognition, Psychology Research Institute, University of Amsterdam, Amsterdam 1018 WS, The Netherlands.

Publication information

Proc Natl Acad Sci U S A. 2025 Jun 17;122(24):e2414005122. doi: 10.1073/pnas.2414005122. Epub 2025 Jun 12.

Abstract

To decide how to move around the world, we must determine which locomotive actions (e.g., walking, swimming, or climbing) are afforded by the immediate visual environment. The neural basis of our ability to recognize locomotive affordances is unknown. Here, we compare human behavioral annotations, functional MRI (fMRI) measurements, and deep neural network (DNN) activations to both indoor and outdoor real-world images to demonstrate that the human visual cortex represents locomotive action affordances in complex visual scenes. Hierarchical clustering of behavioral annotations of six possible locomotive actions shows that humans group environments into distinct affordance clusters using at least three separate dimensions. Representational similarity analysis of multivoxel fMRI responses in the scene-selective visual cortex shows that perceived locomotive affordances are represented independently of other scene properties, such as objects, surface materials, scene category, or global properties, and independently of the task performed in the scanner. Visual feature activations from DNNs trained on object or scene classification, as well as on a range of other visual understanding tasks, correlate less strongly with behavioral and neural representations of locomotive affordances than with object representations. Training DNNs directly on affordance labels or using affordance-centered language embeddings increases alignment with human behavior, but none of the tested models fully captures locomotive action affordance perception. These results uncover a type of representation in the human brain that reflects locomotive action affordances.
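For readers unfamiliar with representational similarity analysis, the following is a minimal sketch of the general technique, not the authors' pipeline. It assumes correlation-distance dissimilarity matrices and a rank correlation between their upper triangles; the data are synthetic, and every name (`rdm`, `rsa_score`, `behavior`, `brain`) and array size other than the six actions is hypothetical.

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between rows."""
    return 1.0 - np.corrcoef(patterns)

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries as a flat vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]

def ranks(values):
    """Ordinal ranks; ties are not averaged, which is fine for continuous data."""
    order = np.argsort(values)
    result = np.empty(len(values), dtype=float)
    result[order] = np.arange(len(values))
    return result

def rsa_score(rdm_a, rdm_b):
    """Rank (Spearman-style) correlation between the upper triangles of two RDMs."""
    a = ranks(upper_triangle(rdm_a))
    b = ranks(upper_triangle(rdm_b))
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic stand-ins (assumption): 20 images rated on 6 locomotive actions,
# and 50 simulated voxels whose responses depend linearly on those ratings.
rng = np.random.default_rng(0)
n_images, n_actions, n_voxels = 20, 6, 50
behavior = rng.random((n_images, n_actions))
weights = rng.normal(size=(n_actions, n_voxels))
brain = behavior @ weights + 0.1 * rng.normal(size=(n_images, n_voxels))

score = rsa_score(rdm(behavior), rdm(brain))
print(f"behavior-brain RDM correlation: {score:.3f}")
```

Each row of `behavior` or `brain` is the response pattern for one image, so the two matrices can be compared even though they have different numbers of columns; only the image-by-image dissimilarity structure enters the score.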
