Department of Cognitive Science, Johns Hopkins University, Baltimore, USA.
Department of Computer Science, Western University, London, Canada.
Sci Rep. 2023 Mar 30;13(1):5171. doi: 10.1038/s41598-023-32192-5.
Understanding actions performed by others requires us to integrate different types of information about people, scenes, objects, and their interactions. What organizing dimensions does the mind use to make sense of this complex action space? To address this question, we collected intuitive similarity judgments across two large-scale sets of naturalistic videos depicting everyday actions. We used cross-validated sparse non-negative matrix factorization to identify the structure underlying action similarity judgments. A low-dimensional representation, consisting of nine to ten dimensions, was sufficient to accurately reconstruct human similarity judgments. The dimensions were robust to stimulus set perturbations and reproducible in a separate odd-one-out experiment. Human labels mapped these dimensions onto semantic axes relating to food, work, and home life; social axes relating to people and emotions; and one visual axis related to scene setting. While highly interpretable, these dimensions did not share a clear one-to-one correspondence with prior hypotheses of action-relevant dimensions. Together, our results reveal a low-dimensional set of robust and interpretable dimensions that organize intuitive action similarity judgments and highlight the importance of data-driven investigations of behavioral representations.
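To make the method named above more concrete, the following is a minimal sketch of cross-validated sparse non-negative matrix factorization applied to a similarity matrix. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not describe the solver, the penalty, or the cross-validation scheme, so the multiplicative-update solver, the L1 weight `l1`, the entry-wise hold-out, the synthetic data, and the function names `sparse_nmf` and `heldout_error` are all hypothetical choices made for this example.

```python
import numpy as np


def sparse_nmf(S, k, mask=None, l1=0.01, n_iter=300, seed=0):
    """Approximate a non-negative matrix S by W @ H with W, H >= 0.

    Minimizes 0.5 * ||mask * (S - W @ H)||^2 + l1 * (sum(W) + sum(H))
    using multiplicative updates. `mask` is 1 for entries used in
    fitting and 0 for held-out entries (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n, m = S.shape
    if mask is None:
        mask = np.ones_like(S, dtype=float)
    W = rng.random((n, k)) + 1e-3
    H = rng.random((k, m)) + 1e-3
    eps = 1e-9  # guards against division by zero
    for _ in range(n_iter):
        WH = W @ H
        W *= ((mask * S) @ H.T) / ((mask * WH) @ H.T + l1 + eps)
        WH = W @ H
        H *= (W.T @ (mask * S)) / (W.T @ (mask * WH) + l1 + eps)
    return W, H


def heldout_error(S, k, holdout_frac=0.1, seed=0, **kwargs):
    """Fit on a random subset of entries; return MSE on the held-out rest."""
    rng = np.random.default_rng(seed)
    mask = (rng.random(S.shape) > holdout_frac).astype(float)
    W, H = sparse_nmf(S, k, mask=mask, seed=seed, **kwargs)
    test = mask == 0
    return float(np.mean((S[test] - (W @ H)[test]) ** 2))


# Synthetic stand-in for a video-by-video similarity matrix:
# 40 items described by 3 sparse, non-negative latent dimensions.
rng = np.random.default_rng(42)
X = rng.random((40, 3)) * (rng.random((40, 3)) < 0.5)
S = X @ X.T

# Compare held-out reconstruction error across candidate dimensionalities.
errors = {k: heldout_error(S, k) for k in range(1, 6)}
best_k = min(errors, key=errors.get)
print(errors, best_k)
```

In this sketch, the number of dimensions is chosen as the one with the lowest held-out reconstruction error; the L1 term pushes each item to load on only a few dimensions, which is what makes the resulting axes easier to label.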