Department of Psychology, Vanderbilt University, USA.
Cognition. 2024 Nov;252:105920. doi: 10.1016/j.cognition.2024.105920. Epub 2024 Aug 19.
We explore how deep neural networks (DNNs) can be used to develop a computational understanding of individual differences in high-level visual cognition, given their ability to generate rich, meaningful object representations shaped by their architecture, experience, and training protocols. As a first step toward quantifying individual differences in DNN representations, we systematically explored the robustness of a variety of representational similarity measures: Representational Similarity Analysis (RSA), Centered Kernel Alignment (CKA), and Projection-Weighted Canonical Correlation Analysis (PWCCA), with an eye to how these measures are used in cognitive science, cognitive neuroscience, and vision science. To manipulate object representations, we then created a large set of models varying in random initial weights and random training image order, training image frequencies, training category frequencies, and model size and architecture, and measured the representational variation caused by each manipulation. We examined both small (All-CNN-C) and commonly used large (VGG and ResNet) DNN architectures. To provide a comparison for the magnitude of representational differences, we established a baseline based on the representational variation caused by the image-augmentation techniques used to train those DNNs. We found that variation due to model randomization and model size never exceeded this baseline. By contrast, differences in training image frequency and training category frequency produced representational variation that exceeded baseline, with the training category frequency manipulations exceeding baseline earlier in the networks. These findings clarify the magnitude of representational variation to expect from a range of manipulations and provide a springboard for further exploration of systematic model variations aimed at modeling individual differences in high-level visual cognition.
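To make the representational similarity measures named above concrete, the minimal sketch below shows how RSA and linear CKA can be computed between two activation matrices. This is not the authors' code; the function names, the toy data, and the choice of Spearman correlation for RSA are illustrative assumptions, and PWCCA is omitted for brevity.

```python
# Minimal sketch (assumptions, not the paper's pipeline): RSA and linear CKA
# between two hypothetical activation matrices of shape (n_stimuli, n_features).
import numpy as np
from scipy.stats import spearmanr

def rdm(X):
    """Representational dissimilarity matrix: 1 - Pearson r between stimulus vectors."""
    return 1.0 - np.corrcoef(X)

def rsa(X, Y):
    """RSA score: Spearman correlation of the upper triangles of the two RDMs."""
    iu = np.triu_indices(X.shape[0], k=1)
    rho, _ = spearmanr(rdm(X)[iu], rdm(Y)[iu])
    return rho

def linear_cka(X, Y):
    """Linear CKA between two (n_stimuli, n_features) activation matrices."""
    # Center each feature so the measure is invariant to mean offsets.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    return np.linalg.norm(Y.T @ X, 'fro') ** 2 / (
        np.linalg.norm(X.T @ X, 'fro') * np.linalg.norm(Y.T @ Y, 'fro'))

# Toy usage: 100 "stimuli" represented by two 64-unit layers from different models.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 64))
B = A @ rng.normal(size=(64, 64)) + 0.5 * rng.normal(size=(100, 64))
print(f"RSA: {rsa(A, B):.3f}  CKA: {linear_cka(A, B):.3f}")
```

Both measures return higher values for more similar representations; in the paper's setting, such scores would be compared against the augmentation-based baseline described in the abstract.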