Thomas Yerxa, Jenelle Feather, Eero P. Simoncelli, SueYeon Chung
Center for Neural Science, New York University.
Center for Computational Neuroscience, Flatiron Institute, Simons Foundation.
Adv Neural Inf Process Syst. 2024;37:96045-96070.
Models trained with self-supervised learning (SSL) objectives have recently matched or surpassed models trained on traditional supervised object recognition in their ability to predict the responses of object-selective neurons in the primate visual system. A self-supervised learning objective is arguably a more biologically plausible organizing principle, as the optimization does not require a large number of labeled examples. However, typical self-supervised objectives may yield network representations that are overly invariant to changes in the input. Here, we show that representations with structured variability to input transformations are better aligned with known features of visual perception and neural computation. We introduce a novel framework for converting standard invariant SSL losses into "contrastive-equivariant" versions that encourage preservation of input transformations without supervised access to the transformation parameters. We demonstrate that our proposed method systematically increases the ability of models to predict responses in macaque inferior temporal cortex. Our results demonstrate the promise of incorporating known features of neural computation into task-optimization for building better models of visual cortex.
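The contrast between an invariant SSL objective and an equivariance-encouraging one can be sketched in a few lines. The following is a hypothetical illustration, not the paper's actual loss: the function names, the use of per-sample embedding differences as "transformation codes," and the specific contrastive term are all assumptions made for exposition. A standard invariant loss pulls embeddings of two augmented views together; a contrastive-equivariant variant instead asks that the transformation between views remain decodable from the representation, without access to the transformation parameters themselves.

```python
import numpy as np

def cosine(a, b, eps=1e-8):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def invariant_loss(z1, z2):
    # Standard invariant SSL term: pull each sample's two view
    # embeddings together (minimized when views are identical).
    return -np.mean([cosine(a, b) for a, b in zip(z1, z2)])

def contrastive_equivariant_loss(z1, z2, eps=1e-8):
    # Hypothetical sketch (not the paper's exact objective): treat the
    # per-sample difference between view embeddings as an implicit code
    # for the applied transformation, and push distinct samples' codes
    # apart so transformations stay represented rather than collapsed,
    # with no supervised access to the transformation parameters.
    d = z2 - z1                                        # (N, D) codes
    d = d / (np.linalg.norm(d, axis=1, keepdims=True) + eps)
    sim = d @ d.T                                      # pairwise similarity
    n = len(d)
    off_diag = sim[~np.eye(n, dtype=bool)]
    # Minimizing mean off-diagonal similarity spreads the codes out.
    return float(np.mean(off_diag))
```

Under this sketch, a network trained only with `invariant_loss` is free to discard transformation information entirely, while adding the second term penalizes that collapse, which is the "structured variability" the abstract argues for.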