Department of Psychology, Stanford University, Stanford, CA 94305;
Department of Computer Science, The University of Texas at Austin, Austin, TX 78712.
Proc Natl Acad Sci U S A. 2021 Jan 19;118(3). doi: 10.1073/pnas.2014196118.
Deep neural networks currently provide the best quantitative models of the response patterns of neurons throughout the primate ventral visual stream. However, such networks have remained implausible as a model of the development of the ventral stream, in part because they are trained with supervised methods requiring many more labels than are accessible to infants during development. Here, we report that recent rapid progress in unsupervised learning has largely closed this gap. We find that neural network models learned with deep unsupervised contrastive embedding methods achieve neural prediction accuracy in multiple ventral visual cortical areas that equals or exceeds that of models derived using today's best supervised methods and that the mapping of these neural network models' hidden layers is neuroanatomically consistent across the ventral stream. Strikingly, we find that these methods produce brain-like representations even when trained solely with real human child developmental data collected from head-mounted cameras, despite the fact that these datasets are noisy and limited. We also find that semisupervised deep contrastive embeddings can leverage small numbers of labeled examples to produce representations with substantially improved error-pattern consistency to human behavior. Taken together, these results illustrate a use of unsupervised learning to provide a quantitative model of a multiarea cortical brain system and present a strong candidate for a biologically plausible computational theory of primate sensory learning.