Unsupervised neural network models of the ventral visual stream.

Author affiliations

Department of Psychology, Stanford University, Stanford, CA 94305;

Department of Computer Science, The University of Texas at Austin, Austin, TX 78712.

Publication information

Proc Natl Acad Sci U S A. 2021 Jan 19;118(3). doi: 10.1073/pnas.2014196118.

DOI:10.1073/pnas.2014196118

PMID:33431673

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7826371/

Abstract

Deep neural networks currently provide the best quantitative models of the response patterns of neurons throughout the primate ventral visual stream. However, such networks have remained implausible as a model of the development of the ventral stream, in part because they are trained with supervised methods requiring many more labels than are accessible to infants during development. Here, we report that recent rapid progress in unsupervised learning has largely closed this gap. We find that neural network models learned with deep unsupervised contrastive embedding methods achieve neural prediction accuracy in multiple ventral visual cortical areas that equals or exceeds that of models derived using today's best supervised methods and that the mapping of these neural network models' hidden layers is neuroanatomically consistent across the ventral stream. Strikingly, we find that these methods produce brain-like representations even when trained solely with real human child developmental data collected from head-mounted cameras, despite the fact that these datasets are noisy and limited. We also find that semisupervised deep contrastive embeddings can leverage small numbers of labeled examples to produce representations with substantially improved error-pattern consistency to human behavior. Taken together, these results illustrate a use of unsupervised learning to provide a quantitative model of a multiarea cortical brain system and present a strong candidate for a biologically plausible computational theory of primate sensory learning.
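The abstract does not say which contrastive objective the models were trained with, so the following is only a minimal sketch of one common form, an InfoNCE-style loss, to illustrate what a contrastive embedding objective rewards. The function name `info_nce_loss`, the temperature value, and the random toy embeddings are all assumptions made for illustration; none of them come from the paper.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.1):
    """InfoNCE-style contrastive loss (illustrative, not the paper's method).

    z_a[i] and z_b[i] are embeddings of two views of the same image
    (a positive pair); every other row of z_b serves as a negative
    for z_a[i].
    """
    # L2-normalise so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # (n, n) matrix of scaled similarities between all pairs.
    logits = z_a @ z_b.T / temperature
    # Subtract the row maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive pairs sit on the diagonal; average their negative log-probability.
    return -np.mean(np.diag(log_softmax))


# Toy data (assumed): 8 random 16-dimensional embeddings.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 16))
aligned = anchors + 0.01 * rng.normal(size=(8, 16))  # near-identical second views
shuffled = aligned[::-1].copy()                      # positives deliberately misaligned

loss_aligned = info_nce_loss(anchors, aligned)
loss_shuffled = info_nce_loss(anchors, shuffled)
print(f"aligned pairs:  {loss_aligned:.4f}")
print(f"shuffled pairs: {loss_shuffled:.4f}")
```

The loss is low when each embedding is closest to its own second view and high when the pairing is broken, which is the sense in which such objectives need no category labels.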

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3964/7826371/332ed8a67c9a/pnas.2014196118fig01.jpg
