
Efficient inverse graphics in biological face processing.

Affiliations

Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA.

Department of Psychology, Yale University, New Haven, CT, USA.

Publication information

Sci Adv. 2020 Mar 4;6(10):eaax5979. doi: 10.1126/sciadv.aax5979. eCollection 2020 Mar.

DOI: 10.1126/sciadv.aax5979
PMID: 32181338
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7056304/
Abstract

Vision not only detects and recognizes objects, but performs rich inferences about the underlying scene structure that causes the patterns of light we see. Inverting generative models, or "analysis-by-synthesis", presents a possible solution, but its mechanistic implementations have typically been too slow for online perception, and their mapping to neural circuits remains unclear. Here we present a neurally plausible efficient inverse graphics model and test it in the domain of face recognition. The model is based on a deep neural network that learns to invert a three-dimensional face graphics program in a single fast feedforward pass. It explains human behavior qualitatively and quantitatively, including the classic "hollow face" illusion, and it maps directly onto a specialized face-processing circuit in the primate brain. The model fits both behavioral and neural data better than state-of-the-art computer vision models, and suggests an interpretable reverse-engineering account of how the brain transforms images into percepts.
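The core idea in the abstract, learning a feedforward map that inverts a generative (graphics) model in one pass, can be illustrated with a toy sketch. This is not the authors' model: the "renderer" below is a hypothetical random nonlinear map standing in for a 3D face graphics program, and a least-squares fit on polynomial features stands in for the deep network. All names (`render`, `train_inverse`, `infer`) and dimensions are assumptions made for illustration.

```python
import numpy as np

LATENT_DIM, IMAGE_DIM = 4, 32

# Hypothetical stand-in for a graphics program: a fixed random projection
# followed by a mild nonlinearity. A real renderer would map shape, texture,
# pose and lighting parameters to pixels.
_W_RENDER = 0.3 * np.random.default_rng(0).normal(size=(IMAGE_DIM, LATENT_DIM))


def render(latents):
    """Toy generative model: scene latents (n, LATENT_DIM) -> images (n, IMAGE_DIM)."""
    return np.tanh(latents @ _W_RENDER.T)


def _features(images):
    """Fixed polynomial features plus a bias column (stand-in for learned layers)."""
    return np.hstack([images, images ** 3, np.ones((len(images), 1))])


def train_inverse(n_samples=5000, seed=1):
    """Fit the inverse map on synthetic (image, latent) pairs drawn from the generator."""
    rng = np.random.default_rng(seed)
    latents = rng.uniform(-1.0, 1.0, size=(n_samples, LATENT_DIM))
    images = render(latents)
    weights, *_ = np.linalg.lstsq(_features(images), latents, rcond=None)
    return weights


def infer(images, weights):
    """Single feedforward pass: image -> estimated latents, with no iterative search."""
    return _features(images) @ weights


# Usage: train on samples from the generator, then invert unseen images.
weights = train_inverse()
true_latents = np.random.default_rng(2).uniform(-1.0, 1.0, size=(5, LATENT_DIM))
estimated = infer(render(true_latents), weights)
print("mean squared error:", np.mean((estimated - true_latents) ** 2))
```

The design point the sketch tries to capture is that the cost of inversion is paid during training on samples from the generative model, so inference itself reduces to one forward computation rather than an iterative analysis-by-synthesis loop.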

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b25a/7056304/cc5889491cba/aax5979-F1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b25a/7056304/ba18f3a7a5b3/aax5979-F2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b25a/7056304/ad623aa6d69b/aax5979-F3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b25a/7056304/77973e1b5136/aax5979-F4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b25a/7056304/f42c649270d6/aax5979-F5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b25a/7056304/339a061ddd83/aax5979-F6.jpg

Similar articles

1
Efficient inverse graphics in biological face processing.
Sci Adv. 2020 Mar 4;6(10):eaax5979. doi: 10.1126/sciadv.aax5979. eCollection 2020 Mar.
2
Large-Scale, High-Resolution Comparison of the Core Visual Object Recognition Behavior of Humans, Monkeys, and State-of-the-Art Deep Artificial Neural Networks.
J Neurosci. 2018 Aug 15;38(33):7255-7269. doi: 10.1523/JNEUROSCI.0388-18.2018. Epub 2018 Jul 13.
3
Vision as temporal trace.
Spat Vis. 2000;13(2-3):215-29. doi: 10.1163/156856800741225.
4
A single glance at natural face images generate larger and qualitatively different category-selective spatio-temporal signatures than other ecologically-relevant categories in the human brain.
Neuroimage. 2016 Aug 15;137:21-33. doi: 10.1016/j.neuroimage.2016.04.045. Epub 2016 Apr 30.
5
Deep neural networks rival the representation of primate IT cortex for core visual object recognition.
PLoS Comput Biol. 2014 Dec 18;10(12):e1003963. doi: 10.1371/journal.pcbi.1003963. eCollection 2014 Dec.
6
The fusiform face area responds equivalently to faces and abstract shapes in the left and central visual fields.
Neuroimage. 2013 Dec;83:408-17. doi: 10.1016/j.neuroimage.2013.06.032. Epub 2013 Jun 15.
7
Visual Object Recognition: Do We (Finally) Know More Now Than We Did?
Annu Rev Vis Sci. 2016 Oct 14;2:377-396. doi: 10.1146/annurev-vision-111815-114621. Epub 2016 Aug 3.
8
Unsupervised deep learning identifies semantic disentanglement in single inferotemporal face patch neurons.
Nat Commun. 2021 Nov 9;12(1):6456. doi: 10.1038/s41467-021-26751-5.
9
Single-unit activity during natural vision: diversity, consistency, and spatial sensitivity among AF face patch neurons.
J Neurosci. 2015 Apr 8;35(14):5537-48. doi: 10.1523/JNEUROSCI.3825-14.2015.
10
The speed of sight.
J Cogn Neurosci. 2001 Jan 1;13(1):90-101. doi: 10.1162/089892901564199.

Cited by

1
LGD_Net: Capsule network with extreme learning machine for classification of lung diseases using CT scans.
PLoS One. 2025 Aug 8;20(8):e0327419. doi: 10.1371/journal.pone.0327419. eCollection 2025.
2
Computational models reveal that intuitive physics underlies visual processing of soft objects.
Nat Commun. 2025 Jul 9;16(1):6303. doi: 10.1038/s41467-025-61458-x.
3
Multiarea processing in body patches of the primate inferotemporal cortex implements inverse graphics.
Proc Natl Acad Sci U S A. 2025 Jul 15;122(28):e2420287122. doi: 10.1073/pnas.2420287122. Epub 2025 Jul 8.
4
Approximating Human-Level 3D Visual Inferences With Deep Neural Networks.
Open Mind (Camb). 2025 Feb 16;9:305-324. doi: 10.1162/opmi_a_00189. eCollection 2025.
5
A novel, rapid, quantitative method for face discrimination.
PLoS One. 2024 Dec 23;19(12):e0315998. doi: 10.1371/journal.pone.0315998. eCollection 2024.
6
Building machines that learn and think with people.
Nat Hum Behav. 2024 Oct;8(10):1851-1863. doi: 10.1038/s41562-024-01991-9. Epub 2024 Oct 22.
7
Contrastive learning explains the emergence and function of visual category-selective regions.
Sci Adv. 2024 Sep 27;10(39):eadl1776. doi: 10.1126/sciadv.adl1776. Epub 2024 Sep 25.
8
Drawing as a versatile cognitive tool.
Nat Rev Psychol. 2023 Sep;2(9):556-568. doi: 10.1038/s44159-023-00212-w. Epub 2023 Jul 17.
9
The attentive reconstruction of objects facilitates robust object recognition.
PLoS Comput Biol. 2024 Jun 13;20(6):e1012159. doi: 10.1371/journal.pcbi.1012159. eCollection 2024 Jun.
10
Computational reconstruction of mental representations using human behavior.
Nat Commun. 2024 May 17;15(1):4183. doi: 10.1038/s41467-024-48114-6.