Capturing the objects of vision with neural networks.

Authors

Peters Benjamin, Kriegeskorte Nikolaus

Affiliations

Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA.

Department of Psychology, Columbia University, New York, NY, USA.

Publication

Nat Hum Behav. 2021 Sep;5(9):1127-1144. doi: 10.1038/s41562-021-01194-6. Epub 2021 Sep 20.

DOI: 10.1038/s41562-021-01194-6
PMID: 34545237
Abstract

Human visual perception carves a scene at its physical joints, decomposing the world into objects, which are selectively attended, tracked and predicted as we engage our surroundings. Object representations emancipate perception from the sensory input, enabling us to keep in mind that which is out of sight and to use perceptual content as a basis for action and symbolic cognition. Human behavioural studies have documented how object representations emerge through grouping, amodal completion, proto-objects and object files. By contrast, deep neural network models of visual object recognition remain largely tethered to sensory input, despite achieving human-level performance at labelling objects. Here, we review related work in both fields and examine how these fields can help each other. The cognitive literature provides a starting point for the development of new experimental tasks that reveal mechanisms of human object perception and serve as benchmarks driving the development of deep neural network models that will put the object into object recognition.


Similar Articles

1
Capturing the objects of vision with neural networks.
Nat Hum Behav. 2021 Sep;5(9):1127-1144. doi: 10.1038/s41562-021-01194-6. Epub 2021 Sep 20.
2
Unsupervised changes in core object recognition behavior are predicted by neural plasticity in inferior temporal cortex.
Elife. 2021 Jun 11;10:e60830. doi: 10.7554/eLife.60830.
3
Large-Scale, High-Resolution Comparison of the Core Visual Object Recognition Behavior of Humans, Monkeys, and State-of-the-Art Deep Artificial Neural Networks.
J Neurosci. 2018 Aug 15;38(33):7255-7269. doi: 10.1523/JNEUROSCI.0388-18.2018. Epub 2018 Jul 13.
4
Accuracy of Rats in Discriminating Visual Objects Is Explained by the Complexity of Their Perceptual Strategy.
Curr Biol. 2018 Apr 2;28(7):1005-1015.e5. doi: 10.1016/j.cub.2018.02.037. Epub 2018 Mar 15.
5
Common Object Representations for Visual Production and Recognition.
Cogn Sci. 2018 Nov;42(8):2670-2698. doi: 10.1111/cogs.12676. Epub 2018 Aug 20.
6
Visual Object Recognition: Do We (Finally) Know More Now Than We Did?
Annu Rev Vis Sci. 2016 Oct 14;2:377-396. doi: 10.1146/annurev-vision-111815-114621. Epub 2016 Aug 3.
7
The Ventral Visual Pathway Represents Animal Appearance over Animacy, Unlike Human Behavior and Deep Neural Networks.
J Neurosci. 2019 Aug 14;39(33):6513-6525. doi: 10.1523/JNEUROSCI.1714-18.2019. Epub 2019 Jun 13.
8
Atoms of recognition in human and computer vision.
Proc Natl Acad Sci U S A. 2016 Mar 8;113(10):2744-9. doi: 10.1073/pnas.1513198113. Epub 2016 Feb 16.
9
Invariant recognition drives neural representations of action sequences.
PLoS Comput Biol. 2017 Dec 18;13(12):e1005859. doi: 10.1371/journal.pcbi.1005859. eCollection 2017 Dec.
10
Relating Visual Production and Recognition of Objects in Human Visual Cortex.
J Neurosci. 2020 Feb 19;40(8):1710-1721. doi: 10.1523/JNEUROSCI.1843-19.2019. Epub 2019 Dec 23.

Cited By

1
Grouping signals in primate visual cortex.
Neuron. 2025 Aug 6;113(15):2508-2520.e5. doi: 10.1016/j.neuron.2025.05.003. Epub 2025 Jun 2.
2
Parallelizing analog in-sensor visual processing with arrays of gate-tunable silicon photodetectors.
Nat Commun. 2025 May 21;16(1):4728. doi: 10.1038/s41467-025-60006-x.
3
Lightweight error-tolerant edge detection using memristor-enabled stochastic computing.
Nat Commun. 2025 May 16;16(1):4550. doi: 10.1038/s41467-025-59872-2.
4
Beyond binding: from modular to natural vision.
Trends Cogn Sci. 2025 Jun;29(6):505-515. doi: 10.1016/j.tics.2025.03.002. Epub 2025 Apr 14.
5
The influence of a moving object's location on object identity judgments.
J Exp Psychol Hum Percept Perform. 2025 Jun;51(6):764-780. doi: 10.1037/xhp0001311. Epub 2025 Mar 27.
6
Hyperbolic vision language representation learning on chest radiology images.
Health Inf Sci Syst. 2025 Mar 9;13(1):27. doi: 10.1007/s13755-025-00341-x. eCollection 2025 Dec.
7
Unraveling the complexity of rat object vision requires a full convolutional network and beyond.
Patterns (N Y). 2025 Jan 17;6(2):101149. doi: 10.1016/j.patter.2024.101149. eCollection 2025 Feb 14.
8
Orthogonal neural representations support perceptual judgments of natural stimuli.
Sci Rep. 2025 Feb 13;15(1):5316. doi: 10.1038/s41598-025-88910-8.
9
A robotics-inspired scanpath model reveals the importance of uncertainty and semantic object cues for gaze guidance in dynamic scenes.
J Vis. 2025 Feb 3;25(2):6. doi: 10.1167/jov.25.2.6.
10
Efficient Carbon-Based Optoelectronic Synapses for Dynamic Visual Recognition.
Adv Sci (Weinh). 2025 Mar;12(11):e2414319. doi: 10.1002/advs.202414319. Epub 2025 Jan 22.

References

1
SAYCam: A Large, Longitudinal Audiovisual Dataset Recorded From the Infant's Perspective.
Open Mind (Camb). 2021 May 26;5:20-29. doi: 10.1162/opmi_a_00039. eCollection 2021.
2
Controversial stimuli: Pitting neural networks against each other as models of human cognition.
Proc Natl Acad Sci U S A. 2020 Nov 24;117(47):29330-29337. doi: 10.1073/pnas.1912334117.
3
Rapid trial-and-error learning with simulation supports flexible tool use and physical reasoning.
Proc Natl Acad Sci U S A. 2020 Nov 24;117(47):29302-29310. doi: 10.1073/pnas.1912341117.
4
The Tolman-Eichenbaum Machine: Unifying Space and Relational Memory through Generalization in the Hippocampal Formation.
Cell. 2020 Nov 25;183(5):1249-1263.e23. doi: 10.1016/j.cell.2020.10.024. Epub 2020 Nov 11.
5
Learning ambidextrous robot grasping policies.
Sci Robot. 2019 Jan 16;4(26). doi: 10.1126/scirobotics.aau4984.
6
Recurrent neural networks can explain flexible trading of speed and accuracy in biological vision.
PLoS Comput Biol. 2020 Oct 2;16(10):e1008215. doi: 10.1371/journal.pcbi.1008215. eCollection 2020 Oct.
7
Integrative Benchmarking to Advance Neurally Mechanistic Models of Human Intelligence.
Neuron. 2020 Nov 11;108(3):413-423. doi: 10.1016/j.neuron.2020.07.040. Epub 2020 Sep 11.
8
Context information supports serial dependence of multiple visual objects across memory episodes.
Nat Commun. 2020 Apr 22;11(1):1932. doi: 10.1038/s41467-020-15874-w.
9
Backpropagation and the brain.
Nat Rev Neurosci. 2020 Jun;21(6):335-346. doi: 10.1038/s41583-020-0277-3. Epub 2020 Apr 17.
10
A deep learning framework for neuroscience.
Nat Neurosci. 2019 Nov;22(11):1761-1770. doi: 10.1038/s41593-019-0520-2. Epub 2019 Oct 28.