Scale and translation-invariance for novel objects in human vision.

Affiliations

Center for Brains, Minds and Machines, MIT, 77 Massachusetts Ave, Cambridge, MA, 02139, United States of America.

Computer Science Department, Goethe University Frankfurt, Frankfurt am Main, Germany.

Publication information

Sci Rep. 2020 Jan 29;10(1):1411. doi: 10.1038/s41598-019-57261-6.

DOI: 10.1038/s41598-019-57261-6
PMID: 31996698
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6989457/
Abstract

Though the range of invariance in recognition of novel objects is a basic aspect of human vision, its characterization has remained surprisingly elusive. Here we report tolerance to scale and position changes in one-shot learning by measuring recognition accuracy of Korean letters presented in a flash to non-Korean subjects who had no previous experience with Korean letters. We found that humans have significant scale-invariance after only a single exposure to a novel object. The range of translation-invariance is limited, depending on the size and position of presented objects. To understand the underlying brain computation associated with the invariance properties, we compared experimental data with computational modeling results. Our results suggest that to explain invariant recognition of objects by humans, neural network models should explicitly incorporate built-in scale-invariance, by encoding different scale channels as well as eccentricity-dependent representations captured by neurons' receptive field sizes and sampling density that change with eccentricity. Our psychophysical experiments and related simulations strongly suggest that the human visual system uses a computational strategy that differs in some key aspects from current deep learning architectures, being more data efficient and relying more critically on eye-movements.
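
The idea of scale channels combined with eccentricity-dependent sampling can be illustrated with a minimal sketch. This is not the authors' model: the class `ScaleChannel`, the functions `make_channels` and `active_channels`, and every numeric parameter (smallest receptive field, scale ratio, coverage factor) are hypothetical choices made only for illustration. The sketch assumes that each channel has a fixed receptive field size and samples the visual field only out to an eccentricity proportional to that size, so fine channels exist only near fixation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleChannel:
    rf_size: float  # receptive field diameter in degrees (hypothetical value)
    max_ecc: float  # largest eccentricity this channel samples, in degrees


def make_channels(smallest_rf=0.25, n_channels=5, ratio=2.0, ecc_per_rf=10.0):
    """Build scale channels whose visual-field coverage grows with RF size.

    All defaults are illustrative assumptions, not values from the paper.
    """
    channels = []
    for i in range(n_channels):
        rf = smallest_rf * ratio ** i
        channels.append(ScaleChannel(rf_size=rf, max_ecc=rf * ecc_per_rf))
    return channels


def active_channels(channels, eccentricity):
    """Return the channels that sample a given eccentricity (degrees from fixation)."""
    return [c for c in channels if abs(eccentricity) <= c.max_ecc]


if __name__ == "__main__":
    channels = make_channels()
    for ecc in (0.0, 3.0, 30.0):
        sizes = [c.rf_size for c in active_channels(channels, ecc)]
        print(f"eccentricity {ecc:5.1f} deg -> RF sizes available: {sizes}")
```

Under these assumptions, all scale channels are available at fixation, which is consistent with scale tolerance for a centrally presented object, while only coarse channels remain in the periphery, so a small object shifted far from fixation falls outside the channels that encoded it. This matches the qualitative claim that the range of translation-invariance depends on object size and position.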

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02e9/6989457/cc0b7eb46715/41598_2019_57261_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02e9/6989457/e47ec96c8415/41598_2019_57261_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02e9/6989457/d734f0e493b0/41598_2019_57261_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02e9/6989457/783face69165/41598_2019_57261_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02e9/6989457/33267aee90b2/41598_2019_57261_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02e9/6989457/9ff8e55a90ac/41598_2019_57261_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02e9/6989457/e99581b3e693/41598_2019_57261_Fig7_HTML.jpg