Object detection through search with a foveated visual system.

Authors

Akbas Emre, Eckstein Miguel P

Affiliations

Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, California, United States of America.

Department of Computer Engineering, Middle East Technical University, Ankara, Turkey.

Publication information

PLoS Comput Biol. 2017 Oct 9;13(10):e1005743. doi: 10.1371/journal.pcbi.1005743. eCollection 2017 Oct.

DOI:10.1371/journal.pcbi.1005743
PMID:28991906
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5669499/
Abstract

Humans and many other species sense visual information with varying spatial resolution across the visual field (foveated vision) and deploy eye movements to actively sample regions of interest in scenes. The advantage of such a varying-resolution architecture is a reduced computational, and hence metabolic, cost. But what are the performance costs of such a processing strategy relative to a scheme that processes the visual field at high spatial resolution? Here we first focus on visual search and combine object detectors from computer vision with a recent model of peripheral pooling regions found at the V1 layer of the human visual system. We develop a foveated object detector that processes the entire scene with varying resolution, uses retino-specific object detection classifiers to guide eye movements, aligns its fovea with regions of interest in the input image and integrates observations across multiple fixations. We compared the foveated object detector against a non-foveated version of the same object detector, which processes the entire image at homogeneous high spatial resolution. We evaluated the accuracy of the foveated and non-foveated object detectors in identifying 20 different object classes in scenes from a standard computer vision data set (the PASCAL VOC 2007 dataset). We show that the foveated object detector can approximate the performance of the object detector with homogeneous high spatial resolution processing while bringing significant computational cost savings. Additionally, we assessed the impact of foveation on the computation of bottom-up saliency. An implementation of a simple foveated bottom-up saliency model with eye movements showed agreement in the selection of top salient regions of scenes with those selected by a non-foveated high-resolution saliency model.
Together, our results might help explain the evolution of foveated visual systems with eye movements as a solution that preserves perceptual performance in visual search while resulting in computational and metabolic savings to the brain.
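
The loop the abstract describes (view the scene at eccentricity-dependent resolution, move the fovea to the most promising location, and integrate evidence across fixations) can be illustrated with a toy sketch. This is not the authors' detector. The function names, the linear growth of the pooling window with eccentricity, the `slope` value, mean pooling, max-based integration, and the greedy choice of the next fixation are all assumptions made for illustration.

```python
import math

def pool_radius(ecc, slope=0.5):
    """Half-width of the pooling window (assumed to grow linearly with eccentricity)."""
    return int(slope * ecc)

def foveated_view(scores, fixation, slope=0.5):
    """Average each location's score over a window that widens away from the fixation."""
    h, w = len(scores), len(scores[0])
    fy, fx = fixation
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            r = pool_radius(math.hypot(y - fy, x - fx), slope)
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(w, x + r + 1))
            vals = [scores[j][i] for j in ys for i in xs]
            out[y][x] = sum(vals) / len(vals)
    return out

def search(scores, start, n_fixations=3, slope=0.5):
    """Fixate, integrate evidence, then move the fovea to the best location so far."""
    h, w = len(scores), len(scores[0])
    evidence = [[0.0] * w for _ in range(h)]
    fixation, path = start, [start]
    for _ in range(n_fixations):
        view = foveated_view(scores, fixation, slope)
        for y in range(h):
            for x in range(w):
                # Assumed integration rule: keep the strongest response seen so far.
                evidence[y][x] = max(evidence[y][x], view[y][x])
        # Greedy saccade (assumption): go to the location with the most evidence.
        fixation = max(
            ((y, x) for y in range(h) for x in range(w)),
            key=lambda p: evidence[p[0]][p[1]],
        )
        path.append(fixation)
    return path, evidence

# Hypothetical 9x9 detector score map with a single target response at (6, 6).
scores = [[0.0] * 9 for _ in range(9)]
scores[6][6] = 1.0
path, evidence = search(scores, start=(0, 0))
```

In this toy setting the target response is blurred by pooling while it lies in the periphery, so the first saccade lands near the target rather than on it, and later fixations sharpen the estimate. A real foveated detector would replace the score map with classifier responses computed on pooled image features.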

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/f4ddf9f0ade5/pcbi.1005743.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/f72d5dc9ad42/pcbi.1005743.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/60b2f6718d80/pcbi.1005743.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/81665236b77e/pcbi.1005743.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/d3fceac49396/pcbi.1005743.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/24d7806b9d82/pcbi.1005743.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/d59199883d07/pcbi.1005743.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/f01fca96dc57/pcbi.1005743.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/e662a6b93c04/pcbi.1005743.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/4a685f4a09fe/pcbi.1005743.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/33ff5b3e1fb5/pcbi.1005743.g011.jpg
