
Object detection through search with a foveated visual system.

Authors

Akbas Emre, Eckstein Miguel P

Affiliations

Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, California, United States of America.

Department of Computer Engineering, Middle East Technical University, Ankara, Turkey.

Publication

PLoS Comput Biol. 2017 Oct 9;13(10):e1005743. doi: 10.1371/journal.pcbi.1005743. eCollection 2017 Oct.


DOI: 10.1371/journal.pcbi.1005743
PMID: 28991906
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5669499/
Abstract

Humans and many other species sense visual information with varying spatial resolution across the visual field (foveated vision) and deploy eye movements to actively sample regions of interests in scenes. The advantage of such varying resolution architecture is a reduced computational, hence metabolic cost. But what are the performance costs of such processing strategy relative to a scheme that processes the visual field at high spatial resolution? Here we first focus on visual search and combine object detectors from computer vision with a recent model of peripheral pooling regions found at the V1 layer of the human visual system. We develop a foveated object detector that processes the entire scene with varying resolution, uses retino-specific object detection classifiers to guide eye movements, aligns its fovea with regions of interest in the input image and integrates observations across multiple fixations. We compared the foveated object detector against a non-foveated version of the same object detector which processes the entire image at homogeneous high spatial resolution. We evaluated the accuracy of the foveated and non-foveated object detectors identifying 20 different objects classes in scenes from a standard computer vision data set (the PASCAL VOC 2007 dataset). We show that the foveated object detector can approximate the performance of the object detector with homogeneous high spatial resolution processing while bringing significant computational cost savings. Additionally, we assessed the impact of foveation on the computation of bottom-up saliency. An implementation of a simple foveated bottom-up saliency model with eye movements showed agreement in the selection of top salient regions of scenes with those selected by a non-foveated high resolution saliency model. 
Together, our results might help explain the evolution of foveated visual systems with eye movements as a solution that preserves perceptual performance in visual search while resulting in computational and metabolic savings to the brain.
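The search loop described in the abstract — coarser evidence in the periphery, retino-specific detector responses guiding the next fixation, and integration of observations across fixations — can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions: the eccentricity-dependent noise model, the `fovea_radius` and `noise_scale` parameters, and the running-mean integration are hypothetical simplifications, not the paper's actual V1 pooling model or object detector.

```python
import numpy as np

def eccentricity(shape, fixation):
    """Distance of every pixel from the current fixation point."""
    ys, xs = np.indices(shape)
    return np.hypot(ys - fixation[0], xs - fixation[1])

def pooled_scores(true_scores, fixation, fovea_radius=5.0, noise_scale=0.1):
    """Observe detector scores with noise that grows with eccentricity.

    The growing noise is a crude stand-in for peripheral pooling regions
    that widen with distance from the fovea (hypothetical model, not the
    paper's V1 pooling layout).
    """
    ecc = eccentricity(true_scores.shape, fixation)
    noise_sd = noise_scale * np.maximum(ecc - fovea_radius, 0.0) / fovea_radius
    rng = np.random.default_rng(0)
    return true_scores + rng.normal(0.0, 1.0, true_scores.shape) * noise_sd

def foveated_search(true_scores, n_fixations=5, start=(0, 0)):
    """Fixate the currently most promising location, integrating evidence.

    Observations from successive fixations are combined with a running
    mean; the next fixation goes to the maximum of the integrated map.
    """
    integrated = np.zeros_like(true_scores)
    fixation = start
    for t in range(1, n_fixations + 1):
        obs = pooled_scores(true_scores, fixation)
        integrated += (obs - integrated) / t  # incremental running mean
        fixation = np.unravel_index(np.argmax(integrated), integrated.shape)
    return fixation, integrated
```

With a single strong target peak in the score map, a few fixations are enough for the simulated fovea to land on it, even though the first observation of the target is made in the noisy periphery.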


Figures:

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/f4ddf9f0ade5/pcbi.1005743.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/f72d5dc9ad42/pcbi.1005743.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/60b2f6718d80/pcbi.1005743.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/81665236b77e/pcbi.1005743.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/d3fceac49396/pcbi.1005743.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/24d7806b9d82/pcbi.1005743.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/d59199883d07/pcbi.1005743.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/f01fca96dc57/pcbi.1005743.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/e662a6b93c04/pcbi.1005743.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/4a685f4a09fe/pcbi.1005743.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d8b/5669499/33ff5b3e1fb5/pcbi.1005743.g011.jpg

Similar Articles

[1]
Object detection through search with a foveated visual system.

PLoS Comput Biol. 2017-10-9

[2]
Automatic foveation for video compression using a neurobiological model of visual attention.

IEEE Trans Image Process. 2004-10

[3]
What stands out in a scene? A study of human explicit saliency judgment.

Vision Res. 2013-10-18

[4]
How do the regions of the visual field contribute to object search in real-world scenes? Evidence from eye movements.

J Exp Psychol Hum Percept Perform. 2014-2

[5]
Saliency, attention, and visual search: an information theoretic approach.

J Vis. 2009-3-13

[6]
A proto-object-based computational model for visual saliency.

J Vis. 2013-11-26

[7]
Left, right, left, right, eyes to the front! Müller-Lyer bias in grasping is not a function of hand used, hand preferred or visual hemifield, but foveation does matter.

Exp Brain Res. 2012-1-26

[8]
Natural scene statistics at the centre of gaze.

Network. 1999-11

[9]
Contrast statistics for foveated visual systems: fixation selection by minimizing contrast entropy.

J Opt Soc Am A Opt Image Sci Vis. 2005-10

[10]
A proto-object based saliency model in three-dimensional space.

Vision Res. 2016-2

Cited By

[1]
Predictive processing of scenes and objects.

Nat Rev Psychol. 2024-1

[2]
A dual foveal-peripheral visual processing model implements efficient saccade selection.

J Vis. 2020-8-3

[3]
The effects of eccentricity on attentional capture.

Atten Percept Psychophys. 2024-2

[4]
Maximizing valid eye-tracking data in human and macaque infants by optimizing calibration and adjusting areas of interest.

Behav Res Methods. 2024-2

[5]
Bibliometric analysis of artificial intelligence for biotechnology and applied microbiology: Exploring research hotspots and frontiers.

Front Bioeng Biotechnol. 2022-10-7

[6]
Human peripheral blur is optimal for object recognition.

Vision Res. 2022-11

[7]
Could simplified stimuli change how the brain performs visual search tasks? A deep neural network study.

J Vis. 2022-6-1

[8]
The Data Efficiency of Deep Learning Is Degraded by Unnecessary Input Dimensions.

Front Comput Neurosci. 2022-1-31

[9]
Biologically Inspired Deep Learning Model for Efficient Foveal-Peripheral Vision.

Front Comput Neurosci. 2021-11-22

[10]
Medical image quality metrics for foveated model observers.

J Med Imaging (Bellingham). 2021-7

References

[1]
Humans, but Not Deep Neural Networks, Often Miss Giant Targets in Scenes.

Curr Biol. 2017-9-7

[2]
Probabilistic Computations for Attention, Eye Movements, and Search.

Annu Rev Vis Sci. 2017-7-26

[3]
Feedback from higher to lower visual areas for visual recognition may be weaker in the periphery: Glimpses from the perception of brief dichoptic stimuli.

Vision Res. 2017-7

[4]
Capabilities and Limitations of Peripheral Vision.

Annu Rev Vis Sci. 2016-10-14

[5]
Beyond scene gist: Objects guide search more than scene background.

J Exp Psychol Hum Percept Perform. 2017-6

[6]
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.

IEEE Trans Pattern Anal Mach Intell. 2016-6-6

[7]
What Makes for Effective Detection Proposals?

IEEE Trans Pattern Anal Mach Intell. 2016-4

[8]
Optimal and human eye movements to clustered low value cues to increase decision rewards during search.

Vision Res. 2015-8

[9]
Retina-V1 model of detectability across the visual field.

J Vis. 2014-10-21

[10]
Shrimps that pay attention: saccadic eye movements in stomatopod crustaceans.

Philos Trans R Soc Lond B Biol Sci. 2014-1-6
