Biologically Inspired Deep Learning Model for Efficient Foveal-Peripheral Vision.

Authors

Lukanov Hristofor, König Peter, Pipa Gordon

Affiliations

Department of Neuroinformatics, Institute of Cognitive Science, Osnabrück University, Osnabrück, Germany.

Department of Neurobiopsychology, Institute of Cognitive Science, Osnabrück University, Osnabrück, Germany.

Publication information

Front Comput Neurosci. 2021 Nov 22;15:746204. doi: 10.3389/fncom.2021.746204. eCollection 2021.

DOI: 10.3389/fncom.2021.746204
PMID: 34880741
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8645638/
Abstract

While abundant in biology, foveated vision is nearly absent from computational models and especially deep learning architectures. Despite considerable hardware improvements, training deep neural networks still presents a challenge and constrains the complexity of models. Here we propose an end-to-end neural model for foveal-peripheral vision, inspired by retino-cortical mapping in primates and humans. Our model has an efficient sampling technique for compressing the visual signal such that a small portion of the scene is perceived in high resolution while a large field of view is maintained in low resolution. An attention mechanism for performing "eye-movements" assists the agent in collecting detailed information incrementally from the observed scene. Our model achieves results comparable to those of a similar neural architecture trained on full-resolution data for image classification and outperforms it at video classification tasks. At the same time, because of the smaller size of its input, it can reduce computational effort tenfold and uses several times less memory. Moreover, we present an easy-to-implement bottom-up and top-down attention mechanism which relies on task-relevant features and is therefore a convenient byproduct of the main architecture. Apart from its computational efficiency, the presented work provides a means for exploring active vision for agent training in simulated environments and anthropomorphic robotics.
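
The sampling idea in the abstract (a small high-resolution region plus a wide low-resolution view) can be illustrated with a minimal sketch. This is not the authors' retino-cortical mapping: it swaps in a plain two-stream scheme, a full-resolution crop around a fixation point plus a block-averaged copy of the whole frame. The function name `foveate` and the parameters `fovea_size` and `periphery_stride` are hypothetical and chosen only for illustration.

```python
import numpy as np


def foveate(image, fixation, fovea_size=32, periphery_stride=8):
    """Split an image into a sharp fovea crop and a coarse periphery.

    Illustrative two-stream sampling only (an assumption, not the paper's
    mapping). `image` is an (H, W) or (H, W, C) array with H, W >= fovea_size;
    `fixation` is a (row, col) pair; `fovea_size` is assumed to be even.
    """
    h, w = image.shape[:2]
    half = fovea_size // 2

    # Clamp the fixation point so the crop never leaves the image.
    cy = int(np.clip(fixation[0], half, h - half))
    cx = int(np.clip(fixation[1], half, w - half))
    fovea = image[cy - half:cy + half, cx - half:cx + half]

    # Periphery: average non-overlapping stride x stride blocks.
    s = periphery_stride
    hh, ww = (h // s) * s, (w // s) * s  # trim to a multiple of the stride
    trimmed = image[:hh, :ww].astype(np.float64)
    blocks = trimmed.reshape(hh // s, s, ww // s, s, *image.shape[2:])
    periphery = blocks.mean(axis=(1, 3))
    return fovea, periphery


# Usage: a synthetic 256 x 256 frame with a fixation near the top-right corner.
image = np.arange(256 * 256, dtype=np.float64).reshape(256, 256)
fovea, periphery = foveate(image, fixation=(10, 250))

# How many times fewer values the two streams hold than the full frame.
ratio = image.size / (fovea.size + periphery.size)
print(fovea.shape, periphery.shape, ratio)
```

With these illustrative settings the two streams together hold 2,048 values against 65,536 in the full frame. That figure describes only this toy configuration and says nothing about the savings reported in the paper. Moving `fixation` between calls is one simple way to imitate the incremental "eye-movements" the abstract describes.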

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/053e/8645638/b9600608c22c/fncom-15-746204-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/053e/8645638/e321fd5953ad/fncom-15-746204-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/053e/8645638/329e5907e572/fncom-15-746204-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/053e/8645638/332573d53690/fncom-15-746204-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/053e/8645638/0f2a5acaf21c/fncom-15-746204-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/053e/8645638/d0c1235effc1/fncom-15-746204-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/053e/8645638/06a072688625/fncom-15-746204-g0007.jpg
