Visual dictionaries as intermediate features in the human brain.

Authors

Ramakrishnan Kandan, Scholte H Steven, Groen Iris I A, Smeulders Arnold W M, Ghebreab Sennay

Affiliations

Intelligent Systems Lab Amsterdam, Institute of Informatics, University of Amsterdam, Amsterdam, Netherlands.

Cognitive Neuroscience Group, Department of Psychology, University of Amsterdam, Amsterdam, Netherlands.

Publication information

Front Comput Neurosci. 2015 Jan 15;8:168. doi: 10.3389/fncom.2014.00168. eCollection 2014.

DOI: 10.3389/fncom.2014.00168
PMID: 25642183
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4295527/

Abstract

The human visual system is assumed to transform low level visual features to object and scene representations via features of intermediate complexity. How the brain computationally represents intermediate features is still unclear. To further elucidate this, we compared the biologically plausible HMAX model and Bag of Words (BoW) model from computer vision. Both these computational models use visual dictionaries, candidate features of intermediate complexity, to represent visual scenes, and the models have been proven effective in automatic object and scene recognition. These models however differ in the computation of visual dictionaries and pooling techniques. We investigated where in the brain and to what extent human fMRI responses to short video can be accounted for by multiple hierarchical levels of the HMAX and BoW models. Brain activity of 20 subjects obtained while viewing a short video clip was analyzed voxel-wise using a distance-based variation partitioning method. Results revealed that both HMAX and BoW explain a significant amount of brain activity in early visual regions V1, V2, and V3. However, BoW exhibits more consistency across subjects in accounting for brain activity compared to HMAX. Furthermore, visual dictionary representations by HMAX and BoW explain significantly some brain activity in higher areas which are believed to process intermediate features. Overall our results indicate that, although both HMAX and BoW account for activity in the human visual system, the BoW seems to more faithfully represent neural responses in low and intermediate level visual areas of the brain.
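
The variation partitioning step mentioned in the abstract can be illustrated with a minimal sketch. It assumes ordinary least-squares R² on raw feature matrices rather than the distance-based variant the authors used, and it runs on synthetic data. The names `hmax_features`, `bow_features`, and `voxel` are hypothetical placeholders, not the study's data or code. The idea is to split the variance of one voxel's response into a part unique to each model, a part shared by both, and an unexplained remainder.

```python
import numpy as np


def r_squared(X, y):
    """Fraction of variance in y explained by an OLS fit on X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    residual = y - X1 @ beta
    centered = y - y.mean()
    return 1.0 - (residual @ residual) / (centered @ centered)


def partition_variation(X_a, X_b, y):
    """Split the variance of y into unique, shared, and unexplained fractions."""
    r2_a = r_squared(X_a, y)
    r2_b = r_squared(X_b, y)
    r2_ab = r_squared(np.column_stack([X_a, X_b]), y)
    return {
        "unique_a": r2_ab - r2_b,        # explained only by model A
        "unique_b": r2_ab - r2_a,        # explained only by model B
        "shared": r2_a + r2_b - r2_ab,   # explained by either model
        "unexplained": 1.0 - r2_ab,
    }


# Synthetic stand-ins (assumption): 200 time points, 3 features per model.
rng = np.random.default_rng(0)
n = 200
hmax_features = rng.normal(size=(n, 3))
bow_features = rng.normal(size=(n, 3))
# The simulated voxel depends more strongly on the second feature set.
voxel = hmax_features[:, 0] + 2.0 * bow_features[:, 0] + rng.normal(scale=0.5, size=n)

fractions = partition_variation(hmax_features, bow_features, voxel)
for name, value in fractions.items():
    print(f"{name}: {value:.3f}")
```

In a voxel-wise analysis this partition would be repeated for every voxel, and the unique fractions compared across models and brain regions. The unadjusted R² used here is a simplification; variation partitioning in practice typically uses adjusted fractions.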

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/423e/4295527/5cab4ef6aeca/fncom-08-00168-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/423e/4295527/c76da4e71082/fncom-08-00168-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/423e/4295527/e59bc54b6002/fncom-08-00168-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/423e/4295527/d933fcc1f040/fncom-08-00168-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/423e/4295527/b1bf2a9e6cae/fncom-08-00168-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/423e/4295527/99cd6aff627f/fncom-08-00168-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/423e/4295527/262f80572a10/fncom-08-00168-g0007.jpg

Similar articles

1
Visual dictionaries as intermediate features in the human brain.
Front Comput Neurosci. 2015 Jan 15;8:168. doi: 10.3389/fncom.2014.00168. eCollection 2014.
2
Fourier power, subjective distance, and object categories all provide plausible models of BOLD responses in scene-selective visual areas.
Front Comput Neurosci. 2015 Nov 5;9:135. doi: 10.3389/fncom.2015.00135. eCollection 2015.
3
Enhanced HMAX model with feedforward feature learning for multiclass categorization.
Front Comput Neurosci. 2015 Oct 7;9:123. doi: 10.3389/fncom.2015.00123. eCollection 2015.
4
Fast neuromimetic object recognition using FPGA outperforms GPU implementations.
IEEE Trans Neural Netw Learn Syst. 2013 Aug;24(8):1239-52. doi: 10.1109/TNNLS.2013.2253563.
5
Spatio-Temporal Scale Coded Bag-of-Words.
Sensors (Basel). 2020 Nov 9;20(21):6380. doi: 10.3390/s20216380.
6
Biologically Inspired Visual Model With Preliminary Cognition and Active Attention Adjustment.
IEEE Trans Cybern. 2015 Nov;45(11):2612-24. doi: 10.1109/TCYB.2014.2377196. Epub 2014 Dec 18.
7
Multivariate Patterns in the Human Object-Processing Pathway Reveal a Shift from Retinotopic to Shape Curvature Representations in Lateral Occipital Areas, LO-1 and LO-2.
J Neurosci. 2016 May 25;36(21):5763-74. doi: 10.1523/JNEUROSCI.3603-15.2016.
8
Modeling diverse responses to filled and outline shapes in macaque V4.
J Neurophysiol. 2019 Mar 1;121(3):1059-1077. doi: 10.1152/jn.00456.2018. Epub 2019 Jan 30.
9
Posterior Inferotemporal Cortex Cells Use Multiple Input Pathways for Shape Encoding.
J Neurosci. 2017 May 10;37(19):5019-5034. doi: 10.1523/JNEUROSCI.2674-16.2017. Epub 2017 Apr 17.
10
Semantics-preserving bag-of-words models and applications.
IEEE Trans Image Process. 2010 Jul;19(7):1908-20. doi: 10.1109/TIP.2010.2045169. Epub 2010 Mar 11.

Cited by

1
Integrative processing in artificial and biological vision predicts the perceived beauty of natural images.
Sci Adv. 2024 Mar;10(9):eadi9294. doi: 10.1126/sciadv.adi9294. Epub 2024 Mar 1.
2
The Amsterdam Open MRI Collection, a set of multimodal MRI datasets for individual difference analyses.
Sci Data. 2021 Mar 19;8(1):85. doi: 10.1038/s41597-021-00870-6.
3
Distinct contributions of functional and deep neural network features to representational similarity of scenes in human brain and behavior.
Elife. 2018 Mar 7;7:e32962. doi: 10.7554/eLife.32962.
4
Contributions of low- and high-level properties to neural processing of visual scenes in the human brain.
Philos Trans R Soc Lond B Biol Sci. 2017 Feb 19;372(1714). doi: 10.1098/rstb.2016.0102. Epub 2017 Jan 2.
5
Editorial: Hierarchical Object Representations in the Visual Cortex and Computer Vision.
Front Comput Neurosci. 2015 Nov 20;9:142. doi: 10.3389/fncom.2015.00142. eCollection 2015.
6
Enhanced HMAX model with feedforward feature learning for multiclass categorization.
Front Comput Neurosci. 2015 Oct 7;9:123. doi: 10.3389/fncom.2015.00123. eCollection 2015.

References

1
Performance-optimized hierarchical models predict neural responses in higher visual cortex.
Proc Natl Acad Sci U S A. 2014 Jun 10;111(23):8619-24. doi: 10.1073/pnas.1403112111. Epub 2014 May 8.
2
Automatic denoising of functional MRI data: combining independent component analysis and hierarchical fusion of classifiers.
Neuroimage. 2014 Apr 15;90:449-68. doi: 10.1016/j.neuroimage.2013.11.046. Epub 2014 Jan 2.
3
Comparing visual representations across human fMRI and computational vision.
J Vis. 2013 Nov 22;13(13):25. doi: 10.1167/13.13.25.
4
Medial axis shape coding in macaque inferotemporal cortex.
Neuron. 2012 Jun 21;74(6):1099-113. doi: 10.1016/j.neuron.2012.04.029.
5
Aggregating local image descriptors into compact codes.
IEEE Trans Pattern Anal Mach Intell. 2012 Sep;34(9):1704-16. doi: 10.1109/TPAMI.2011.235.
6
FSL.
Neuroimage. 2012 Aug 15;62(2):782-90. doi: 10.1016/j.neuroimage.2011.09.015. Epub 2011 Sep 16.
7
Reconstructing visual experiences from brain activity evoked by natural movies.
Curr Biol. 2011 Oct 11;21(19):1641-6. doi: 10.1016/j.cub.2011.08.031. Epub 2011 Sep 22.
8
Representational similarity analysis - connecting the branches of systems neuroscience.
Front Syst Neurosci. 2008 Nov 24;2:4. doi: 10.3389/neuro.06.004.2008. eCollection 2008.
9
Localized information is necessary for scene categorization, including the Natural/Man-made distinction.
J Vis. 2008 Jan 11;8(1):4.1-9. doi: 10.1167/8.1.4.
10
Variation partitioning of species data matrices: estimation and comparison of fractions.
Ecology. 2006 Oct;87(10):2614-25. doi: 10.1890/0012-9658(2006)87[2614:vposdm]2.0.co;2.