Enhanced HMAX model with feedforward feature learning for multiclass categorization.

Author information

Li Yinlin, Wu Wei, Zhang Bo, Li Fengfu

Affiliations

State Key Lab of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China.

Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.

Publication information

Front Comput Neurosci. 2015 Oct 7;9:123. doi: 10.3389/fncom.2015.00123. eCollection 2015.

DOI:10.3389/fncom.2015.00123

PMID:26500532

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4595662/

Abstract

In recent years, interdisciplinary research between neuroscience and computer vision has promoted development in both fields. Many biologically inspired visual models have been proposed. Among them, the Hierarchical Max-pooling model (HMAX) is a feedforward model that mimics the structures and functions of the primate visual cortex from V1 to the posterior inferotemporal (PIT) layer and generates a series of position- and scale-invariant features. However, it could be improved with attention modulation and memory processing, two important properties of the primate visual cortex. In this paper, based on recent biological research on the primate visual cortex, we enhance the HMAX model while still mimicking only the first 100-150 ms of visual cognition, focusing mainly on the unsupervised feedforward feature learning process. The main modifications are as follows: (1) To mimic the attention modulation mechanism of the V1 layer, a bottom-up saliency map is computed in the S1 layer of the HMAX model, which supports the initial feature extraction for memory processing; (2) To mimic the learning, clustering, and short-term-to-long-term memory conversion abilities of V2 and IT, an unsupervised iterative clustering method is used to learn clusters with multiscale middle-level patches, which are taken as long-term memory; (3) Inspired by the multiple feature encoding mode of the primate visual cortex, information including color, orientation, and spatial position is encoded progressively in different layers of the HMAX model. With a softmax layer added at the top of the model, multiclass categorization experiments can be conducted. The results on Caltech101 show that the enhanced model, with a smaller memory size, exhibits higher accuracy than the original HMAX model and can also achieve better accuracy than other unsupervised feature learning methods in the multiclass categorization task.
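As a rough illustration of the building blocks the abstract names, the sketch below shows an S1-style oriented filter, C1-style max pooling (the source of position tolerance), and a softmax over class scores. It is not the authors' implementation: the Gabor parameters, the function names (`gabor_kernel`, `s1_response`, `c1_pool`, `softmax`, `bar_image`), and the toy bar image are all assumptions chosen for illustration, and the saliency map, iterative clustering, and color/position encoding described above are not covered.

```python
import math

def gabor_kernel(size, theta, wavelength=4.0, sigma=2.0, gamma=0.5):
    """Zero-mean Gabor kernel (size must be odd); parameter values are illustrative."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier wave runs along angle theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    mean = sum(sum(row) for row in kernel) / (size * size)
    return [[v - mean for v in row] for row in kernel]

def s1_response(image, kernel):
    """S1-style response: absolute value of a valid-mode correlation."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            acc = sum(kernel[a][b] * image[i + a][j + b]
                      for a in range(k) for b in range(k))
            row.append(abs(acc))
        out.append(row)
    return out

def c1_pool(response, pool):
    """C1-style pooling: max over non-overlapping pool x pool neighborhoods."""
    h, w = len(response), len(response[0])
    return [[max(response[a][b]
                 for a in range(i, min(i + pool, h))
                 for b in range(j, min(j + pool, w)))
             for j in range(0, w, pool)]
            for i in range(0, h, pool)]

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def bar_image(col, size=12):
    """Toy input (hypothetical): a single vertical bar at column `col`."""
    return [[1.0 if j == col else 0.0 for j in range(size)] for _ in range(size)]

# Usage: the pooled response to a vertical bar barely depends on where the bar is.
vertical = gabor_kernel(7, 0.0)
pooled_a = c1_pool(s1_response(bar_image(5), vertical), 6)
pooled_b = c1_pool(s1_response(bar_image(6), vertical), 6)
print(pooled_a, pooled_b)
```

In this toy setting the two pooled values coincide because the max is taken over the whole response map; pooling over smaller neighborhoods would trade some of that position tolerance for spatial detail.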

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75b0/4595662/b1e240e97bd6/fncom-09-00123-g0001.jpg

Cited by

Emotional concepts shape the perceptual representation of body expressions.

Hum Brain Mapp. 2024 Aug 15;45(12):e26789. doi: 10.1002/hbm.26789.

The neural representation of facial-emotion categories reflects conceptual structure.

Proc Natl Acad Sci U S A. 2019 Aug 6;116(32):15861-15870. doi: 10.1073/pnas.1816408116. Epub 2019 Jul 22.
