Digital Contents Research Institute, Sejong University, Seoul, Republic of Korea.
J Med Syst. 2017 Oct 30;41(12):196. doi: 10.1007/s10916-017-0836-y.
With the growing use of minimally invasive surgical procedures, endoscopic video archives are growing at a rapid pace. Efficient access to relevant content in such huge multimedia archives requires compact and discriminative visual features for indexing and matching. In this paper, we present an effective method to represent images using salient convolutional features. Convolutional kernels from the first layer of a pre-trained convolutional neural network (CNN) are analyzed and clustered into multiple distinct groups, based on their sensitivity to colors and textures. Dominant features detected by each cluster are collected into a single, layout-preserving feature map using a spatial maximal activator pooling (SMAP) approach. A moving-window-based structured pooling method then captures spatial layout features and global shape information from the aggregated feature map to populate feature histograms. Finally, the individual histograms for each cluster are combined into a single comprehensive feature histogram. Clustering the convolutional feature space allows extraction of color and texture features of varying strengths. Further, the SMAP approach enables us to select dominant discriminative features. The proposed features are compact and clearly outperform several existing feature extraction approaches in retrieval and classification tasks on an endoscopy image dataset.
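The pipeline described above (kernel clustering, SMAP, moving-window pooling, histogram concatenation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Several details are assumptions made only for illustration: kernels are clustered with plain k-means on their flattened weights rather than by an explicit color and texture sensitivity analysis; SMAP is read as a per-location maximum over the feature maps of one cluster; and the structured pooling step is approximated by binning window maxima into a fixed-size histogram. All function names and parameters (`cluster_kernels`, `smap`, `window_histogram`, `describe`, `window`, `stride`, `bins`) are hypothetical.

```python
import numpy as np


def cluster_kernels(kernels, n_clusters=3, n_iter=20, seed=0):
    """Group first-layer kernels with plain k-means on flattened weights.

    Assumption: flattened weights stand in for the colour/texture
    sensitivity analysis mentioned in the abstract.
    """
    flat = kernels.reshape(len(kernels), -1).astype(float)
    rng = np.random.default_rng(seed)
    centers = flat[rng.choice(len(flat), n_clusters, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every kernel to every cluster centre.
        dist = ((flat[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = flat[labels == k].mean(axis=0)
    return labels


def smap(activations, labels, cluster):
    """Keep, at each spatial location, the strongest response in a cluster.

    Assumed reading of SMAP: a channel-wise max that preserves layout.
    """
    members = activations[labels == cluster]
    if len(members) == 0:  # k-means may leave a cluster empty
        return np.zeros(activations.shape[1:])
    return members.max(axis=0)


def window_histogram(fmap, window=4, stride=2, bins=8):
    """Slide a window over the map and histogram the window maxima.

    Assumption: the binning scheme is illustrative, not the paper's.
    """
    fmap = np.maximum(fmap, 0.0)  # treat activations as post-ReLU
    h, w = fmap.shape
    pooled = [
        fmap[i:i + window, j:j + window].max()
        for i in range(0, h - window + 1, stride)
        for j in range(0, w - window + 1, stride)
    ]
    upper = max(float(fmap.max()), 1e-12)
    hist, _ = np.histogram(pooled, bins=bins, range=(0.0, upper))
    return hist / max(hist.sum(), 1)


def describe(activations, labels, n_clusters, **kwargs):
    """Concatenate one normalised histogram per kernel cluster."""
    return np.concatenate([
        window_histogram(smap(activations, labels, k), **kwargs)
        for k in range(n_clusters)
    ])
```

Given first-layer kernels of shape `(K, C, kh, kw)` and the matching activations of shape `(K, H, W)` for one image, `describe(activations, cluster_kernels(kernels, 3), 3)` returns a vector of length `3 * bins`, which could then be compared across images with any histogram distance.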