Zhao Zhenbing, Qi Hongyu, Fan Xiaoqing, Xu Guozhi, Qi Yincheng, Zhai Yongjie, Zhang Ke
School of Electrical and Electronic Engineering, North China Electric Power University, Baoding 071003, China.
Hangzhou Institute, NetEase, Hangzhou 310052, China.
Entropy (Basel). 2020 Apr 8;22(4):419. doi: 10.3390/e22040419.
Deep convolutional neural networks (DCNNs) with alternating convolutional, pooling and decimation layers are widely used in computer vision, yet current work tends to focus on deeper networks with many layers and neurons, which results in high computational complexity. Recognition nevertheless remains challenging when object appearances and training samples are scarce and not comprehensive, as with infrared insulator images. For this reason, more attention has turned to using a pretrained network for image feature representation, but rules for selecting the feature representation layer are scarce. In this paper, we propose two new concepts, the layer entropy and the relative layer entropy, and build on them an image representation method based on relative layer entropy (IRM_RLE). The method is designed to identify the most suitable convolution layer for image recognition. First, the image is fed into an ImageNet-pretrained DCNN model, and deep convolutional activations are extracted. Then, the appropriate feature layer is selected by calculating the layer entropy and relative layer entropy of each convolution layer. Finally, the number of feature maps is selected according to their degree of importance, and the selected feature maps of the convolution layer are vectorized and pooled by VLAD (vector of locally aggregated descriptors) encoding and quantization to form the final image representation. The experimental results show that the proposed approach performs competitively against previous methods across all datasets. Furthermore, for the indoor scenes and actions datasets, the proposed approach outperforms the state-of-the-art methods.
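The final pooling step can be illustrated with a minimal sketch of standard VLAD encoding. This is not the authors' implementation: the abstract does not give the codebook size, the assignment rule, or the normalization, so the sketch assumes hard nearest-center assignment, a precomputed codebook (for example from k-means), and global L2 normalization. The function name `vlad_encode` and the toy data are hypothetical.

```python
import math


def vlad_encode(descriptors, centers):
    """Aggregate local descriptors into one VLAD vector (illustrative sketch).

    descriptors: list of D-dimensional local features, e.g. one vector per
                 spatial position of a convolution layer's feature maps.
    centers:     list of K D-dimensional codebook centers, assumed to have
                 been learned beforehand (e.g. with k-means).
    Returns a list of length K * D with unit L2 norm (or all zeros).
    """
    k, d = len(centers), len(centers[0])
    residuals = [[0.0] * d for _ in range(k)]

    for x in descriptors:
        # Hard-assign the descriptor to its nearest center (squared Euclidean).
        nearest = min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centers[j])),
        )
        # Accumulate the residual between the descriptor and that center.
        for i in range(d):
            residuals[nearest][i] += x[i] - centers[nearest][i]

    # Concatenate the per-center residual sums into one K * D vector.
    flat = [v for row in residuals for v in row]

    # Assumed normalization: global L2 (other schemes exist).
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy usage: three 2-D descriptors and a two-center codebook.
toy_centers = [[0.0, 0.0], [10.0, 10.0]]
toy_descriptors = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
toy_vlad = vlad_encode(toy_descriptors, toy_centers)
```

In the toy usage, the first two descriptors fall to the first center and the third to the second, so the output has length 4 (K = 2 centers times D = 2 dimensions). In a pipeline like the one described, the descriptors would come from the selected convolution layer's feature maps rather than from hand-written lists.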