Li Wenjuan, Li Bing, Yuan Chunfeng, Li Yangxi, Wu Haohao, Hu Weiming, Wang Fangshi
IEEE Trans Image Process. 2020 Apr 13. doi: 10.1109/TIP.2020.2985875.
Convolutional neural networks are built upon simple but useful convolution modules. Because its scale and geometric structure are fixed, the traditional convolution is limited in feature extraction and object localization. In addition, the loss of spatial information restricts the networks' performance and depth. To overcome these limitations, this paper proposes a novel anisotropic convolution that adds a scale factor and a shape factor to the traditional convolution. The anisotropic convolution enlarges the receptive field flexibly and dynamically according to the effective size of the objects. It is also a generalized convolution: the traditional convolution, dilated convolution and deformable convolution can all be viewed as special cases of it. Furthermore, to improve training efficiency and avoid falling into a local optimum, this paper introduces a simplified implementation of the anisotropic convolution. The anisotropic convolution can be applied to arbitrary convolutional networks, and the enhanced networks are called ACNs (anisotropic convolutional networks). Experimental results show that ACNs outperform both the baseline networks and many state-of-the-art methods in image classification and object localization, especially in the classification of tiny images.
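The abstract does not give the formula for the anisotropic convolution, so the following is only an illustrative sketch of how a scale factor and a shape factor could stretch the sampling grid of a kernel. The function names `sampling_offsets` and `extent`, the `scale` and `shape` parameters, and the multiplicative parameterization are all assumptions made here for illustration, not the paper's definition. Under this assumed form, the default values give the standard grid and a uniform integer `scale` gives the grid of a dilated convolution. The per-position learned offsets of deformable convolution are not modeled.

```python
import numpy as np


def sampling_offsets(kernel_size=3, scale=1.0, shape=(1.0, 1.0)):
    """Return a (k*k, 2) array of (dy, dx) sampling offsets for a k x k kernel.

    Hypothetical parameterization (an assumption, not the paper's formula):
    `scale` stretches the grid uniformly, and `shape` = (sy, sx) stretches
    the vertical and horizontal axes independently.
    """
    half = (kernel_size - 1) / 2
    base = np.arange(kernel_size) - half          # e.g. [-1, 0, 1] for k = 3
    dy, dx = np.meshgrid(base, base, indexing="ij")
    grid = np.stack([dy.ravel(), dx.ravel()], axis=1)
    return grid * scale * np.asarray(shape, dtype=float)


def extent(offsets):
    """Height and width (in pixels) spanned by a set of sampling offsets."""
    span = offsets.max(axis=0) - offsets.min(axis=0) + 1
    return (float(span[0]), float(span[1]))


standard = sampling_offsets()                     # ordinary 3 x 3 grid
dilated = sampling_offsets(scale=2.0)             # same grid as dilation 2
wide = sampling_offsets(shape=(1.0, 2.0))         # stretched horizontally only

print(extent(standard), extent(dilated), extent(wide))
```

In this sketch the receptive field of a single layer grows with `scale`, while `shape` lets it grow along one axis more than the other, which is one way to read "anisotropic" in the abstract.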