South China University of Technology, China; Pazhou Laboratory, China; Key Laboratory of Big Data and Intelligent Robot, Ministry of Education.
South China University of Technology, China.
Neural Netw. 2021 Nov;143:657-668. doi: 10.1016/j.neunet.2021.06.030. Epub 2021 Jul 12.
Convolutional Neural Networks (CNNs) have achieved great success owing to the powerful feature learning ability of convolution layers. Specifically, the standard convolution traverses the input images/features with a sliding window to extract features. However, not all windows contribute equally to the predictions of CNNs. In practice, the convolution on some windows (e.g., smooth windows that contain very similar pixels) can be highly redundant and may introduce noise into the computation. Such redundancy may not only degrade performance but also incur unnecessary computational cost. Thus, it is important to reduce the computational redundancy of convolution to improve performance. To this end, we propose a Content-aware Convolution (CAC) that automatically detects the smooth windows and applies a 1×1 convolutional kernel in place of the original large kernel. In this way, we effectively avoid redundant computation on similar pixels. By replacing the standard convolution in CNNs with our CAC, the resultant models yield significantly better performance and lower computational cost than the baseline models with the standard convolution. More critically, we are able to dynamically allocate suitable computation resources according to the data smoothness of different images, making content-aware computation possible. Extensive experiments on various computer vision tasks demonstrate the superiority of our method over existing methods.
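The idea described in the abstract can be illustrated with a minimal single-channel sketch. It is not the authors' implementation: the abstract does not say how smooth windows are detected, so the range test (`max - min <= threshold`), the `threshold` parameter, and the names `content_aware_conv2d` and `point_weight` are all assumptions made for illustration.

```python
import numpy as np


def content_aware_conv2d(image, kernel, point_weight, threshold):
    """Sliding-window convolution that uses a 1x1 kernel on smooth windows.

    Assumption (not from the paper): a window counts as "smooth" when the
    spread of its pixel values, max - min, is at most `threshold`.
    As in CNN layers, the kernel is applied without flipping.
    """
    k = kernel.shape[0]                      # square k x k kernel, k odd
    out_h = image.shape[0] - k + 1           # "valid" output size, stride 1
    out_w = image.shape[1] - k + 1
    center = k // 2
    output = np.empty((out_h, out_w), dtype=np.float64)
    smooth_mask = np.zeros((out_h, out_w), dtype=bool)

    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + k, j:j + k]
            if window.max() - window.min() <= threshold:
                # Smooth window: one multiply on the center pixel (1x1 kernel).
                output[i, j] = point_weight * window[center, center]
                smooth_mask[i, j] = True
            else:
                # Non-smooth window: full k x k kernel.
                output[i, j] = np.sum(window * kernel)
    return output, smooth_mask


# Toy input: a flat dark region on the left, a flat bright region on the right.
image = np.zeros((6, 6))
image[:, 3:] = 1.0
kernel = np.ones((3, 3))

# Choosing point_weight = kernel.sum() makes the 1x1 path agree exactly with
# the full kernel on perfectly constant windows.
output, smooth_mask = content_aware_conv2d(
    image, kernel, point_weight=kernel.sum(), threshold=0.0
)
print(output[0])            # one output row
print(smooth_mask.mean())   # fraction of windows handled by the 1x1 kernel
```

In this toy case only the windows that straddle the edge between the two regions use the full kernel, so the amount of work depends on the image content. In a trained network the 1×1 weights would presumably be learned rather than derived from the large kernel; tying them to `kernel.sum()` here is only a convenience for checking the sketch.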