IEEE Trans Pattern Anal Mach Intell. 2019 Oct;41(10):2495-2510. doi: 10.1109/TPAMI.2018.2857824. Epub 2018 Jul 19.
Deep convolutional neural networks (CNNs) have been used successfully in a number of applications. However, their storage and computational requirements have largely prevented their widespread use on mobile devices. Here we present a series of approaches for compressing and speeding up CNNs in the frequency domain, which focus not only on smaller weights but on all of the weights and their underlying connections. By treating convolution filters as images, we decompose their frequency-domain representations into common parts shared by similar filters (i.e., cluster centers) and their individual private parts (i.e., individual residuals). A large number of low-energy frequency coefficients in both parts can be discarded to achieve high compression without significantly compromising accuracy. Furthermore, we explore a data-driven method for removing redundancies in both the spatial and frequency domains, which allows more redundant weights to be discarded while maintaining comparable accuracy. After obtaining the optimal sparse CNN in the frequency domain, we reduce the computational burden of convolution operations in CNNs by linearly combining the convolution responses of discrete cosine transform (DCT) bases. The compression and speed-up ratios of the proposed algorithm are thoroughly analyzed and evaluated on benchmark image datasets to demonstrate its superiority over state-of-the-art methods.
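The decomposition described above (common cluster centers plus private residuals in the DCT domain, with low-energy coefficients discarded) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name `compress_filters`, the plain k-means clustering, and the per-row magnitude thresholding are assumptions chosen for clarity.

```python
import numpy as np
from scipy.fft import dctn, idctn

def compress_filters(filters, n_clusters=4, keep_ratio=0.5, n_iter=20, seed=0):
    """Decompose filters in the DCT domain into cluster centers (common
    parts) and residuals (private parts), then zero low-energy coefficients.
    `filters` has shape (n, d, d); returns the reconstructed filters and
    the sparsified parts. Hypothetical sketch, not the paper's algorithm."""
    n, d, _ = filters.shape
    # 1. Treat each d x d filter as an image and take its 2-D DCT.
    F = np.stack([dctn(f, norm='ortho') for f in filters]).reshape(n, -1)

    # 2. Simple k-means in the frequency domain; centers are the shared
    #    "common parts" of similar filters.
    rng = np.random.default_rng(seed)
    centers = F[rng.choice(n, n_clusters, replace=False)]
    for _ in range(n_iter):
        labels = np.argmin(((F[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = F[labels == k].mean(0)

    # 3. Private parts: each filter's residual from its assigned center.
    residuals = F - centers[labels]

    # 4. Discard low-energy (small-magnitude) coefficients in both parts,
    #    keeping only the top `keep_ratio` fraction per row.
    def sparsify(X, ratio):
        out = X.copy()
        for row in out:
            k = int(round(ratio * row.size))
            thresh = np.sort(np.abs(row))[::-1][k - 1] if k > 0 else np.inf
            row[np.abs(row) < thresh] = 0.0
        return out

    centers_s = sparsify(centers, keep_ratio)
    residuals_s = sparsify(residuals, keep_ratio)

    # 5. Reconstruct approximate filters via the inverse DCT.
    recon = (centers_s[labels] + residuals_s).reshape(n, d, d)
    recon = np.stack([idctn(r, norm='ortho') for r in recon])
    return recon, centers_s, residuals_s, labels
```

Because the DCT is orthonormal, `keep_ratio=1.0` reconstructs the filters exactly; smaller ratios trade accuracy for sparsity, and only the nonzero coefficients plus cluster assignments need to be stored.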
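The speed-up step rests on the linearity of convolution: a filter expressed in the DCT basis as a weighted sum of basis images yields a response that is the same weighted sum of the basis images' responses, so once the coefficients are sparse, whole basis convolutions can be skipped. A small sketch of this identity, under the assumption of 2-D `valid` convolution with SciPy (not the paper's optimized implementation):

```python
import numpy as np
from scipy.fft import dctn, idctn
from scipy.signal import convolve2d

d = 3
# 2-D DCT basis images: inverse DCT of each unit-impulse coefficient map.
basis = [idctn(np.eye(d * d)[i].reshape(d, d), norm='ortho')
         for i in range(d * d)]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))      # input feature map
f = rng.standard_normal((d, d))      # a convolution filter
c = dctn(f, norm='ortho').ravel()    # its DCT coefficients

# Convolving with f equals the same linear combination of basis responses;
# zeroed coefficients therefore remove entire basis convolutions.
direct = convolve2d(x, f, mode='valid')
combined = sum(ci * convolve2d(x, b, mode='valid')
               for ci, b in zip(c, basis))
```

Since the DCT basis responses are shared across all filters in a layer, they are computed once and reused, which is where the speed-up over per-filter convolution comes from.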