Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China.
University of Chinese Academy of Sciences, Beijing 100049, China.
Sensors (Basel). 2021 May 16;21(10):3464. doi: 10.3390/s21103464.
Convolutional neural networks (CNNs) have achieved significant breakthroughs in various domains, such as natural language processing (NLP) and computer vision. However, performance improvements are often accompanied by large model sizes and computation costs, which makes these models unsuitable for resource-constrained devices. Consequently, there is an urgent need to compress CNNs so as to reduce model size and computation cost. This paper proposes a layer-wise differentiable compression (LWDC) algorithm for compressing CNNs structurally. A differentiable selection operator OS is embedded in the model so that the model can be compressed and trained simultaneously by gradient descent in one go. In contrast to most existing methods, which prune parameters from redundant operators, our method directly replaces the original bulky operators with more lightweight ones; it only requires specifying the set of lightweight operators and the regularization factor in advance, rather than a compression rate for each layer. The compressed model produced by our method is generic and does not need any special hardware/software support. Experimental results on CIFAR-10, CIFAR-100 and ImageNet demonstrate the effectiveness of our method. LWDC achieves more significant compression than state-of-the-art methods in most cases, while incurring lower performance degradation. We also evaluate the impact of the lightweight operators and the regularization factor on compression rate and accuracy.
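Although the paper itself does not include code, the described mechanism can be read as a differentiable, softmax-weighted selection over a candidate operator set, trained jointly with the network weights. The PyTorch sketch below is an illustrative assumption only: the class name SelectionOp, the three candidate operators, the logits alpha, and the parameter-count regularizer are hypothetical, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectionOp(nn.Module):
    """Differentiable selection over candidate operators for one layer (illustrative sketch)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # Hypothetical candidate set: the original 3x3 conv plus two lighter alternatives.
        self.ops = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=1),   # original, bulky operator
            nn.Conv2d(in_ch, out_ch, 1),              # pointwise conv
            nn.Sequential(                            # depthwise-separable conv
                nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch),
                nn.Conv2d(in_ch, out_ch, 1),
            ),
        ])
        # One trainable logit per candidate, updated by the same gradient
        # descent that trains the weights, so compression and training
        # happen simultaneously ("in one go").
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))
        # Fixed per-candidate parameter counts used by the regularizer.
        self.register_buffer(
            "cost",
            torch.tensor([sum(p.numel() for p in op.parameters())
                          for op in self.ops], dtype=torch.float32))

    def forward(self, x):
        # Softmax-weighted mixture keeps the operator selection differentiable.
        w = F.softmax(self.alpha, dim=0)
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

    def expected_cost(self):
        # Expected parameter count under the current selection weights;
        # penalizing it steers each layer toward a lightweight operator.
        return (F.softmax(self.alpha, dim=0) * self.cost).sum()

# Training objective: task loss plus the cost penalty, with lam playing
# the role of the regularization factor from the abstract.
# loss = criterion(model(x), y) + lam * sum(
#     m.expected_cost() for m in model.modules() if isinstance(m, SelectionOp))

Under this reading, each layer keeps only its highest-weighted candidate after training, yielding an ordinary network that needs no special hardware or software support.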