IEEE Trans Cybern. 2020 Aug;50(8):3594-3604. doi: 10.1109/TCYB.2019.2933477. Epub 2019 Aug 27.
Deeper and wider convolutional neural networks (CNNs) achieve superior performance but incur expensive computation costs. Accelerating such overparameterized neural networks has therefore received increasing attention. A typical pruning algorithm is a three-stage pipeline: training, pruning, and retraining. Prevailing approaches fix the pruned filters to zero during retraining and thus significantly reduce the optimization space. Moreover, they prune a large number of filters at once at the start, which causes unrecoverable information loss. To solve these problems, we propose an asymptotic soft filter pruning (ASFP) method to accelerate the inference of deep neural networks. First, we continue to update the pruned filters during the retraining stage. As a result, the optimization space of the pruned model is not reduced but remains the same as that of the original model, so the model retains enough capacity to learn from the training data. Second, we prune the network asymptotically: we prune only a few filters at first and asymptotically prune more as training proceeds. With asymptotic pruning, the information in the training set is gradually concentrated in the remaining filters, keeping the subsequent training and pruning process stable. Experiments show the effectiveness of ASFP on image classification benchmarks. Notably, on ILSVRC-2012, ASFP reduces more than 40% of the FLOPs of ResNet-50 with only 0.14% top-5 accuracy degradation, outperforming soft filter pruning (SFP) by 8%.
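To make the two ideas in the abstract concrete, here is a minimal PyTorch sketch of soft filter pruning with an asymptotically growing pruning rate. The exponential schedule in asymptotic_rate, the constant k, and the 0.4 target rate are illustrative assumptions, not the paper's exact settings; the sketch only shows the mechanism, not the authors' implementation.

```python
# Minimal sketch of asymptotic soft filter pruning (ASFP)-style training.
# Hypothetical schedule and hyperparameters; the paper's exact choices may differ.
import math
import torch
import torch.nn as nn

def asymptotic_rate(epoch, target_rate, k=0.1):
    # Pruning rate starts near 0 and ramps toward target_rate
    # (assumed exponential ramp; "prune few filters at first").
    return target_rate * (1.0 - math.exp(-k * epoch))

@torch.no_grad()
def soft_prune(model, rate):
    # Zero out the filters with the smallest L2 norms in every conv layer.
    # "Soft": the zeroed filters are NOT frozen -- gradients still update
    # them in the next epoch, so the optimization space stays that of the
    # original model.
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            norms = m.weight.data.view(m.out_channels, -1).norm(p=2, dim=1)
            n_prune = int(m.out_channels * rate)
            if n_prune > 0:
                idx = torch.argsort(norms)[:n_prune]
                m.weight.data[idx] = 0.0
                if m.bias is not None:
                    m.bias.data[idx] = 0.0

# Usage (train_one_epoch is assumed to exist):
# for epoch in range(num_epochs):
#     train_one_epoch(model, loader, optimizer)
#     soft_prune(model, asymptotic_rate(epoch, target_rate=0.4))
# After convergence, filters still at zero are removed ("hard" pruned)
# to realize the actual FLOPs reduction at inference time.
```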