School of Data and Computer Science and Guangdong Key Laboratory of Information Security Technology, Sun Yat-Sen University, Guangzhou, Guangdong 510006, China
Neural Comput. 2020 Dec;32(12):2532-2556. doi: 10.1162/neco_a_01330. Epub 2020 Oct 20.
Pruning is an effective way to slim down and speed up convolutional neural networks. Previous work has generally pruned neural networks directly in the original feature space without considering the correlation of neurons. We argue that such pruning still leaves redundancy in the pruned networks. In this letter, we propose to prune in an intermediate space in which the correlation of neurons is eliminated. To achieve this, the input and output of a convolutional layer are first mapped to the intermediate space by an orthogonal transformation. Neurons are then evaluated and pruned in that space. Extensive experiments show that our redundancy-aware pruning method surpasses state-of-the-art pruning methods in both efficiency and accuracy. Notably, with our redundancy-aware pruning, ResNet models pruned for a threefold speedup achieve competitive performance with fewer floating-point operations even compared to DenseNet.
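To illustrate the decorrelation step described above, the minimal sketch below uses a PCA-style eigendecomposition of the channel covariance as the orthogonal transformation and ranks components by their variance in the decorrelated (intermediate) space. This is an assumed realization for illustration only; the paper's actual importance criterion and weight-folding procedure may differ.

```python
import numpy as np

def decorrelate_and_rank(activations, keep_ratio=0.5):
    """Illustrative sketch of pruning in a decorrelated ("intermediate") space.

    activations : (num_samples, num_channels) responses of one conv layer,
                  e.g. spatially pooled feature maps from a calibration set.
    keep_ratio  : fraction of decorrelated components to retain.

    NOTE: this is a hypothetical PCA-style realization of the orthogonal
    transformation mentioned in the abstract, not the paper's exact method.
    """
    X = activations - activations.mean(axis=0, keepdims=True)
    cov = X.T @ X / max(len(X) - 1, 1)        # channel covariance
    eigvals, Q = np.linalg.eigh(cov)          # Q is orthogonal: Q.T @ Q = I
    order = np.argsort(eigvals)[::-1]         # sort components by variance
    Q = Q[:, order]

    Z = X @ Q                                 # map to the intermediate space
    importance = Z.var(axis=0)                # components are uncorrelated here, so
                                              # per-component variance is a
                                              # redundancy-free importance score
    k = max(1, int(keep_ratio * Q.shape[1]))
    kept = np.arange(k)                       # low-importance components are pruned;
                                              # Q[:, kept] can then be folded back into
                                              # the adjacent layers' weights
    return Q, kept, importance

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # correlated toy activations: 1000 samples, 16 channels of rank ~8
    A = rng.normal(size=(1000, 8)) @ rng.normal(size=(8, 16))
    Q, kept, score = decorrelate_and_rank(A, keep_ratio=0.5)
    print("kept components:", kept)
    print("importance:", np.round(score[kept], 2))
```

Because the transformed components are mutually uncorrelated, discarding the low-variance ones removes redundancy that channel-wise pruning in the original feature space cannot separate out.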