Wu Jie, Zhu Dingshun, Fang Leyuan, Deng Yue, Zhong Zhun
IEEE Trans Image Process. 2023;32:4689-4700. doi: 10.1109/TIP.2023.3302519. Epub 2023 Aug 16.
Network pruning is one of the chief means for improving the computational efficiency of Deep Neural Networks (DNNs). Pruning-based methods generally discard network kernels, channels, or layers, which, however, inevitably disrupts the original, well-learned network correlations and thus leads to performance degradation. In this work, we propose an Efficient Layer Compression (ELC) approach that compresses serial layers by decoupling and merging rather than pruning. Specifically, we first propose a novel decoupling module that decouples the layers, enabling us to readily merge serial layers containing both nonlinear and convolutional layers. Then, the decoupled network is losslessly merged through an equivalent conversion of its parameters. In this way, our ELC can effectively reduce the depth of the network without destroying the correlations among convolutional layers. To the best of our knowledge, we are the first to exploit the mergeability of serial convolutional layers for lossless network layer compression. Experiments on two datasets demonstrate that our method retains superior performance while reducing FLOPs by 74.1% for VGG-16 and 54.6% for ResNet-56. In addition, our ELC improves the inference speed by 2× on the Jetson AGX Xavier edge device.
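The sketch below (not the authors' ELC code) illustrates only the linear mergeability the abstract relies on: two serial stride-1 convolutions can be collapsed losslessly into a single equivalent convolution by an exact conversion of their parameters. The paper's decoupling module for nonlinear layers is omitted, and the PyTorch framework, function name, and shapes are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

def merge_serial_convs(conv1: nn.Conv2d, conv2: nn.Conv2d) -> nn.Conv2d:
    # Collapse conv2(conv1(x)) into one Conv2d. Exact only for stride 1,
    # no groups/dilation, and no zero-padding on conv2 (zero-padding the
    # intermediate feature map would break the equivalence).
    assert conv1.stride == (1, 1) and conv2.stride == (1, 1)
    assert conv2.padding == (0, 0)
    w1, w2 = conv1.weight, conv2.weight          # [c1, cin, k1, k1], [c2, c1, k2, k2]
    k1, k2 = w1.shape[-1], w2.shape[-1]
    # Composite kernel = full 2-D convolution of the two kernels, computed via
    # conv2d by flipping w2 spatially and using w1's input channels as the batch.
    merged_w = F.conv2d(w1.permute(1, 0, 2, 3), w2.flip([2, 3]), padding=k2 - 1)
    merged_w = merged_w.permute(1, 0, 2, 3)      # [c2, cin, k1+k2-1, k1+k2-1]
    # Composite bias = conv2 applied to conv1's constant bias plane, plus conv2's bias.
    b1 = conv1.bias if conv1.bias is not None else torch.zeros(w1.shape[0])
    b2 = conv2.bias if conv2.bias is not None else torch.zeros(w2.shape[0])
    merged_b = w2.sum(dim=(2, 3)) @ b1 + b2
    merged = nn.Conv2d(w1.shape[1], w2.shape[0], kernel_size=k1 + k2 - 1,
                       padding=conv1.padding, bias=True)
    with torch.no_grad():
        merged.weight.copy_(merged_w)
        merged.bias.copy_(merged_b)
    return merged

# Quick numerical check of the lossless merge on random data.
c1, c2 = nn.Conv2d(3, 8, 3, padding=1), nn.Conv2d(8, 4, 3, padding=0)
x = torch.randn(1, 3, 32, 32)
print(torch.allclose(c2(c1(x)), merge_serial_convs(c1, c2)(x), atol=1e-4))  # expect True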