Lan Weichao, Cheung Yiu-Ming, Lan Liang, Jiang Juyong, Hu Zhikai
IEEE Trans Neural Netw Learn Syst. 2025 Jun;36(6):10257-10270. doi: 10.1109/TNNLS.2024.3457943.
Convolutional neural networks (CNNs) have achieved significant performance on various real-life tasks. However, the large number of parameters in convolutional layers requires huge storage and computation resources, making it challenging to deploy CNNs on memory-constrained embedded devices. In this article, we propose a novel compression method that generates the convolution filters in each layer using a set of learnable low-dimensional quantized filter bases. The proposed method reconstructs the convolution filters by stacking the linear combinations of these filter bases. By using quantized values in the weights, the compact filters can be represented with fewer bits, so the network can be highly compressed. Furthermore, we exploit the sparsity of the coefficients through $L_1$-ball projection when forming the linear combinations, to further reduce storage consumption and prevent overfitting. We also provide a detailed analysis of the compression performance of the proposed method. Evaluations on image classification and object detection tasks using various network structures demonstrate that the proposed method achieves a higher compression ratio with comparable accuracy compared with the existing state-of-the-art filter decomposition and network quantization methods.
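The core idea described in the abstract can be sketched as follows: each layer's filters are reconstructed as linear combinations of a small set of quantized filter bases, with the mixing coefficients projected onto an $L_1$-ball to encourage sparsity. The sketch below is a minimal NumPy illustration under assumed shapes, bit-width, and ball radius; it is not the authors' implementation, and the $L_1$ projection follows the standard sorting-based algorithm.

```python
import numpy as np

def project_l1_ball(v, radius=1.0):
    """Project vector v onto the L1-ball of the given radius.

    Uses the standard sort-and-threshold algorithm; components below the
    computed threshold are zeroed out, which promotes sparse coefficients.
    """
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]          # magnitudes, descending
    css = np.cumsum(u)
    # Largest index rho with u[rho] * (rho + 1) > css[rho] - radius.
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > (css - radius))[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def quantize(w, num_bits=2):
    """Uniform symmetric quantization so base weights use few bits."""
    levels = 2 ** (num_bits - 1) - 1      # e.g. 2 bits -> values in {-s, 0, +s}
    scale = np.abs(w).max() / levels
    return np.round(w / scale) * scale

# Illustrative sizes (assumptions, not from the paper):
# k bases shared per layer, reconstructing n_filters 3x3 filters.
rng = np.random.default_rng(0)
k, cin, kh, kw = 4, 3, 3, 3
n_filters = 16

# Learnable quantized filter bases and sparse mixing coefficients.
bases = quantize(rng.standard_normal((k, cin, kh, kw)), num_bits=2)
coeffs = np.stack([project_l1_ball(rng.standard_normal(k), radius=1.0)
                   for _ in range(n_filters)])

# Reconstruct the full filter bank as linear combinations of the bases.
filters = np.einsum('nk,kchw->nchw', coeffs, bases)
print(filters.shape)  # (16, 3, 3, 3)
```

Storage then scales with the small quantized bases plus the sparse coefficients rather than with the full filter bank, which is where the compression in the abstract comes from.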