
Deep Network Quantization via Error Compensation

Authors

Peng Hanyu, Wu Jiaxiang, Zhang Zhiwei, Chen Shifeng, Zhang Hai-Tao

Publication

IEEE Trans Neural Netw Learn Syst. 2022 Sep;33(9):4960-4970. doi: 10.1109/TNNLS.2021.3064293. Epub 2022 Aug 31.

Abstract

For portable devices with limited resources, it is often difficult to deploy deep networks due to the prohibitive computational overhead. Numerous approaches have been proposed to quantize weights and/or activations to speed up the inference. Loss-aware quantization has been proposed to directly formulate the impact of weight quantization on the model's final loss. However, we discover that, under certain circumstances, such a method may not converge and end up oscillating. To tackle this issue, we introduce a novel loss-aware quantization algorithm to efficiently compress deep networks with low bit-width model weights. We provide a more accurate estimation of gradients by leveraging the Taylor expansion to compensate for the quantization error, which leads to better convergence behavior. Our theoretical analysis indicates that the gradient mismatch issue can be fixed by the newly introduced quantization error compensation term. Experimental results for both linear models and convolutional networks verify the effectiveness of our proposed method.
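The compensation idea described above can be illustrated on a toy problem. The following is a minimal sketch, not the paper's algorithm: the uniform quantizer, the quadratic loss, and all names (`quantize`, `A`, `b`) are assumptions made for illustration. For a quadratic loss the Hessian is the constant matrix `A`, so adding the first-order Taylor correction `A (w - q)` to the gradient evaluated at the quantized weights `q` recovers the full-precision gradient exactly, removing the gradient mismatch that can otherwise cause oscillation.

```python
import numpy as np

def quantize(w, bits=2):
    # Uniform symmetric quantizer (illustrative; the paper's scheme may differ).
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1) + 1e-12
    return np.round(w / scale) * scale

# Toy quadratic loss L(w) = 0.5 w^T A w - b^T w, so grad(w) = A w - b and the
# Hessian is A. The minimizer is the solution of A w = b.
A = np.array([[3.0, 0.5],
              [0.5, 1.0]])
b = np.array([1.0, -2.0])
grad = lambda w: A @ w - b

w = np.array([0.8, -0.3])
lr = 0.1
for _ in range(200):
    q = quantize(w)
    # Naive loss-aware step: gradient evaluated only at the quantized weights.
    g_naive = grad(q)
    # Error-compensated step: the Taylor correction A (w - q) approximates
    # grad(w) from grad(q); here it is exact because the loss is quadratic.
    g_comp = g_naive + A @ (w - q)
    w = w - lr * g_comp
```

With the compensation term, `g_comp` equals the full-precision gradient `A w - b`, so the latent weights `w` converge to the minimizer of the loss despite the coarse 2-bit quantization inside the loop; dropping the correction leaves only `grad(q)`, which can stall or oscillate between quantization levels.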

