School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China.
School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China; Shenzhen Research Institute of Big Data, Shenzhen, China.
Neural Netw. 2023 Sep;166:683-691. doi: 10.1016/j.neunet.2023.07.042. Epub 2023 Aug 5.
Quantization approximates a floating-point deep network model with a low-bit-width counterpart, thereby accelerating inference and reducing computation. Zero-shot quantization, which aims to quantize a model without access to the original data, can be achieved by synthesizing data that fits the real data distribution. However, zero-shot quantization has been observed to underperform post-training quantization with real data, for two primary reasons: 1) an ordinary generator struggles to produce diverse synthetic data because it lacks the long-range information needed to attend to global features, and 2) synthetic images are optimized to match the statistics of real data, which yields weak intra-class heterogeneity and limited feature richness. To overcome these problems, we propose a novel deep network quantizer called long-range zero-shot generative deep network quantization (LRQ). Technically, we propose a long-range generator (LRG) that learns long-range information rather than only simple local features; to incorporate more global features into the synthetic data, the generator uses long-range attention built on large-kernel convolution. In addition, we present an adversarial margin add (AMA) module that enforces intra-class angular enlargement between each feature vector and its class center. The AMA module forms an adversarial process that increases the convergence difficulty of the loss function, opposing the training objective of the original loss. Furthermore, to transfer knowledge from the full-precision network, we employ decoupled knowledge distillation. Extensive experiments demonstrate that LRQ outperforms competing methods.
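The abstract does not detail the internals of the LRG; the following is a minimal PyTorch sketch of one plausible reading, assuming a large-kernel-attention design in which a depthwise convolution with a wide kernel produces a spatial attention map that modulates generator features. The class name LargeKernelAttention, the kernel size 13, and the 1x1 projection layers are illustrative assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

class LargeKernelAttention(nn.Module):
    """Hypothetical long-range attention block for a generator: a large-kernel
    depthwise convolution gives every position a wide receptive field
    (long-range information), and its output modulates the input features."""

    def __init__(self, channels: int, kernel_size: int = 13):
        super().__init__()
        self.proj_in = nn.Conv2d(channels, channels, kernel_size=1)
        # Depthwise large-kernel convolution: long-range spatial context at a
        # fraction of the cost of a dense convolution with the same kernel.
        self.dw_large = nn.Conv2d(channels, channels, kernel_size,
                                  padding=kernel_size // 2, groups=channels)
        self.proj_out = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.proj_out(self.dw_large(self.proj_in(x)))
        return x * attn  # attention-style feature modulation


if __name__ == "__main__":
    block = LargeKernelAttention(64)
    feat = torch.randn(2, 64, 32, 32)  # stand-in for a generator feature map
    print(block(feat).shape)           # torch.Size([2, 64, 32, 32])
```

The depthwise factorization is a common way to make large kernels affordable; it preserves the per-channel wide receptive field the abstract attributes to long-range attention while keeping parameter count near that of an ordinary 3x3 block.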
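The abstract describes AMA only at a high level: it enlarges the intra-class angle between a feature vector and its class center, making the classification loss harder to satisfy. Below is a minimal sketch of one way to realize that effect, assuming an additive angular margin in the style of margin-based softmax losses. The function name ama_logits and the margin and scale values are hypothetical, not the paper's definition.

```python
import torch
import torch.nn.functional as F

def ama_logits(features: torch.Tensor, class_centers: torch.Tensor,
               labels: torch.Tensor, margin: float = 0.2,
               scale: float = 16.0) -> torch.Tensor:
    """Hypothetical adversarial-margin-add step: add an angular margin to the
    angle between each feature and its own class center. This enlarges the
    intra-class angle, so the subsequent cross-entropy loss becomes harder to
    minimize -- the adversarial effect described in the abstract."""
    f = F.normalize(features, dim=1)       # unit-norm features, shape (B, D)
    w = F.normalize(class_centers, dim=1)  # unit-norm centers,  shape (C, D)
    cos = f @ w.t()                        # cosine(theta) to every center
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    # Enlarge only the angle to the ground-truth center: cos(theta + margin).
    onehot = F.one_hot(labels, num_classes=w.size(0)).bool()
    cos_margined = torch.where(onehot, torch.cos(theta + margin), cos)
    return scale * cos_margined            # feed into F.cross_entropy

# Usage sketch: loss = F.cross_entropy(ama_logits(feats, centers, y), y)
```

Because the margin is applied only to the ground-truth class, the network must pull each feature closer to its center than the margin-free objective would require, which matches the abstract's claim that AMA raises the convergence difficulty of the original loss.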