Deployable mixed-precision quantization with co-learning and one-time search.

Authors

Wang Shiguang, Zhang Zhongyu, Ai Guo, Cheng Jian

Affiliations

University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China.

Tencent Technology (Shanghai) Co. Ltd, Shanghai, 200233, China.

Publication

Neural Netw. 2025 Jan;181:106812. doi: 10.1016/j.neunet.2024.106812. Epub 2024 Oct 18.

Abstract

Mixed-precision quantization plays a pivotal role in deploying deep neural networks in resource-constrained environments. However, the task of finding the optimal bit-width configurations for different layers under deployable mixed-precision quantization has barely been explored and remains a challenge. In this work, we present Cobits, an efficient and effective deployable mixed-precision quantization framework based on the relationship between the range of the real-valued input and the quantized real-valued range. It assigns a higher bit-width to a quantizer with a narrower quantized real-valued range and a lower bit-width to a quantizer with a wider quantized real-valued range. Cobits employs a co-learning approach to entangle and learn quantization parameters across various bit-widths, distinguishing between shared and specific parts. The shared part collaborates, while the specific part isolates precision conflicts. Additionally, we upgrade the normal quantizer to a dynamic quantizer to mitigate statistical issues in the deployable mixed-precision supernet. Over the trained mixed-precision supernet, we utilize the quantized real-valued ranges to derive quantized-bit-sensitivity, which can serve as an importance indicator for efficiently determining bit-width configurations, eliminating the need for iterative validation-dataset evaluations. Extensive experiments show that Cobits outperforms previous state-of-the-art quantization methods on the ImageNet and COCO datasets while retaining superior efficiency. We show that this approach dynamically adapts to varying bit-widths and can generalize to various deployable backends. The code will be made public at https://github.com/sunnyxiaohu/cobits.
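The abstract only outlines the one-time search, so as an illustration only, the following is a minimal, hypothetical Python sketch of the stated rule: treat layers whose quantizers have a narrower quantized real-valued range as more bit-sensitive and give them higher bit-widths, under a simple weight-bit budget, without any validation-set evaluation. The function name `assign_bitwidths`, the greedy budgeted upgrade, and the example layer names are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of the one-time bit-width selection idea described in the
# abstract: a narrower quantized real-valued range is taken as higher
# bit-sensitivity, so such layers are upgraded to higher bit-widths first,
# subject to a total weight-bit budget. Not the paper's actual algorithm.

from typing import Dict, List


def assign_bitwidths(
    quantized_ranges: Dict[str, float],   # per-layer quantized real-valued range
    layer_params: Dict[str, int],         # per-layer parameter counts
    candidate_bits: List[int],            # e.g. [2, 4, 8]
    budget_bits: float,                   # total weight-bit budget
) -> Dict[str, int]:
    """Greedy one-time assignment: start every layer at the lowest bit-width,
    then upgrade layers in order of decreasing sensitivity (narrower range
    first) while the budget allows. No validation-set evaluation is used."""
    bits = sorted(candidate_bits)
    config = {name: bits[0] for name in quantized_ranges}
    used = sum(layer_params[n] * config[n] for n in config)

    # Narrower quantized range -> higher assumed sensitivity -> upgraded first.
    order = sorted(quantized_ranges, key=lambda n: quantized_ranges[n])

    for step in bits[1:]:                 # try each higher candidate bit-width
        for name in order:
            if config[name] >= step:
                continue
            extra = layer_params[name] * (step - config[name])
            if used + extra <= budget_bits:
                config[name] = step
                used += extra
    return config


if __name__ == "__main__":
    # Toy example with made-up layer names, ranges, and sizes.
    ranges = {"conv1": 0.8, "conv2": 2.5, "fc": 1.2}
    params = {"conv1": 10_000, "conv2": 50_000, "fc": 20_000}
    print(assign_bitwidths(ranges, params, [2, 4, 8], budget_bits=300_000))
```

In this toy run, the layer with the narrowest range (conv1) ends up at the highest bit-width and the widest-range layer (conv2) stays at the lowest, which mirrors the assignment rule stated in the abstract; how Cobits actually scores and searches configurations is detailed only in the paper itself.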

