Deployable mixed-precision quantization with co-learning and one-time search

Authors

Wang Shiguang, Zhang Zhongyu, Ai Guo, Cheng Jian

Affiliations

University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China.

Tencent Technology (Shanghai) Co. Ltd, Shanghai, 200233, China.

Publication

Neural Netw. 2025 Jan;181:106812. doi: 10.1016/j.neunet.2024.106812. Epub 2024 Oct 18.

DOI: 10.1016/j.neunet.2024.106812
PMID: 39481201
Abstract

Mixed-precision quantization plays a pivotal role in deploying deep neural networks in resource-constrained environments. However, the task of finding the optimal bit-width configurations for different layers under deployable mixed-precision quantization has barely been explored and remains a challenge. In this work, we present Cobits, an efficient and effective deployable mixed-precision quantization framework based on the relationship between the range of real-valued inputs and the range of quantized real values. It assigns a higher bit-width to the quantizer with a narrower quantized real-valued range and a lower bit-width to the quantizer with a wider quantized real-valued range. Cobits employs a co-learning approach to entangle and learn quantization parameters across various bit-widths, distinguishing between shared and specific parts. The shared part collaborates, while the specific part isolates precision conflicts. Additionally, we upgrade the normal quantizer to a dynamic quantizer to mitigate statistical issues in the deployable mixed-precision supernet. Over the trained mixed-precision supernet, we utilize the quantized real-valued ranges to derive quantized-bit-sensitivity, which can serve as an importance indicator for efficiently determining bit-width configurations, eliminating the need for iterative validation-dataset evaluations. Extensive experiments show that Cobits outperforms previous state-of-the-art quantization methods on the ImageNet and COCO datasets while retaining superior efficiency. We show this approach dynamically adapts to varying bit-widths and can generalize to various deployable backends. The code will be made public at https://github.com/sunnyxiaohu/cobits.
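The core premise above can be sketched in a few lines: a uniform fake-quantizer, plus a heuristic that ranks layers by the width of their quantized real-valued range and gives narrower ranges higher bit-widths. This is a minimal illustration of the idea, not the authors' implementation; `fake_quantize`, `assign_bit_widths`, and the evenly-bucketed ranking are hypothetical simplifications.

```python
def fake_quantize(x, scale, zero_point, bits):
    """Uniform fake-quantization: round onto the integer grid for `bits`,
    clamp to [0, 2^bits - 1], then map back to real values."""
    qmin, qmax = 0, 2 ** bits - 1
    q = [min(max(round(v / scale) + zero_point, qmin), qmax) for v in x]
    return [(qi - zero_point) * scale for qi in q]

def assign_bit_widths(ranges, candidate_bits=(2, 4, 8)):
    """Toy version of the range-based rule: a narrower quantized real-valued
    range gets a higher bit-width. Layers are ranked by range width and mapped
    evenly onto the candidate bit-widths (highest bits to narrowest range)."""
    order = sorted(range(len(ranges)), key=lambda i: ranges[i])  # narrow first
    bits_desc = sorted(candidate_bits, reverse=True)             # high bits first
    n, k = len(ranges), len(bits_desc)
    assignment = [0] * n
    for rank, layer in enumerate(order):
        assignment[layer] = bits_desc[min(rank * k // n, k - 1)]
    return assignment
```

In the paper the sensitivity indicator is derived from the trained supernet's learned ranges in one pass, which is what removes the iterative validation-set search; the sketch only shows the direction of the mapping, not how the ranges are learned.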


Similar Articles

1
Deployable mixed-precision quantization with co-learning and one-time search.
Neural Netw. 2025 Jan;181:106812. doi: 10.1016/j.neunet.2024.106812. Epub 2024 Oct 18.
2
Hessian-based mixed-precision quantization with transition aware training for neural networks.
Neural Netw. 2025 Feb;182:106910. doi: 10.1016/j.neunet.2024.106910. Epub 2024 Nov 16.
3
Data Quality-Aware Mixed-Precision Quantization via Hybrid Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2025 May;36(5):9018-9031. doi: 10.1109/TNNLS.2024.3409692. Epub 2025 May 2.
4
Fast and Effective: A Novel Sequential Single-Path Search for Mixed-Precision-Quantized Networks.
IEEE Trans Cybern. 2023 Oct;53(10):6187-6199. doi: 10.1109/TCYB.2022.3164285. Epub 2023 Sep 15.
5
ADFQ-ViT: Activation-Distribution-Friendly post-training Quantization for Vision Transformers.
Neural Netw. 2025 Jun;186:107289. doi: 10.1016/j.neunet.2025.107289. Epub 2025 Feb 22.
6
Exploiting Retraining-Based Mixed-Precision Quantization for Low-Cost DNN Accelerator Design.
IEEE Trans Neural Netw Learn Syst. 2021 Jul;32(7):2925-2938. doi: 10.1109/TNNLS.2020.3008996. Epub 2021 Jul 6.
7
GradFreeBits: Gradient-Free Bit Allocation for Mixed-Precision Neural Networks.
Sensors (Basel). 2022 Dec 13;22(24):9772. doi: 10.3390/s22249772.
8
MedQ: Lossless ultra-low-bit neural network quantization for medical image segmentation.
Med Image Anal. 2021 Oct;73:102200. doi: 10.1016/j.media.2021.102200. Epub 2021 Aug 2.
9
Whether the Support Region of Three-Bit Uniform Quantizer Has a Strong Impact on Post-Training Quantization for MNIST Dataset?
Entropy (Basel). 2021 Dec 20;23(12):1699. doi: 10.3390/e23121699.
10
Training high-performance and large-scale deep neural networks with full 8-bit integers.
Neural Netw. 2020 May;125:70-82. doi: 10.1016/j.neunet.2019.12.027. Epub 2020 Jan 15.