


Fast and Effective: A Novel Sequential Single-Path Search for Mixed-Precision-Quantized Networks.

Authors

Sun Qigong, Li Xiufang, Jiao Licheng, Ren Yan, Shang Fanhua, Liu Fang

Publication

IEEE Trans Cybern. 2023 Oct;53(10):6187-6199. doi: 10.1109/TCYB.2022.3164285. Epub 2023 Sep 15.

DOI: 10.1109/TCYB.2022.3164285
PMID: 35468073
Abstract

Model quantization can reduce model size and computational latency, and it has been successfully applied in many applications on mobile phones, embedded devices, and smart chips. Mixed-precision quantization models can assign different bit precisions to different layers according to each layer's sensitivity, achieving great performance. However, it is difficult to quickly determine the quantization bit precision of each layer in deep neural networks under constraints such as hardware resources, energy consumption, model size, and computational latency. In this article, a novel sequential single-path search (SSPS) method for mixed-precision model quantization is proposed, in which the given constraints are introduced to guide the search process. A single-path search cell is proposed to build a fully differentiable supernet, which can be optimized by gradient-based algorithms. Moreover, we sequentially determine the candidate precisions according to their selection certainties, which exponentially reduces the search space and speeds up the convergence of the search. Experiments show that our method can efficiently search mixed-precision models for different architectures (for example, ResNet-20, 18, 34, 50, and MobileNet-V2) and datasets (for example, CIFAR-10, ImageNet, and COCO) under given constraints, and the searched models significantly outperform their uniform-precision counterparts.
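The search scheme the abstract describes can be sketched in a few lines: each layer holds a differentiable (softmax-weighted) mixture over candidate bit widths, and candidates are pruned sequentially so the search space shrinks as the selection becomes certain. The sketch below is a minimal NumPy illustration under assumed details; the names `SinglePathCell` and `quantize`, the (2, 4, 8) candidate set, and the pruning rule are illustrative, not the paper's exact formulation.

```python
import numpy as np

def quantize(x, bits):
    """Uniform symmetric quantization of x (assumed in [-1, 1]) to `bits` bits."""
    levels = 2 ** (bits - 1) - 1
    return np.round(np.clip(x, -1.0, 1.0) * levels) / levels

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

class SinglePathCell:
    """One layer's differentiable selection over candidate bit widths.

    `alpha` holds one architecture logit per candidate precision; the cell's
    output is the softmax-weighted mixture of the quantized weights, so the
    supernet stays fully differentiable and can be trained by gradient descent.
    """
    def __init__(self, candidates=(2, 4, 8)):
        self.candidates = list(candidates)
        self.alpha = np.zeros(len(self.candidates))  # architecture parameters

    def forward(self, w):
        probs = softmax(self.alpha)
        return sum(p * quantize(w, b) for p, b in zip(probs, self.candidates))

    def certainty(self):
        """Selection certainty: probability mass on the leading candidate."""
        return softmax(self.alpha).max()

    def prune_least_likely(self):
        """Sequential step: once this layer's selection is confident enough,
        drop its least probable candidate, shrinking the search space."""
        if len(self.candidates) > 1:
            k = int(np.argmin(self.alpha))
            del self.candidates[k]
            self.alpha = np.delete(self.alpha, k)

cell = SinglePathCell()
cell.alpha = np.array([0.1, 2.0, 0.1])  # pretend gradient updates favored 4 bits
w = np.linspace(-1, 1, 5)
mixed = cell.forward(w)                 # soft mixture of 2/4/8-bit quantized weights
cell.prune_least_likely()               # search space for this layer: 3 -> 2 candidates
```

Repeating the prune step over all layers as their certainties rise is what gives the exponential reduction of the joint search space: each fixed layer removes a whole factor of candidates from the product space.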


Similar Articles

1. Fast and Effective: A Novel Sequential Single-Path Search for Mixed-Precision-Quantized Networks.
IEEE Trans Cybern. 2023 Oct;53(10):6187-6199. doi: 10.1109/TCYB.2022.3164285. Epub 2023 Sep 15.

2. Exploiting Retraining-Based Mixed-Precision Quantization for Low-Cost DNN Accelerator Design.
IEEE Trans Neural Netw Learn Syst. 2021 Jul;32(7):2925-2938. doi: 10.1109/TNNLS.2020.3008996. Epub 2021 Jul 6.

3. Quantization Friendly MobileNet (QF-MobileNet) Architecture for Vision Based Applications on Embedded Platforms.
Neural Netw. 2021 Apr;136:28-39. doi: 10.1016/j.neunet.2020.12.022. Epub 2020 Dec 29.

4. GradFreeBits: Gradient-Free Bit Allocation for Mixed-Precision Neural Networks.
Sensors (Basel). 2022 Dec 13;22(24):9772. doi: 10.3390/s22249772.

5. Model Compression Based on Differentiable Network Channel Pruning.
IEEE Trans Neural Netw Learn Syst. 2023 Dec;34(12):10203-10212. doi: 10.1109/TNNLS.2022.3165123. Epub 2023 Nov 30.

6. Data Quality-Aware Mixed-Precision Quantization via Hybrid Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2025 May;36(5):9018-9031. doi: 10.1109/TNNLS.2024.3409692. Epub 2025 May 2.

7. Vertical Layering of Quantized Neural Networks for Heterogeneous Inference.
IEEE Trans Pattern Anal Mach Intell. 2023 Dec;45(12):15964-15978. doi: 10.1109/TPAMI.2023.3319045. Epub 2023 Nov 3.

8. IVS-Caffe-Hardware-Oriented Neural Network Model Development.
IEEE Trans Neural Netw Learn Syst. 2022 Oct;33(10):5978-5992. doi: 10.1109/TNNLS.2021.3072145. Epub 2022 Oct 5.

9. MBFQuant: A Multiplier-Bitwidth-Fixed, Mixed-Precision Quantization Method for Mobile CNN-Based Applications.
IEEE Trans Image Process. 2023;32:2438-2453. doi: 10.1109/TIP.2023.3268562. Epub 2023 May 1.

10. A Hardware-Friendly Low-Bit Power-of-Two Quantization Method for CNNs and Its FPGA Implementation.
Sensors (Basel). 2022 Sep 1;22(17):6618. doi: 10.3390/s22176618.