EfficientQ: An efficient and accurate post-training neural network quantization method for medical image segmentation.

Affiliation

Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.

Publication information

Med Image Anal. 2024 Oct;97:103277. doi: 10.1016/j.media.2024.103277. Epub 2024 Jul 22.

Abstract

Model quantization is a promising technique that can simultaneously compress and accelerate a deep neural network by limiting its computation bit-width, which plays a crucial role in the fast-growing AI industry. Despite model quantization's success in producing well-performing low-bit models, the quantization process itself can still be expensive, which may involve a long fine-tuning stage on a large, well-annotated training set. To make the quantization process more efficient in terms of both time and data requirements, this paper proposes a fast and accurate post-training quantization method, namely EfficientQ. We develop this new method with a layer-wise optimization strategy and leverage the powerful alternating direction method of multipliers (ADMM) algorithm to ensure fast convergence. Furthermore, a weight regularization scheme is incorporated to provide more guidance for the optimization of the discrete weights, and a self-adaptive attention mechanism is proposed to combat the class imbalance problem. Extensive comparison and ablation experiments are conducted on two publicly available medical image segmentation datasets, i.e., LiTS and BraTS2020, and the results demonstrate the superiority of the proposed method over various existing post-training quantization methods in terms of both accuracy and optimization speed. Remarkably, with EfficientQ, the quantization of a practical 3D UNet only requires less than 5 min on a single GPU and one data sample. The source code is available at https://github.com/rongzhao-zhang/EfficientQ.
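As a rough illustration of what "limiting the computation bit-width" means, the following minimal sketch applies symmetric uniform quantization to a handful of weights. It is not the EfficientQ algorithm (which, per the abstract, uses layer-wise optimization with ADMM); the function name `quantize_uniform`, the rounding scheme, and the sample weights are assumptions made purely for illustration.

```python
def quantize_uniform(weights, num_bits=4):
    """Symmetric uniform quantization of a list of floats (illustrative only).

    Hypothetical helper: maps each weight to a signed integer code on a
    grid with 2**(num_bits - 1) - 1 positive levels, then maps it back.
    """
    if num_bits < 2:
        raise ValueError("num_bits must be at least 2 for a signed grid")
    qmax = 2 ** (num_bits - 1) - 1  # largest integer code, e.g. 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # step size of the grid
    # Round to the nearest grid point and clamp to the signed integer range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    # Dequantize to see the values a low-bit model would effectively use.
    dequantized = [c * scale for c in codes]
    return codes, scale, dequantized


# Assumed sample weights, chosen only to make the rounding easy to follow.
weights = [0.7, -0.3, 0.12, 0.0]
codes, scale, approx = quantize_uniform(weights, num_bits=4)
for w, c, a in zip(weights, codes, approx):
    print(f"weight={w:+.3f}  code={c:+d}  dequantized={a:+.3f}")
```

Simple rounding like this ignores how the error propagates through a layer's outputs; post-training quantization methods such as the one described above instead optimize the discrete weights against calibration data.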

Similar articles

MedQ: Lossless ultra-low-bit neural network quantization for medical image segmentation.

Med Image Anal. 2021 Oct;73:102200. doi: 10.1016/j.media.2021.102200. Epub 2021 Aug 2.

Discretely-constrained deep network for weakly supervised segmentation.

Neural Netw. 2020 Oct;130:297-308. doi: 10.1016/j.neunet.2020.07.011. Epub 2020 Jul 18.

Optimization-Based Post-Training Quantization With Bit-Split and Stitching.

IEEE Trans Pattern Anal Mach Intell. 2023 Feb;45(2):2119-2135. doi: 10.1109/TPAMI.2022.3159369. Epub 2023 Jan 6.

URCA: Uncertainty-based region clipping algorithm for semi-supervised medical image segmentation.

Comput Methods Programs Biomed. 2024 Sep;254:108278. doi: 10.1016/j.cmpb.2024.108278. Epub 2024 Jun 11.

Efficient fetal ultrasound image segmentation for automatic head circumference measurement using a lightweight deep convolutional neural network.

Med Phys. 2022 Aug;49(8):5081-5092. doi: 10.1002/mp.15700. Epub 2022 May 24.

MultiTrans: Multi-branch transformer network for medical image segmentation.

Comput Methods Programs Biomed. 2024 Sep;254:108280. doi: 10.1016/j.cmpb.2024.108280. Epub 2024 Jun 8.

Training high-performance and large-scale deep neural networks with full 8-bit integers.

Neural Netw. 2020 May;125:70-82. doi: 10.1016/j.neunet.2019.12.027. Epub 2020 Jan 15.

EMONAS-Net: Efficient multiobjective neural architecture search using surrogate-assisted evolutionary algorithm for 3D medical image segmentation.

Artif Intell Med. 2021 Sep;119:102154. doi: 10.1016/j.artmed.2021.102154. Epub 2021 Aug 24.

Fast-SNN: Fast Spiking Neural Network by Converting Quantized ANN.

IEEE Trans Pattern Anal Mach Intell. 2023 Dec;45(12):14546-14562. doi: 10.1109/TPAMI.2023.3275769. Epub 2023 Nov 3.
