EfficientQ: An efficient and accurate post-training neural network quantization method for medical image segmentation.

Affiliation

Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.

Publication information

Med Image Anal. 2024 Oct;97:103277. doi: 10.1016/j.media.2024.103277. Epub 2024 Jul 22.

Abstract

Model quantization is a promising technique that can simultaneously compress and accelerate a deep neural network by limiting its computation bit-width, which plays a crucial role in the fast-growing AI industry. Despite model quantization's success in producing well-performing low-bit models, the quantization process itself can still be expensive, which may involve a long fine-tuning stage on a large, well-annotated training set. To make the quantization process more efficient in terms of both time and data requirements, this paper proposes a fast and accurate post-training quantization method, namely EfficientQ. We develop this new method with a layer-wise optimization strategy and leverage the powerful alternating direction method of multipliers (ADMM) algorithm to ensure fast convergence. Furthermore, a weight regularization scheme is incorporated to provide more guidance for the optimization of the discrete weights, and a self-adaptive attention mechanism is proposed to combat the class imbalance problem. Extensive comparison and ablation experiments are conducted on two publicly available medical image segmentation datasets, i.e., LiTS and BraTS2020, and the results demonstrate the superiority of the proposed method over various existing post-training quantization methods in terms of both accuracy and optimization speed. Remarkably, with EfficientQ, the quantization of a practical 3D UNet only requires less than 5 min on a single GPU and one data sample. The source code is available at https://github.com/rongzhao-zhang/EfficientQ.
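
The abstract names two ingredients, layer-wise optimization and ADMM, without giving the formulation. As a rough illustration only, the sketch below applies a generic textbook ADMM splitting to a single linear layer: it looks for weights on a low-bit grid whose outputs stay close to those of the full-precision layer on a small calibration batch. It is not the EfficientQ algorithm; the weight regularization and self-adaptive attention described above are omitted, and the function names, the fixed per-layer step size, and the values of `rho` and `iters` are all assumptions made for this example.

```python
import numpy as np

def project_to_grid(w, scale, bits):
    """Round to the nearest point of a symmetric uniform grid with 2**bits levels."""
    qmax = 2 ** (bits - 1) - 1
    return scale * np.clip(np.round(w / scale), -qmax - 1, qmax)

def admm_quantize_layer(x, w_fp, bits=4, rho=1.0, iters=50):
    """Quantize one linear layer so that x @ q stays close to x @ w_fp.

    Generic ADMM splitting (an assumption, not the paper's formulation):
        minimize 0.5 * ||x @ w_fp - x @ w||^2   subject to   w = q, q on the grid
    """
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w_fp).max() / qmax            # fixed per-layer step size (assumption)
    gram = x.T @ x                               # (d_in, d_in) Gram matrix of the inputs
    lhs = gram + rho * np.eye(gram.shape[0])     # constant left-hand side of the w-update
    rhs_fixed = gram @ w_fp
    q = project_to_grid(w_fp, scale, bits)       # start from plain rounding
    u = np.zeros_like(w_fp)                      # scaled dual variable
    for _ in range(iters):
        # w-update: closed-form least squares with a proximal term pulling w toward q - u
        w = np.linalg.solve(lhs, rhs_fixed + rho * (q - u))
        # q-update: projection of w + u onto the quantization grid
        q = project_to_grid(w + u, scale, bits)
        # dual update: accumulate the remaining mismatch between w and q
        u = u + w - q
    return q, scale

# Toy calibration batch and layer weights (random data, for illustration only)
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 16))
w_fp = rng.normal(size=(16, 8))
q, scale = admm_quantize_layer(x, w_fp, bits=4)
```

Each layer is handled independently here and needs only its own input activations, which is what makes a layer-wise scheme cheap in both time and data. Because the grid constraint is non-convex, a splitting like this comes with no convergence guarantee, so in practice the result would be compared against plain rounding on held-out data.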
