Lo Kwee-Seong Medical Image Analysis Laboratory, Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong.
Med Image Anal. 2021 Oct;73:102200. doi: 10.1016/j.media.2021.102200. Epub 2021 Aug 2.
Implementing deep convolutional neural networks (CNNs) with Boolean arithmetic is an attractive way to eliminate the notoriously high computational expense of deep learning models. However, although previous works have achieved lossless model compression via weight-only quantization, how to reduce the computation precision of CNNs without losing performance remains an open problem, especially for medical image segmentation tasks, where data dimensionality is high and annotation is scarce. This paper presents a novel CNN quantization framework that can squeeze a deep model (both parameters and activations) to extremely low bitwidths, e.g., 1∼2 bits, while maintaining its high performance. In the new method, we first design a strong baseline quantizer with an optimizable quantization range. Then, to relieve the back-propagation difficulty caused by the discontinuous quantization function, we design a radical residual connection scheme that allows gradients to flow freely through every quantized layer. Moreover, a tanh-based derivative function is used to further improve gradient flow, and a distributional loss is employed to regularize the model output. Extensive experiments and ablation studies are conducted on two well-established public 3D segmentation datasets, BRATS2020 and LiTS. The results show that our framework not only significantly outperforms state-of-the-art quantization approaches, but also achieves lossless performance on both datasets with ternary (2-bit) quantization.
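To make the two core ideas concrete, the sketch below shows what a ternary quantizer with an optimizable range and a tanh-based surrogate derivative could look like. This is a minimal illustration, not the paper's implementation: the function names (`ternary_quantize`, `tanh_surrogate_grad`) and the exact form of the surrogate are assumptions, and a real training setup would apply these element-wise to tensors with the range `alpha` learned by back-propagation.

```python
import math

def ternary_quantize(x, alpha):
    """Ternary (2-bit) quantizer with a learnable range alpha:
    maps a scalar x to one of {-alpha, 0, +alpha}.
    (Illustrative sketch; not the paper's exact quantizer.)"""
    # Scale into the quantization range, round to the nearest level,
    # then clip so the output stays within {-1, 0, +1} before rescaling.
    level = max(-1.0, min(1.0, round(x / alpha)))
    return alpha * level

def tanh_surrogate_grad(x, alpha):
    """Tanh-based surrogate derivative for the backward pass.
    The true derivative of the rounding step is zero almost everywhere,
    so a smooth stand-in (here 1 - tanh(x/alpha)^2, the derivative of
    tanh(x/alpha)) is used to let gradients flow; the exact surrogate
    in the paper may differ."""
    t = math.tanh(x / alpha)
    return 1.0 - t * t
```

In a straight-through-style setup, the forward pass uses `ternary_quantize` while the backward pass substitutes `tanh_surrogate_grad` for the quantizer's derivative, which is one common way to train through discontinuous quantization functions.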