Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing, China; AI Lab, Lenovo Research, Beijing, China.
AI Lab, Lenovo Research, Beijing, China.
Comput Methods Programs Biomed. 2021 Apr;202:106004. doi: 10.1016/j.cmpb.2021.106004. Epub 2021 Feb 23.
Coronavirus disease 2019 (COVID-19) is a highly contagious disease that has spread around the world. Deep learning has been adopted as an effective technique to aid COVID-19 detection and segmentation from computed tomography (CT) images. The major challenge is the shortage of public COVID-19 datasets. Transfer learning, which takes the knowledge gained while solving one problem and applies it to a different but related problem, has recently become a widely used technique. However, it remains unclear whether various non-COVID19 lung lesions can contribute to segmenting COVID-19 infection areas, and how best to conduct this transfer procedure. This paper provides a way to understand the transferability of non-COVID19 lung lesions and a better strategy for training a robust deep learning model for COVID-19 infection segmentation.
Based on a publicly available COVID-19 CT dataset and three public non-COVID19 datasets, we evaluate four transfer learning methods using 3D U-Net as the standard encoder-decoder architecture. i) We introduce a multi-task learning method to obtain a multi-lesion pre-trained model for COVID-19 infection. ii) We propose and compare four transfer learning strategies that differ in performance gain and training time cost. Our proposed Hybrid-encoder Learning strategy introduces a Dedicated-encoder and an Adapted-encoder to extract COVID-19 infection features and general lung lesion features, respectively. An attention-based Selective Fusion unit is designed for dynamic feature selection and aggregation.
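The abstract does not specify how the Selective Fusion unit is built. As a rough illustration only, the sketch below fuses two encoder feature maps with per-channel attention weights, in the style of selective-kernel attention: the two maps are summed, global-average-pooled into a channel descriptor, passed through a small bottleneck, and turned into softmax weights over the two branches. The function name `selective_fusion`, the weight matrices `w_reduce` and `w_branch`, and all shapes are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def selective_fusion(dedicated, adapted, w_reduce, w_branch):
    """Fuse two feature maps with per-channel attention (hypothetical sketch).

    dedicated, adapted: arrays of shape (C, D, H, W), one per encoder.
    w_reduce: (C, R) bottleneck weights (assumed, would be learned).
    w_branch: (2, R, C) per-branch expansion weights (assumed, would be learned).
    Returns the fused map (C, D, H, W) and the branch weights (2, C).
    """
    # Channel descriptor: global average pool of the summed features.
    descriptor = (dedicated + adapted).mean(axis=(1, 2, 3))   # (C,)
    # Bottleneck with ReLU.
    hidden = np.maximum(descriptor @ w_reduce, 0.0)           # (R,)
    # One logit per branch and channel, normalised across the two branches.
    logits = np.stack([hidden @ w_branch[0], hidden @ w_branch[1]])  # (2, C)
    weights = softmax(logits, axis=0)                          # (2, C)
    # Weighted sum of the two branches, broadcast over the spatial dims.
    fused = (weights[0][:, None, None, None] * dedicated
             + weights[1][:, None, None, None] * adapted)
    return fused, weights


# Usage with random placeholders standing in for encoder outputs.
rng = np.random.default_rng(0)
C, R = 8, 4
dedicated = rng.normal(size=(C, 4, 6, 6))
adapted = rng.normal(size=(C, 4, 6, 6))
w_reduce = rng.normal(size=(C, R))
w_branch = rng.normal(size=(2, R, C))
fused, weights = selective_fusion(dedicated, adapted, w_reduce, w_branch)
```

Because the weights for each channel sum to one across the two branches, the fused map is a convex combination of the Dedicated-encoder and Adapted-encoder features; in a trained network the weight matrices would be learned jointly with the rest of the model.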
Experiments show that, when trained with limited data, the proposed Hybrid-encoder strategy based on the multi-lesion pre-trained model achieves a mean DSC, NSD, Sensitivity, F1-score, Accuracy and MCC of 0.704, 0.735, 0.682, 0.707, 0.994 and 0.716, respectively, with better generalization and lower over-fitting risk for segmenting COVID-19 infection.
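For reference, the voxel-overlap metrics named above can be computed from the confusion counts of a binary mask, as in the minimal sketch below. It uses the standard definitions of the Dice similarity coefficient (DSC), Sensitivity, Accuracy and the Matthews correlation coefficient (MCC); the function name `segmentation_metrics` and the toy masks are illustrative assumptions, and the paper's exact averaging protocol is not stated in the abstract. NSD is omitted because it requires surface distances and a tolerance that the abstract does not give.

```python
import math

import numpy as np


def segmentation_metrics(pred, truth):
    """Overlap metrics for two binary masks of identical shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    # Confusion counts over all voxels (Python ints avoid overflow).
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    tn = int(np.sum(~pred & ~truth))

    def ratio(num, den):
        # Return 0.0 when the denominator is empty (convention for this sketch).
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "dsc": ratio(2 * tp, 2 * tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Toy example: 8 voxels with TP=2, FP=1, FN=1, TN=4.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
metrics = segmentation_metrics(pred, truth)
```

Accuracy is dominated by the large background class in CT volumes, which is consistent with the reported value of 0.994 sitting far above the overlap-based scores.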
The results reveal the benefits of transferring knowledge from non-COVID19 lung lesions: learning from multiple lung lesion datasets extracts more general features, leading to accurate and robust pre-trained models. We further show the capability of the encoder to learn feature representations of lung lesions, which improves segmentation accuracy and facilitates training convergence. In addition, our proposed Hybrid-encoder learning method effectively incorporates lung lesion features transferred from non-COVID19 datasets and achieves significant improvement. These findings provide new insights into transfer learning for COVID-19 CT image segmentation and can be further generalized to other medical tasks.