IEEE Trans Med Imaging. 2022 Jul;41(7):1874-1884. doi: 10.1109/TMI.2022.3149344. Epub 2022 Jun 30.
Lung nodule malignancy prediction is an essential step in the early diagnosis of lung cancer. Beyond the commonly discussed difficulties, this task is also challenged by the ambiguous labels that annotators provide, since deep learning models have in some cases been found to reproduce or amplify human biases. In this paper, we propose a multi-view 'divide-and-rule' (MV-DAR) model that learns from both reliable and ambiguous annotations for lung nodule malignancy prediction on chest CT scans. According to the consistency and reliability of their annotations, we divide nodules into three sets: a consistent and reliable set (CR-Set), an inconsistent set (IC-Set), and a low reliable set (LR-Set). Each nodule in IC-Set is annotated inconsistently by multiple radiologists, and each nodule in LR-Set is annotated by only one radiologist. Although ambiguous, inconsistent labels still indicate which label(s) are excluded by all annotators, and the unreliable labels of a cohort of nodules are largely correct from a statistical point of view. Hence, both IC-Set and LR-Set can be used to facilitate the training of MV-DAR. MV-DAR contains three DAR models, which characterize a lung nodule from three orthographic views, and is trained in a two-stage procedure. Each DAR consists of three networks with the same architecture: a prediction network (Prd-Net), a counterfactual network (CF-Net), and a low reliable network (LR-Net), which are trained on CR-Set, IC-Set, and LR-Set, respectively, in the pretraining phase. In the fine-tuning phase, the image representation ability learned by CF-Net and LR-Net is transferred to Prd-Net through a negative-attention module (NA-Module) and a consistent-attention module (CA-Module), with the aim of boosting the prediction ability of Prd-Net. The MV-DAR model has been evaluated on the LIDC-IDRI and LUNGx datasets. Our results indicate not only the effectiveness of MV-DAR in learning from ambiguous labels but also its superiority over existing noisy-label learning models in lung nodule malignancy prediction.
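The division of nodules into CR-Set, IC-Set, and LR-Set described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: ratings are integers on a 1 to 5 scale, "consistent" means that all radiologists gave the identical rating, and the function names (`assign_set`, `excluded_labels`, `divide`) are hypothetical. The paper's actual criteria may differ.

```python
from typing import Dict, List, Set

# Assumption: a five-point malignancy rating scale; not stated in the abstract.
LABELS: Set[int] = {1, 2, 3, 4, 5}


def assign_set(ratings: List[int]) -> str:
    """Return 'CR', 'IC', or 'LR' for one nodule's radiologist ratings."""
    if not ratings:
        raise ValueError("a nodule needs at least one rating")
    if len(ratings) == 1:
        return "LR"  # only one radiologist: low reliable
    if len(set(ratings)) == 1:
        return "CR"  # several radiologists who all agree
    return "IC"      # several radiologists who disagree


def excluded_labels(ratings: List[int]) -> Set[int]:
    """Labels that no annotator chose, i.e. excluded by every annotator."""
    return LABELS - set(ratings)


def divide(nodules: Dict[str, List[int]]) -> Dict[str, List[str]]:
    """Group nodule identifiers by the set they fall into."""
    groups: Dict[str, List[str]] = {"CR": [], "IC": [], "LR": []}
    for nodule_id, ratings in nodules.items():
        groups[assign_set(ratings)].append(nodule_id)
    return groups


# Hypothetical example data, not taken from LIDC-IDRI or LUNGx.
example = {"n1": [4, 4, 4], "n2": [1, 2, 4], "n3": [3]}
groups = divide(example)
```

In this sketch, the excluded labels of an IC-Set nodule are the kind of signal a counterfactual network could learn from: the nodule's true label is unknown, but some labels are ruled out by every annotator.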