Yi Le, Zhang Lei, Xu Xiuyuan, Guo Jixiang
IEEE Trans Med Imaging. 2023 Jan;42(1):317-328. doi: 10.1109/TMI.2022.3211085. Epub 2022 Dec 29.
Radiographic attributes of lung nodules remedy the shortcomings of lung cancer computer-assisted diagnosis systems by providing an interpretable diagnostic reference for doctors. However, current studies do not specifically address multi-label classification of lung nodules using convolutional neural networks (CNNs), and they make poor use of the statistical dependency between labels. In addition, data imbalance is an unavoidable problem when employing CNNs for lung nodule classification, and it poses even greater challenges in the multi-label setting. In this paper, we propose a method called MLSL-Net to discriminate lung nodule characteristics while addressing these challenges. Specifically, the proposal employs a multi-label softmax loss (MLSL) as the performance index, aiming to reduce the ranking errors between labels and within labels during training, thereby optimizing ranking loss and AUC directly. Such criteria better evaluate a classifier's performance on a multi-label imbalanced dataset. Furthermore, a scale factor is introduced based on an investigation of the max surrogate function. Unlike in preceding usages, a small factor is used in order to narrow the discrepancy between the gradients produced by different labels. More interestingly, this factor also facilitates the exploitation of label dependency. Experimental results on the LIDC-IDRI dataset and on another similar dataset demonstrate that MLSL-Net can effectively perform multi-label classification despite the imbalance issue. The results also confirm that the factor is responsible for capturing label correlations, which in turn leads to more accurate predictions.
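The abstract does not give the MLSL formula, so the following is only a minimal sketch of the general idea it describes: a smooth (log-sum-exp) surrogate of the max over pairwise ranking errors, controlled by a scale factor. The function names `smooth_rank_loss` and `pair_weights`, the parameter `gamma`, and the exact form of the loss are assumptions made for illustration and should not be read as the authors' formulation. Applied to one sample's label scores it penalizes ranking errors between labels; applied to one label's scores across samples it penalizes errors within a label, which relates to AUC.

```python
import numpy as np


def _pairwise_margins(pos_scores, neg_scores):
    """Return every (negative - positive) score difference as a flat array."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    return (neg[:, None] - pos[None, :]).ravel()


def smooth_rank_loss(pos_scores, neg_scores, gamma=1.0):
    """Illustrative smooth ranking surrogate (not the paper's exact loss).

    Computes (1 / gamma) * log(1 + sum_ij exp(gamma * (neg_i - pos_j))).
    As gamma grows this approaches max(0, worst ranking violation);
    a small gamma spreads the penalty over all pairs.
    """
    margins = _pairwise_margins(pos_scores, neg_scores)
    # The leading 0.0 stands for the "1 +" term inside the logarithm.
    z = gamma * np.concatenate(([0.0], margins))
    z_max = z.max()
    # Numerically stable log-sum-exp.
    return (z_max + np.log(np.exp(z - z_max).sum())) / gamma


def pair_weights(pos_scores, neg_scores, gamma=1.0):
    """Gradient of the loss with respect to each pairwise margin."""
    margins = _pairwise_margins(pos_scores, neg_scores)
    z = gamma * np.concatenate(([0.0], margins))
    e = np.exp(z - z.max())
    return (e / e.sum())[1:]  # drop the entry for the constant term


# Hypothetical scores for one nodule: two positive and two negative attributes.
pos = [2.0, 0.5]
neg = [1.0, -1.0]
small = pair_weights(pos, neg, gamma=0.1)
large = pair_weights(pos, neg, gamma=10.0)
```

In this sketch, a small `gamma` gives the pairs nearly equal gradient weights, whereas a large `gamma` concentrates almost all of the gradient on the single worst-ranked pair. This is one plausible reading of how a small factor could narrow the discrepancy between gradients from different labels.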