Medical Engineering Technology and Data Mining Institute, Zhengzhou University, Zhengzhou, 450001, Henan, China.
School of Information Engineering, Zhengzhou University, Zhengzhou, 450001, Henan, China.
Interdiscip Sci. 2022 Mar;14(1):130-140. doi: 10.1007/s12539-021-00472-1. Epub 2021 Nov 2.
Given the urgent need for computer-aided tools that give physicians objective decision support, this paper proposes an ensemble learning method for distinguishing malignant from benign pulmonary nodules. The aims are to reduce the false positive rate of pulmonary nodule detection on CT and to improve the accuracy of lung nodule recognition.
First, a multi-layer feature fusion YOLOv3 network, trained on a public data set, was used to detect lung nodules. Second, a CNN was trained to differentiate benign from malignant pulmonary nodules. Then, following the idea of ensemble learning, the confidence probabilities output by these two models, together with the training set labels, were used to build a logistic regression model. Finally, two test sets (one public, one private) were evaluated: the confidence probabilities output by the two models were fed into the fitted logistic regression model to classify each pulmonary nodule as benign or malignant.
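The fusion step described above can be sketched as a small stacking example. This is only an illustration under stated assumptions, not the authors' implementation: the confidence values are synthetic, the function names (`fit_logistic`, `fuse`) and the learning-rate and epoch settings are hypothetical, and the logistic regression is fitted with plain batch gradient descent so that the block needs nothing beyond the Python standard library.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit a two-feature logistic regression by batch gradient descent.

    features: list of (p_yolo, p_cnn) confidence pairs (hypothetical inputs)
    labels:   list of 0 (benign) / 1 (malignant) training labels
    """
    w = [0.0, 0.0]
    b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for (p_yolo, p_cnn), y in zip(features, labels):
            err = sigmoid(w[0] * p_yolo + w[1] * p_cnn + b) - y
            grad_w[0] += err * p_yolo
            grad_w[1] += err * p_cnn
            grad_b += err
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def fuse(w, b, p_yolo, p_cnn):
    # Fused probability that a nodule is malignant.
    return sigmoid(w[0] * p_yolo + w[1] * p_cnn + b)


# Synthetic training data (an assumption for illustration only):
# malignant nodules tend to receive higher confidences from both models.
random.seed(0)
features, labels = [], []
for _ in range(100):
    features.append((random.uniform(0.5, 1.0), random.uniform(0.6, 1.0)))
    labels.append(1)
    features.append((random.uniform(0.0, 0.6), random.uniform(0.0, 0.5)))
    labels.append(0)

w, b = fit_logistic(features, labels)
high = fuse(w, b, 0.90, 0.95)  # both base models are confident
low = fuse(w, b, 0.10, 0.05)   # both base models are doubtful
print(f"fused(high)={high:.3f}, fused(low)={low:.3f}")
```

Thresholding the fused probability at 0.5 gives the benign or malignant decision. In a real pipeline the two features would be the confidences produced by the trained YOLOv3 network and the CNN for the same nodule.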
The trained YOLOv3 network was applied to the chest CT images of the test sets. It detected 356 pulmonary nodules in the public test set and 314 in the private test set. Its accuracy, sensitivity and specificity on the two test sets were 80.97%, 81.63%, 78.75% and 79.69%, 86.59%, 72.16%, respectively. For the CNN-based benign/malignant discrimination model, the accuracy, sensitivity and specificity on the two test sets were 90.12%, 90.66%, 89.47% and 88.57%, 85.62%, 90.87%, respectively. For the fused model based on the YOLOv3 network and the CNN, the accuracy, sensitivity and specificity on the two test sets were 93.82%, 94.85%, 92.59% and 92.31%, 92.68%, 91.89%, respectively.
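For reference, the three reported metrics follow the standard confusion-matrix definitions, sketched below. The counts in the example are made up purely for illustration and are not taken from the study; the function name `classification_metrics` is hypothetical, and malignant is assumed to be the positive class.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp: malignant nodules classified as malignant
    fn: malignant nodules classified as benign
    tn: benign nodules classified as benign
    fp: benign nodules classified as malignant
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# Illustrative counts only (not data from the paper).
acc, sens, spec = classification_metrics(tp=45, fn=5, tn=40, fp=10)
print(f"accuracy={acc:.2%}, sensitivity={sens:.2%}, specificity={spec:.2%}")
```

A higher specificity corresponds to fewer false positives, which is the quantity the fusion step is meant to improve.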
The ensemble learning model removes false positives more effectively than either the YOLOv3 network or the CNN alone, and it identifies pulmonary nodules with higher accuracy than either of the two networks.