Paul Rahul, Hall Lawrence, Goldgof Dmitry, Schabath Matthew, Gillies Robert
Department of Computer Science and Engineering, University of South Florida, Tampa, Florida, USA.
Department of Cancer Epidemiology, H. L. Moffitt Cancer Center & Research Institute, Tampa, FL, USA.
Proc Int Jt Conf Neural Netw. 2018 Jul;2018. doi: 10.1109/IJCNN.2018.8489345. Epub 2018 Oct 15.
Lung cancer is the leading cause of cancer-related death worldwide, which makes early detection and diagnosis a high priority. Computed tomography (CT) is the method of choice for early detection and diagnosis of lung cancer. Radiomics features extracted from CT-detected lung nodules provide a good platform for early detection, diagnosis, and prognosis. In particular, when low-dose CT is used for lung cancer screening, effective use of radiomics can yield a precise, non-invasive approach to nodule tracking. With recent advances in deep learning, convolutional neural networks (CNNs) are also being used to analyze lung nodules. In this study, CNNs that we trained ourselves, a pre-trained CNN, and radiomics features were used for predictive analysis. Using subsets of participants from the National Lung Screening Trial, we investigated whether the prediction of nodule malignancy could be further improved by an ensemble of classifiers built on different feature sets and learning approaches. We extracted probability predictions from the different models on an unseen test set and combined them to generate better predictions. The ensembles increased both accuracy and area under the receiver operating characteristic curve (AUC). The best results obtained were an AUC of 0.96 and an accuracy of 89.45%, which are significant improvements over the previous best AUC of 0.87 and accuracy of 76.79%.
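The step of combining probability predictions from several models can be illustrated with a minimal sketch. The abstract does not say how the predictions were combined, so simple averaging (soft voting) is an assumption here, not the authors' confirmed method. The function names, the 0.5 decision threshold, and the toy probabilities are all hypothetical and are not data from the study.

```python
from typing import List, Sequence


def average_probabilities(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-nodule malignancy probabilities across models (soft voting).

    Assumption: unweighted averaging; the source does not specify the rule.
    """
    n_models = len(model_probs)
    n_cases = len(model_probs[0])
    if any(len(p) != n_cases for p in model_probs):
        raise ValueError("every model must score the same test cases")
    return [sum(p[i] for p in model_probs) / n_models for i in range(n_cases)]


def accuracy(labels: Sequence[int], probs: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly at the given threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(pred == y) for pred, y in zip(preds, labels)) / len(labels)


def auc(labels: Sequence[int], probs: Sequence[float]) -> float:
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (1 = malignant, 0 = benign).
labels = [1, 1, 1, 0, 0, 0]
cnn_probs = [0.9, 0.4, 0.8, 0.6, 0.2, 0.1]        # hypothetical CNN output
radiomics_probs = [0.6, 0.8, 0.3, 0.2, 0.5, 0.3]  # hypothetical radiomics classifier output

ensemble_probs = average_probabilities([cnn_probs, radiomics_probs])

for name, probs in [("cnn", cnn_probs),
                    ("radiomics", radiomics_probs),
                    ("ensemble", ensemble_probs)]:
    print(f"{name}: accuracy={accuracy(labels, probs):.2f}, auc={auc(labels, probs):.2f}")
```

On this invented data the two models make different mistakes, so the averaged scores rank the malignant cases above the benign ones more consistently than either model alone. That only illustrates why an ensemble can help; it says nothing about the results reported in the study.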