Gao Riqiang, Tang Yucheng, Khan Mirza S, Xu Kaiwen, Paulson Alexis B, Sullivan Shelbi, Huo Yuankai, Deppen Stephen, Massion Pierre P, Sandler Kim L, Landman Bennett A
Departments of Computer Science (R.G., K.X., Y.H., B.A.L.) and Electrical and Computer Engineering (Y.T., Y.H., B.A.L.), Vanderbilt University, 400 24th Ave S, Featheringill Hall, Room 371, Nashville, TN 37235; and Departments of Radiology and Radiological Sciences (A.B.P., K.L.S.), Thoracic Surgery (S.S., S.D.), General Internal Medicine and Public Health (M.S.K.), Biomedical Informatics (M.S.K.), and Medicine, Division of Allergy, Pulmonary and Critical Care Medicine (P.P.M.), Vanderbilt University Medical Center, Nashville, Tenn.
Radiol Artif Intell. 2021 Oct 13;3(6):e210032. doi: 10.1148/ryai.2021210032. eCollection 2021 Nov.
To develop a model that estimates lung cancer risk from lung cancer screening CT scans and clinical data elements (CDEs) without manual reading effort.
Two screening cohorts were retrospectively studied: the National Lung Screening Trial (NLST; participants enrolled between August 2002 and April 2004) and the Vanderbilt Lung Screening Program (VLSP; participants enrolled between 2015 and 2018). Fivefold cross-validation using the NLST dataset was used for initial development and assessment of the co-learning model using whole CT scans and CDEs. The VLSP dataset was used for external testing of the developed model. Area under the receiver operating characteristic curve (AUC) and area under the precision-recall curve were used to measure the performance of the model. The developed model was compared with published risk-prediction models that used only CDEs or imaging data alone. The Brock model was also included for comparison by imputing missing values for patients without a dominant pulmonary nodule.
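The two performance measures named above can be illustrated with a minimal sketch. The abstract does not say how the authors computed them, so the function names `roc_auc` and `average_precision` and the toy labels and scores below are hypothetical, and the code assumes binary labels (1 = cancer, 0 = no cancer) and no tied scores in the precision-recall calculation.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    This equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    with tied pairs counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Cases are ranked by descending score; precision is accumulated at
    each rank where a positive case is found, then averaged over all
    positive cases. Assumes no tied scores.
    """
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive case")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


# Toy data (illustrative only, not from either cohort).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))
print(average_precision(toy_labels, toy_scores))
```

In a fivefold cross-validation setting such as the one described, metrics like these would typically be computed on each held-out fold and then summarized; the abstract does not state which aggregation the authors used.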
A total of 23 505 patients from the NLST (mean age, 62 years ± 5 [standard deviation]; 13 838 men, 9667 women) and 147 patients from the VLSP (mean age, 65 years ± 5; 82 men, 65 women) were included. Using cross-validation on the NLST dataset, the AUC of the proposed co-learning model (AUC, 0.88) was higher than that of the published models that used CDEs only (AUC, 0.69; P < .05) and images only (AUC, 0.86; P < .05). Additionally, on the external VLSP test dataset, the co-learning model outperformed each of the published individual models (AUC, 0.91 [co-learning] vs 0.59 [CDE-only] and 0.88 [image-only]; P < .05 for both comparisons).
The proposed co-learning predictive model combining chest CT images and CDEs had a higher performance for lung cancer risk prediction than models that contained only CDE or only image data; the proposed model also had a higher performance than the Brock model.
Keywords: Computer-aided Diagnosis (CAD), CT, Lung, Thorax
© RSNA, 2021.