Vanderbilt University, Nashville, TN, 37235, USA.
Comput Biol Med. 2022 Nov;150:106113. doi: 10.1016/j.compbiomed.2022.106113. Epub 2022 Sep 29.
Patients with indeterminate pulmonary nodules (IPNs) with an intermediate to high probability of lung cancer generally undergo invasive diagnostic procedures. Chest computed tomography (CT) images and clinical data have been used to estimate the pretest probability of lung cancer. In this study, we apply a deep learning network to integrate multi-modal data from CT images and clinical data (including blood-based biomarkers) to improve lung cancer diagnosis. Our goal is to reduce uncertainty and to avoid the morbidity, mortality, overtreatment, and undertreatment of patients with IPNs.
We use a retrospective study design with cross-validation and external validation across four different sites. We introduce a deep learning framework with a two-path structure that learns from CT images and clinical data. The proposed model can learn from and predict with a single modality when the multi-modal data are incomplete. We use 1284 patients in the learning cohort for model development. Three external sites (with 155, 136, and 96 patients, respectively) provided patient data for external validation. We compare our model to widely applied clinical prediction models (the Mayo and Brock models) and image-only methods (e.g., the Liao et al. model).
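To make the two-path structure concrete, the following is a minimal sketch (not the authors' released code) of how such a co-learning network could be organized: an image path encodes the nodule-centered CT volume, a clinical path encodes the tabular factors, and per-path prediction heads allow a case that is missing one modality to still be scored. All layer sizes, names, and the PyTorch framing are illustrative assumptions.

import torch
import torch.nn as nn


class TwoPathCoLearner(nn.Module):
    def __init__(self, n_clinical: int = 10):
        super().__init__()
        # Image path: a small 3D CNN over the nodule-centered CT patch.
        self.image_encoder = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),  # -> (B, 32)
        )
        # Clinical path: an MLP over clinical factors / blood-based biomarkers.
        self.clinical_encoder = nn.Sequential(
            nn.Linear(n_clinical, 32), nn.ReLU(),
            nn.Linear(32, 32), nn.ReLU(),
        )
        # One head per modality plus a fused head, so a sample that lacks
        # either CT or clinical data can still be scored by the other path.
        self.image_head = nn.Linear(32, 1)
        self.clinical_head = nn.Linear(32, 1)
        self.fusion_head = nn.Linear(64, 1)

    def forward(self, ct=None, clinical=None):
        # Use the fused head when both modalities are present;
        # otherwise fall back to the single-modality head.
        if ct is not None and clinical is not None:
            z = torch.cat([self.image_encoder(ct),
                           self.clinical_encoder(clinical)], dim=1)
            return torch.sigmoid(self.fusion_head(z))
        if ct is not None:
            return torch.sigmoid(self.image_head(self.image_encoder(ct)))
        return torch.sigmoid(self.clinical_head(self.clinical_encoder(clinical)))


# Example: score a complete case and an image-only case.
model = TwoPathCoLearner(n_clinical=10)
ct_patch = torch.randn(2, 1, 32, 32, 32)    # batch of nodule-centered CT patches
clin = torch.randn(2, 10)                   # batch of clinical feature vectors
p_both = model(ct=ct_patch, clinical=clin)  # uses the fused head
p_image_only = model(ct=ct_patch)           # falls back to the image path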
Our co-learning model improves upon the performance of the clinical-factor-only (Mayo and Brock) and image-only (Liao et al.) models in both cross-validation on the learning cohort (AUC: 0.787 (ours) vs. 0.707-0.719 (baselines), reported on the validation folds) and external validation using three datasets from the University of Pittsburgh Medical Center (0.918 (ours) vs. 0.828-0.886 (baselines)), the Detection of Early Cancer Among Military Personnel study (0.712 (ours) vs. 0.576-0.709 (baselines)), and the University of Colorado Denver (0.847 (ours) vs. 0.679-0.746 (baselines)). In addition, our model achieves better reclassification performance (cNRI 0.04 to 0.20) than the Mayo model on all cross- and external-validation sets.
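For reference, the reported cNRI follows the usual net reclassification improvement construction, which credits cancer cases whose predicted risk moves up and benign cases whose predicted risk moves down under the new model relative to the Mayo model (the paper's exact continuous-NRI convention, e.g., any thresholds used, is not restated here):

\[
\mathrm{cNRI} = \bigl[P(\text{risk up} \mid \text{cancer}) - P(\text{risk down} \mid \text{cancer})\bigr] + \bigl[P(\text{risk down} \mid \text{benign}) - P(\text{risk up} \mid \text{benign})\bigr]
\]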
Lung cancer risk estimation in patients with IPNs can benefit from co-learning of CT images and clinical data. Learning from more subjects, even those with only a single modality available, can improve prediction accuracy. An integrated deep learning model can achieve reasonable discrimination and reclassification performance.