Department of Interventional Radiology, Fudan University Shanghai Cancer Center, Xuhui District, Shanghai, China.
Department of Oncology, Shanghai Medical College, Fudan University, Xuhui District, Shanghai, China.
Med Phys. 2022 Oct;49(10):6384-6394. doi: 10.1002/mp.15903. Epub 2022 Aug 15.
To develop a novel multimodal data fusion model, based on deep learning, that incorporates computed tomography (CT) images and clinical variables for predicting the invasiveness risk of stage I lung adenocarcinoma manifesting as ground-glass nodules (GGNs), and to compare its diagnostic performance with that of radiologists.
A total of 1946 patients with solitary, histopathologically confirmed GGNs with a maximum diameter of less than 3 cm were retrospectively enrolled. The training dataset, containing 1704 GGNs, was augmented by resampling, scaling, random cropping, and other transformations to generate new training data. A multimodal data fusion model was built on a residual learning architecture and two multilayer perceptrons with attention mechanisms, combining CT images with patient general data and serum tumor markers. Distance-based confidence scores (DCSs) were calculated and compared among multimodal data models with different input combinations. An observer study was conducted, and the prediction performance of the fusion algorithms was compared with that of two radiologists on an independent testing dataset of 242 GGNs.
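For illustration only, the following is a minimal PyTorch sketch of the kind of fusion architecture described above, not the authors' implementation: it assumes a ResNet-18 image branch, two MLP branches with a sigmoid feature gate standing in for the attention mechanism, late fusion by concatenation, and invented input dimensions.

```python
# Hypothetical sketch of a multimodal fusion model: a residual-learning image
# branch plus two attention-gated MLP branches, fused for binary IA/non-IA
# classification. All dimensions and the gating scheme are assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class AttentionMLP(nn.Module):
    """MLP branch with a simple attention (gating) mechanism over its features."""
    def __init__(self, in_dim, hidden_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Sigmoid gate producing per-feature attention weights.
        self.attn = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.Sigmoid())

    def forward(self, x):
        h = self.mlp(x)
        return h * self.attn(h)  # element-wise attention-weighted features

class FusionModel(nn.Module):
    def __init__(self, n_general=8, n_markers=6):
        super().__init__()
        cnn = resnet18(weights=None)   # residual-learning image branch
        cnn.fc = nn.Identity()         # expose the 512-d image features
        self.image_branch = cnn
        self.general_branch = AttentionMLP(n_general)   # patient general data
        self.marker_branch = AttentionMLP(n_markers)    # serum tumor markers
        self.classifier = nn.Linear(512 + 64 + 64, 2)   # IA vs. non-IA

    def forward(self, image, general, markers):
        feats = torch.cat([
            self.image_branch(image),
            self.general_branch(general),
            self.marker_branch(markers),
        ], dim=1)                      # late fusion by concatenation
        return self.classifier(feats)

model = FusionModel()
logits = model(torch.randn(4, 3, 224, 224),   # CT slices (channel-replicated)
               torch.randn(4, 8),             # patient general data
               torch.randn(4, 6))             # serum tumor markers
print(logits.shape)  # torch.Size([4, 2])
```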
Of the 1946 GGNs, 606 were confirmed as invasive adenocarcinoma (IA) and 1340 as non-IA. The proposed multimodal data fusion model combining CT images, patient general data, and serum tumor markers achieved the highest accuracy (88.5%), area under the ROC curve (0.957), F1 (81.5%), F2 (81.9%), and Matthews correlation coefficient (73.2%) for classifying IA versus non-IA GGNs, exceeding even the senior radiologist's performance (accuracy, 86.1%). In addition, the DCSs for the multimodal data suggested that the CT images had a quantitatively stronger influence (0.9540) than the general data (0.6726) or tumor markers (0.6971).
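For reference, a small illustration (not from the paper) of how these metric types can be computed with scikit-learn on made-up predictions; fbeta_score with beta=2 gives the F2 score.

```python
# Illustrative computation of the reported metric types on toy predictions;
# the arrays below are invented and do not reproduce the paper's results.
import numpy as np
from sklearn.metrics import (accuracy_score, roc_auc_score, f1_score,
                             fbeta_score, matthews_corrcoef)

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])   # 1 = IA, 0 = non-IA
y_prob = np.array([0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3])
y_pred = (y_prob >= 0.5).astype(int)          # threshold the model outputs

print("accuracy:", accuracy_score(y_true, y_pred))
print("AUC     :", roc_auc_score(y_true, y_prob))
print("F1      :", f1_score(y_true, y_pred))
print("F2      :", fbeta_score(y_true, y_pred, beta=2))
print("MCC     :", matthews_corrcoef(y_true, y_pred))
```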
This study demonstrated the feasibility of integrating different types of data, including CT images and clinical variables, and the multimodal data fusion model yielded higher performance for distinguishing IA from non-IA GGNs.