Rezvanjou Sara, Moslemi Amir, Peterson Samuel, Tan Wan-Cheng, Hogg James C, Bourbeau Jean, Reinhardt Joseph M, Kirby Miranda
Toronto Metropolitan University, Department of Physics, Toronto, Ontario, Canada.
Sunnybrook Research Institute, Toronto, Ontario, Canada.
J Med Imaging (Bellingham). 2025 May;12(3):034502. doi: 10.1117/1.JMI.12.3.034502. Epub 2025 May 22.
Convolutional neural network (CNN)-based models using computed tomography images can classify chronic obstructive pulmonary disease (COPD) with high performance, but many input image types have been investigated and it is unclear which are optimal. We propose a 2D airway-optimized topological multiplanar reformat (tMPR) input image and compare its performance with established 2D/3D input image types for COPD classification. As a secondary aim, we examined whether training on a dataset of predominantly mild COPD cases and testing on a more severe dataset improves generalizability.
CanCOLD study participants were used for training/internal testing; SPIROMICS participants were used for external testing. Several 2D/3D input image types were adapted from the literature. For the proposed models, we investigated 2D airway-optimized tMPR images (to convey shape and interior/contextual information) and 3D output fusion of axial/sagittal/coronal images. The area under the receiver operating characteristic curve (AUC) was used to evaluate model performance, and Brier scores were used to evaluate model calibration. To further examine how training dataset severity affects generalization, we compared models trained on the milder CanCOLD dataset and tested on the more severe SPIROMICS dataset with models trained on SPIROMICS and tested on CanCOLD.
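The evaluation described above can be illustrated with a minimal sketch. It assumes that output fusion means averaging the per-view predicted probabilities, which the abstract does not specify, and the function names (`fuse_outputs`, `auc`, `brier_score`) and toy values are hypothetical, not study data. AUC is computed as the fraction of positive/negative pairs ranked correctly, and the Brier score as the mean squared difference between predicted probability and label.

```python
from typing import List, Sequence


def fuse_outputs(view_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average predicted COPD probabilities across views.

    Assumption: mean-probability fusion; the source does not state the rule.
    """
    n_views = len(view_probs)
    return [sum(per_subject) / n_views for per_subject in zip(*view_probs)]


def auc(labels: Sequence[int], probs: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a correct ranking.
    """
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in pos
        for b in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared error between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Toy example (made-up numbers): 1 = COPD, 0 = no COPD.
labels = [1, 0, 1, 0]
axial = [0.9, 0.2, 0.6, 0.4]
sagittal = [0.8, 0.3, 0.4, 0.5]
coronal = [0.7, 0.1, 0.5, 0.3]

fused = fuse_outputs([axial, sagittal, coronal])
print("AUC:", auc(labels, fused))
print("Brier score:", brier_score(labels, fused))
```

A lower Brier score indicates better-calibrated probabilities, while AUC reflects only how well cases are ranked; this is why the two metrics are reported side by side.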
A total of CanCOLD participants were used for training/validation and for testing; SPIROMICS participants were used for external testing. For both the CanCOLD and SPIROMICS test sets, the proposed 2D tMPR on its own (CanCOLD: ; SPIROMICS: ) and combined with the 3D axial/coronal/sagittal lung view (CanCOLD: ; SPIROMICS: ) had the highest performance. The combined 2D tMPR and 3D axial/coronal/sagittal lung view had the lowest Brier score (CanCOLD: score = 0.16; SPIROMICS: score = 0.24). Conversely, training and internally testing on SPIROMICS and externally testing on CanCOLD resulted in lower performance on CanCOLD for 2D tMPR on its own (SPIROMICS: AUC = 0.92; CanCOLD: AUC = 0.74) and when combined with the 3D axial/coronal/sagittal lung view (SPIROMICS: ; CanCOLD: ).
The CNN-based model that took the combined 2D tMPR images and 3D lung view as input had the highest performance for COPD classification. This highlights the importance of airway information and shows that fusing different types of information in the input images can improve CNN-based model performance. In addition, models trained on CanCOLD generalized well to the more severe SPIROMICS cohort, whereas models trained on SPIROMICS performed worse when tested on CanCOLD. These findings suggest that training on milder COPD cases may improve classification performance across the disease spectrum.