Shih Yu-Cheng, Ko Chi-Lun, Wang Shan-Ying, Chang Chen-Yu, Lin Shau-Syuan, Huang Cheng-Wen, Cheng Mei-Fang, Chen Chung-Ming, Wu Yen-Wen
Department of Nuclear Medicine, Far Eastern Memorial Hospital, New Taipei City, Taiwan.
Department of Biomedical Engineering, National Taiwan University, Taipei, Taiwan.
Eur J Nucl Med Mol Imaging. 2025 Apr 8. doi: 10.1007/s00259-025-07243-w.
Deep learning (DL) models that predict obstructive coronary artery disease (CAD) from myocardial perfusion imaging (MPI) have shown potential for enhancing diagnostic accuracy. However, whether they maintain consistent performance across institutions and demographic groups remains uncertain. This study aimed to investigate the generalizability and potential biases of an in-house MPI DL model across two hospital-based cohorts.
We retrospectively included patients from two medical centers in Taiwan who underwent stress/redistribution thallium-201 MPI followed by invasive coronary angiography within 90 days as the reference standard. A polar map-free 3D DL model, trained on 928 MPI images from one center to predict obstructive CAD, was tested on an internal validation set (933 images) and an external validation set (3234 images from the other center). Diagnostic performance, assessed using the area under the receiver operating characteristic curve (AUC), was compared between the internal and external cohorts and across demographic groups, and against that of stress total perfusion deficit (TPD).
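As an illustration of the kind of comparison described above, the following minimal sketch computes an AUC for each of two independent cohorts and a bootstrap interval for their difference. The abstract does not state which statistical test the authors used, so the bootstrap here is an assumption, and the function names, the `(label, score)` data layout, and the toy values are all hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(cohort_a, cohort_b, n_boot=2000, seed=0):
    """Observed AUC(a) - AUC(b) and a percentile 95% interval.

    Each cohort is a list of (label, score) pairs. The cohorts are treated
    as independent, so each one is resampled separately (an assumption).
    """
    rng = random.Random(seed)
    observed = auc(*zip(*cohort_a)) - auc(*zip(*cohort_b))
    diffs = []
    while len(diffs) < n_boot:
        sample_a = rng.choices(cohort_a, k=len(cohort_a))
        sample_b = rng.choices(cohort_b, k=len(cohort_b))
        try:
            diffs.append(auc(*zip(*sample_a)) - auc(*zip(*sample_b)))
        except ValueError:
            continue  # resample contained a single class; draw again
    diffs.sort()
    lower = diffs[round(0.025 * (n_boot - 1))]
    upper = diffs[round(0.975 * (n_boot - 1))]
    return observed, lower, upper


# Hypothetical toy data: (obstructive CAD label, model score) per patient.
internal = [(1, 0.9), (1, 0.8), (0, 0.2), (0, 0.1)]
external = [(1, 0.9), (0, 0.8), (1, 0.3), (0, 0.1)]
observed, lower, upper = bootstrap_auc_difference(internal, external, n_boot=200)
```

The pairwise AUC is quadratic in cohort size, which is acceptable for a sketch; for cohorts of thousands of images a rank-based implementation from a statistics library would be the more practical choice.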
The model performed significantly worse in the external cohort than in the internal cohort in both patient-based (AUC: 0.713 vs. 0.813) and vessel-based (AUC: 0.733 vs. 0.782) analyses, but still outperformed stress TPD (all p < 0.001). Performance was also lower in patients who underwent treadmill stress MPI in the internal cohort and in patients over 70 years old in the external cohort.
This study demonstrated adequate performance but also limitations in the generalizability of the DL-based MPI model, along with biases related to stress type and patient age. Thorough validation is essential before the clinical implementation of DL MPI models.