Xu Cheng, Guo Heng, Xu Minfeng, Duan Miao, Wang Ming, Liu Peijun, Luo Xinyi, Jin Zhengyu, Liu Hui, Wang Yining
Department of Radiology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
Alibaba Group, Hangzhou, China.
Quant Imaging Med Surg. 2022 May;12(5):2684-2695. doi: 10.21037/qims-21-1017.
The aim of this study was to investigate the reliability and accuracy of automatic coronary artery calcium (CAC) scoring and risk classification in non-gated, non-contrast chest computed tomography (CT) of different slice thicknesses using a deep learning algorithm.
This retrospective study was performed at 2 tertiary hospitals. Paired scans from the same patient, consisting of a dedicated calcium-scoring CT scan and a non-gated, non-contrast chest CT scan acquired within 1 month of each other, were included. Chest CT images were grouped according to slice thickness (group A: 1 mm; group B: 3 mm). For internal scans, the CAC score manually measured on dedicated calcium-scoring CT images was used as the gold standard. The deep learning algorithm for group A was trained on 150 chest CT scans and tested on 144 scans; the algorithm for group B was trained on 170 chest CT scans and tested on 144 scans. The intraclass correlation coefficient (ICC) was used to evaluate the correlation between the algorithm and the gold standard. Agreement among the deep learning algorithm, the manual results on chest CT, and the gold standard was assessed by Bland-Altman analysis. Cardiac risk categories were compared. External validation was performed on 334 paired scans from a different organization.
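As an illustration of the Bland-Altman analysis named above, the following minimal Python sketch computes the mean difference (bias) and the 95% limits of agreement for paired CAC scores. It is not the authors' code: the function name `bland_altman` and the example scores are hypothetical, and the conventional bias ± 1.96 SD limits are assumed because the abstract does not state how the limits were derived.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Assumes the conventional limits of agreement: bias +/- 1.96 * SD of the
    paired differences. The abstract does not specify the exact procedure.
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired inputs must have equal length")
    if len(method_a) < 2:
        raise ValueError("at least two pairs are needed")
    # Difference per patient: algorithm score minus reference score.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical scores, for illustration only (not data from the study).
algorithm_scores = [10, 20, 30, 40]
gold_standard_scores = [12, 18, 33, 37]
bias, lower, upper = bland_altman(algorithm_scores, gold_standard_scores)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias close to zero with narrow limits would indicate that the two methods can be used interchangeably for most patients.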
A total of 608 internal paired scans (1 mm: 294; 3 mm: 314) of 406 individuals and 334 external paired scans (1 mm: 117; 3 mm: 117) of 117 individuals were included in the analysis. The ICCs between the deep learning algorithm and the gold standard were excellent in both group A (0.90; 95% CI: 0.85-0.93) and group B (0.94; 95% CI: 0.92-0.96). The Bland-Altman plots showed good agreement in both groups. For the cardiovascular risk category, the deep learning algorithm correctly classified 71% of cases in group A and 81% of cases in group B. The kappa values for risk classification were 0.72 in group A and 0.82 in group B. External validation yielded equally good results.
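The risk-classification comparison reported above can be sketched in Python as follows. This is an illustrative example, not the study's method: the cut-offs in `risk_category` are commonly used Agatston bands (0, 1-100, 101-400, >400) and are an assumption, since the abstract does not list its categories; `cohen_kappa` computes the unweighted kappa, and the abstract does not say whether a weighted variant was used. All names and scores are hypothetical.

```python
from collections import Counter


def risk_category(score):
    """Map a CAC score to a risk band (assumed cut-offs, see lead-in)."""
    if score <= 0:
        return 0  # no detectable calcium
    if score <= 100:
        return 1
    if score <= 400:
        return 2
    return 3


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed agreement: fraction of cases with identical categories.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from the marginal category frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical scores, for illustration only (not data from the study).
gold = [risk_category(s) for s in [0, 50, 150, 500]]
algo = [risk_category(s) for s in [0, 120, 150, 500]]
accuracy = sum(g == a for g, a in zip(gold, algo)) / len(gold)
print(f"accuracy={accuracy:.2f}, kappa={cohen_kappa(gold, algo):.2f}")
```

Reporting kappa alongside the percentage of correctly classified cases corrects for agreement that would be expected by chance alone.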
Automatic calculation of the CAC score and cardiovascular risk stratification on non-gated chest CT using a deep learning algorithm was reliable and accurate on both 1 mm and 3 mm scans. Chest CT with a slice thickness of 3 mm was slightly more accurate in CAC detection and risk classification.