Yun Seoyeong, Choi Jooyoung
Ewha Womans University College of Medicine, Seoul, Korea.
Ewha Med J. 2025 Apr;48(2):e33. doi: 10.12771/emj.2025.00087. Epub 2025 Apr 21.
This study compares 3 deep learning models (UNet, TransUNet, and MIST) for left atrium (LA) segmentation in cardiac computed tomography (CT) images from patients with congenital heart disease (CHD). It also investigates how architectural variations in the MIST model, such as spatial squeeze-and-excitation attention, affect the Dice score and the 95th percentile Hausdorff distance (HD95).
We analyzed 108 publicly available, de-identified CT volumes from the ImageCHD dataset. Volumes underwent resampling, intensity normalization, and data augmentation. Eleven cases were reserved for testing; of the remaining 97 cases, 80% were used to train the UNet, TransUNet, and MIST models and 20% were used for validation. Performance was evaluated using the Dice score (measuring overlap accuracy) and HD95 (reflecting boundary accuracy). Statistical comparisons were performed using one-way repeated measures analysis of variance.
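The two evaluation metrics can be illustrated with a minimal sketch. This is not the study's evaluation code: it assumes each segmentation is given as a set of foreground voxel coordinates, it measures distances between all foreground voxels rather than extracted surface voxels, and it takes HD95 as the larger of the two directed 95th percentile distances (other definitions pool the distances first). The function names and the `spacing` parameter are hypothetical.

```python
import math


def dice_score(pred, truth):
    """Dice overlap between two sets of foreground voxel coordinates."""
    if not pred and not truth:
        return 1.0  # both empty: treated as perfect agreement (a convention)
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def percentile(values, q):
    """q-th percentile (0 to 100) with linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def directed_distances(src, dst, spacing):
    """Distance in mm from each voxel in src to its nearest voxel in dst."""
    def dist(a, b):
        return math.sqrt(sum(((x - y) * s) ** 2 for x, y, s in zip(a, b, spacing)))
    # Brute force: fine for a toy example, too slow for full CT volumes.
    return [min(dist(a, b) for b in dst) for a in src]


def hd95(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Symmetric 95th percentile Hausdorff distance (simplified, see lead-in)."""
    if not pred or not truth:
        return math.inf  # undefined when one mask is empty; inf is a convention
    forward = percentile(directed_distances(pred, truth, spacing), 95)
    backward = percentile(directed_distances(truth, pred, spacing), 95)
    return max(forward, backward)


# Toy example: a 2x2x1 block and the same block shifted by one voxel in x.
truth = {(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)}
pred = {(1, 0, 0), (1, 1, 0), (2, 0, 0), (2, 1, 0)}
print(dice_score(pred, truth))             # half of the voxels overlap
print(hd95(pred, truth))                   # isotropic 1 mm voxels
print(hd95(pred, truth, (2.0, 1.0, 1.0)))  # 2 mm spacing along x
```

A higher Dice score means better overlap, whereas a lower HD95 means the predicted boundary lies closer to the reference boundary, which is why the two metrics are reported together.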
MIST achieved the highest mean Dice score (0.74; 95% confidence interval, 0.67-0.81), significantly outperforming TransUNet (0.53; P<0.001) and UNet (0.49; P<0.001). For HD95, both MIST (5.77 mm) and TransUNet (9.09 mm) outperformed UNet (27.49 mm; P<0.0001). In ablation experiments, adding spatial attention did not further improve the MIST model's performance, suggesting redundancy with its existing attention mechanisms. However, the integration of multi-scale features and refined skip connections consistently improved segmentation accuracy and boundary delineation.
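The one-way repeated measures analysis of variance named in the methods can be sketched as follows. This is an illustrative implementation of the textbook F statistic, not the authors' analysis: the function name `rm_anova_f` is hypothetical, the scores below are made-up toy values rather than study data, and the sketch assumes each row holds one case's metric under every model.

```python
def rm_anova_f(scores):
    """One-way repeated measures ANOVA.

    scores[i][j] is the metric for case i under model j.
    Returns (F statistic, model degrees of freedom, error degrees of freedom).
    """
    n = len(scores)     # number of cases (repeated subjects)
    k = len(scores[0])  # number of models (conditions)
    grand = sum(sum(row) for row in scores) / (n * k)
    model_means = [sum(row[j] for row in scores) / n for j in range(k)]
    case_means = [sum(row) / k for row in scores]

    # Partition the total variation into model, case, and residual parts.
    ss_model = n * sum((m - grand) ** 2 for m in model_means)
    ss_case = k * sum((c - grand) ** 2 for c in case_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_model - ss_case

    df_model, df_error = k - 1, (k - 1) * (n - 1)
    f_stat = (ss_model / df_model) / (ss_error / df_error)
    return f_stat, df_model, df_error


# Hypothetical Dice scores for 3 cases under 3 models (not study data).
toy_scores = [
    [0.1, 0.2, 0.4],
    [0.2, 0.3, 0.6],
    [0.3, 0.4, 0.5],
]
f_stat, df_model, df_error = rm_anova_f(toy_scores)
print(f_stat, df_model, df_error)
```

Because every model is evaluated on the same cases, the between-case variation is removed from the error term, which is what distinguishes this test from an ordinary one-way ANOVA. A P value would follow from the upper tail of the F distribution with these degrees of freedom, for example via `scipy.stats.f.sf`.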
MIST demonstrated superior LA segmentation, highlighting the benefits of its integrated multi-scale features and optimized architecture. Nevertheless, its computational overhead complicates practical clinical deployment. Our findings underscore the value of advanced hybrid models in cardiac imaging, providing improved reliability for CHD evaluation. Future studies should balance segmentation accuracy with feasible clinical implementation.