Neurosurgery Department, West China Hospital, Sichuan University, Chengdu, China.
Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China.
J Med Internet Res. 2023 Dec 15;25:e44119. doi: 10.2196/44119.
Convolutional neural networks (CNNs) have produced state-of-the-art results in meningioma segmentation on magnetic resonance imaging (MRI). However, images obtained from different institutions, protocols, or scanners may show significant domain shift, leading to performance degradation and challenging model deployment in real clinical scenarios.
This study aims to investigate the real-world performance of a well-trained meningioma segmentation model when deployed across different health care centers and to evaluate methods for improving its generalization.
This study was performed in four centers. A total of 606 patients with 606 MRIs were enrolled between January 2015 and December 2021. Manual segmentations, determined through consensus readings by neuroradiologists, were used as the ground truth mask. The model was previously trained using a standard supervised CNN called Deeplab V3+ and was deployed and tested separately in four health care centers. To determine the appropriate approach to mitigating the observed performance degradation, two methods were used: unsupervised domain adaptation and supervised retraining.
The trained model showed state-of-the-art performance in tumor segmentation in two health care institutions, with a Dice ratio of 0.887 (SD 0.108, 95% CI 0.903-0.925) in center A and a Dice ratio of 0.874 (SD 0.800, 95% CI 0.854-0.894) in center B. In the other two institutions, where MRIs were acquired using different scanning protocols, performance declined, with Dice ratios of 0.631 (SD 0.157, 95% CI 0.556-0.707) in center C and 0.649 (SD 0.187, 95% CI 0.566-0.732) in center D. Unsupervised domain adaptation significantly improved performance, with Dice ratios of 0.842 (SD 0.073, 95% CI 0.820-0.864) in center C and 0.855 (SD 0.097, 95% CI 0.826-0.886) in center D. Nonetheless, it did not outperform supervised retraining, which achieved Dice ratios of 0.899 (SD 0.026, 95% CI 0.889-0.906) in center C and 0.886 (SD 0.046, 95% CI 0.870-0.903) in center D.
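For readers unfamiliar with the metric, the Dice ratio measures the overlap between a predicted mask and the ground truth mask as twice the size of their intersection divided by the sum of their sizes. The following is a minimal sketch of how a per-center mean, SD, and 95% CI could be obtained from per-patient Dice ratios. The function names `dice_ratio` and `summarize` are hypothetical, and the normal-approximation interval is an assumption, since the abstract does not state how the intervals were computed.

```python
import numpy as np


def dice_ratio(pred, truth):
    """Dice ratio 2*|P and T| / (|P| + |T|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return float(2.0 * np.logical_and(pred, truth).sum() / total)


def summarize(scores, z=1.96):
    """Mean, sample SD, and a normal-approximation 95% CI of the mean.

    Assumption: the CI is mean +/- z * SD / sqrt(n); the study may have
    used a different procedure.
    """
    scores = np.asarray(scores, dtype=float)
    mean = float(scores.mean())
    sd = float(scores.std(ddof=1))  # sample standard deviation
    half_width = z * sd / np.sqrt(scores.size)
    return mean, sd, (mean - half_width, mean + half_width)


# Toy example with made-up 2x2 masks (not study data).
pred = [[1, 1], [0, 0]]
truth = [[1, 0], [0, 0]]
score = dice_ratio(pred, truth)  # intersection 1, sizes 2 and 1

# Made-up per-patient scores, only to show the call.
mean, sd, (ci_low, ci_high) = summarize([0.8, 0.9, 1.0])
```

An interval built this way is symmetric about the mean, and because the Dice ratio is bounded between 0 and 1, the normal approximation is only a rough guide for small samples or scores near the bounds.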
A trained CNN model deployed in different health care institutions may show significant performance degradation due to domain shift in MRIs. In this situation, unsupervised domain adaptation or supervised retraining should be considered, balancing clinical requirements, model performance, and the size of the available data.