Cao Haoyin, Morotti Andrea, Mazzacane Federico, Desser Dmitriy, Schlunk Frieder, Güttler Christopher, Kniep Helge, Penzkofer Tobias, Fiehler Jens, Hanning Uta, Dell'Orco Andrea, Nawabi Jawed
Department of Radiology, Charité-Universitätsmedizin Berlin, Freie Universität Berlin, Humboldt-Universität zu Berlin, Charitéplatz 1, 10117 Berlin, Germany.
Neurology Unit, Department of Neurological Sciences and Vision, ASST-Spedali Civili, 25123 Brescia, Italy.
J Clin Med. 2023 Jun 12;12(12):4005. doi: 10.3390/jcm12124005.
The objective of this study was to assess the performance of the first publicly available automated 3D segmentation for spontaneous intracerebral hemorrhage (ICH) based on a 3D neural network before and after retraining.
We performed an independent validation of this model using a multicenter retrospective cohort. Performance was evaluated using the Dice score (DSC), sensitivity, and positive predictive value (PPV). We retrained the original model (OM) and assessed its performance via an external validation design. A multivariate linear regression model was used to identify independent variables associated with the model's performance. Agreement in volumetric measurements and in segmentation was evaluated using Pearson's correlation coefficients (r) and intraclass correlation coefficients (ICC), respectively. In 1040 patients, the OM had a median DSC, sensitivity, and PPV of 0.84, 0.79, and 0.93, compared to those of 0.83, 0.80, and 0.91 in the retrained model (RM). However, the median DSC for infratentorial ICH was relatively low and improved significantly after retraining (p < 0.001). ICH volume and location were significantly associated with the DSC (p < 0.05). The agreement between volumetric measurements (r > 0.90, p > 0.05) and segmentations (ICC ≥ 0.9, p < 0.001) was excellent.
The model demonstrated good generalization in an external validation cohort. Location-specific variances improved significantly after retraining. External validation and retraining are important steps to consider before applying deep learning models in new clinical settings.
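The three overlap metrics reported above (DSC, sensitivity, PPV) can be illustrated with a minimal sketch. This is not the study's evaluation code: the function name `overlap_metrics`, the flattened 0/1 mask representation, and the convention of returning 1.0 when a denominator is zero are all assumptions made here for illustration.

```python
def overlap_metrics(pred, truth):
    """Return (dice, sensitivity, ppv) for two equal-length binary masks.

    `pred` and `truth` are flat sequences of 0/1 voxel labels
    (hypothetical representation; a real 3D mask would be flattened first).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")

    # Voxel-wise confusion counts.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)        # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)    # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)    # false negatives

    # Assumption: an empty denominator (nothing to find or nothing predicted)
    # is scored as 1.0; other conventions exist.
    dice = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 1.0
    ppv = tp / (tp + fp) if (tp + fp) else 1.0
    return dice, sensitivity, ppv


# Toy example: the prediction covers only half of the true hemorrhage
# but contains no false positives (under-segmentation).
truth = [1, 1, 1, 1, 0, 0]
pred = [1, 1, 0, 0, 0, 0]
dice, sensitivity, ppv = overlap_metrics(pred, truth)
print(round(dice, 3), sensitivity, ppv)
```

In this toy case the PPV is perfect while sensitivity is 0.5, which is the same direction of imbalance (PPV above sensitivity) seen in the reported medians, although the toy numbers themselves are unrelated to the study data.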