Zou Kelly H, Warfield Simon K, Bharatha Aditya, Tempany Clare M C, Kaus Michael R, Haker Steven J, Wells William M, Jolesz Ferenc A, Kikinis Ron
Department of Radiology and Surgical Planning Laboratory, Brigham and Women's Hospital, Harvard Medical School, 75 Francis St (Floor L-l), Boston, MA 02115, USA.
Acad Radiol. 2004 Feb;11(2):178-89. doi: 10.1016/s1076-6332(03)00671-8.
To examine a statistical validation method based on the spatial overlap between two sets of segmentations of the same anatomy.
The Dice similarity coefficient (DSC) was used as a statistical validation metric to evaluate both the reproducibility of manual segmentations and the spatial overlap accuracy of automated probabilistic fractional segmentation of MR images, illustrated on two clinical examples. Example 1: 10 consecutive prostate brachytherapy patients underwent both preoperative 1.5T and intraoperative 0.5T MR imaging. For each case, 5 repeated manual segmentations of the prostate peripheral zone were performed separately on the preoperative and intraoperative images. Example 2: A semi-automated probabilistic fractional segmentation algorithm was applied to MR images of 9 cases with 3 types of brain tumors. DSC values were computed, and the means of the logit-transformed values were compared using analysis of variance (ANOVA).
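The two quantities named above, the DSC and its logit transform, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes each segmentation is already a binary mask flattened to a sequence of 0/1 voxel labels, and the function names `dice` and `logit` and the toy masks are hypothetical. The DSC is computed as twice the overlap divided by the sum of the two mask sizes, and the logit is ln(p / (1 - p)).

```python
import math


def dice(a, b):
    """Dice similarity coefficient of two binary masks (equal-length 0/1 sequences)."""
    if len(a) != len(b):
        raise ValueError("masks must have the same number of voxels")
    size_a = sum(1 for x in a if x)
    size_b = sum(1 for x in b if x)
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    if size_a + size_b == 0:
        # Assumed convention for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)


def logit(p):
    """logit(p) = ln(p / (1 - p)); defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is undefined at 0 and 1")
    return math.log(p / (1.0 - p))


# Toy masks (illustrative values only, not data from the study).
mask_a = [0, 1, 1, 1, 0, 0, 1, 0]
mask_b = [0, 1, 1, 0, 0, 1, 1, 0]

dsc = dice(mask_a, mask_b)   # 2 * 3 / (4 + 4) = 0.75
print(dsc, logit(dsc))
```

The logit maps DSC values from the bounded interval (0, 1) onto the whole real line, which is a common reason to apply it before comparing means with ANOVA. A DSC of exactly 0 or 1 has no finite logit, so the sketch raises an error rather than choosing a clamping rule.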
Example 1: The mean DSC was 0.883 (range, 0.876-0.893) with 1.5T preoperative MRI and 0.838 (range, 0.819-0.852) with 0.5T intraoperative MRI (P < .001); these values were within and at the margin of the range of good reproducibility, respectively. Example 2: Wide ranges of DSC were observed in brain tumor segmentations: meningiomas (0.519-0.893), astrocytomas (0.487-0.972), and other mixed gliomas (0.490-0.899).
The DSC value is a simple and useful summary measure of spatial overlap, which can be applied to studies of reproducibility and accuracy in image segmentation. We observed generally satisfactory but variable validation results in two clinical applications. This metric may be adapted for similar validation tasks.