Rainio Oona, Klén Riku
Turku PET Centre, University of Turku and Turku University Hospital, Turku, Finland.
J Imaging Inform Med. 2025 May 8. doi: 10.1007/s10278-025-01535-1.
The Sørensen-Dice similarity coefficient (DSC) is the most common evaluation metric for image segmentation, but it is not always ideal. Specifically, the DSC values depend only on the number of misplaced elements, not on their location with respect to the correct segments. Because of this, the DSC is ill-suited for tasks where the correct location of an object's borders is difficult to define objectively, as is the case in tumor segmentation in positron emission tomography (PET) images. To avoid this issue, we introduce two modifications of the DSC, one with weights and one with an additional loss term, both of which also evaluate the distance between the real and the predicted segments. We computed the values of the DSC and our two new coefficients from 191 predicted tumor segmentation masks created from PET images of 89 head and neck squamous cell carcinoma patients. We compared the values of all three coefficients with the scores given to these masks by human evaluators. According to our results, the weighted modification of the DSC had a higher correlation with the scores given by the human evaluators than the original DSC, and it also produced significantly less variation within the two highest score classes (p-value 0.018). The new weighted coefficient introduced here has considerable potential in the evaluation of segmentation results from medical imaging.
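The location-insensitivity of the DSC described in the abstract can be illustrated with a minimal sketch. It assumes binary masks stored as NumPy arrays; the function name `dice` and the toy 8 x 8 masks are hypothetical and chosen only for illustration. The sketch covers the standard DSC only; the weighted and loss-term modifications are not defined in the abstract and are not reproduced here.

```python
import numpy as np


def dice(truth, pred):
    """Standard Sørensen-Dice coefficient of two binary masks."""
    truth = np.asarray(truth, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    total = truth.sum() + pred.sum()
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * np.logical_and(truth, pred).sum() / total


# Toy ground truth: a 3 x 3 square (9 pixels) in an 8 x 8 image.
truth = np.zeros((8, 8), dtype=bool)
truth[2:5, 2:5] = True

# Prediction A: one false-positive pixel directly next to the border.
near = truth.copy()
near[2, 5] = True

# Prediction B: one false-positive pixel in a distant corner.
far = truth.copy()
far[7, 7] = True

dsc_near = dice(truth, near)
dsc_far = dice(truth, far)

# Both predictions misplace exactly one pixel, so the DSC is identical
# even though the error in B lies much farther from the true segment.
print(dsc_near, dsc_far)
```

Both predictions contain the same number of misplaced pixels and therefore receive the same DSC, although a human evaluator would likely judge the distant false positive as the worse error. A distance-aware coefficient is intended to separate such cases.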