CLAIM - Charité Lab for Artificial Intelligence in Medicine, Charité Universitätsmedizin Berlin, Berlin, Germany.
Research Studio Data Science, Research Studios Austria, Salzburg, Austria.
Eur Radiol Exp. 2021 Jan 21;5(1):4. doi: 10.1186/s41747-020-00200-2.
Average Hausdorff distance is a widely used performance measure for calculating the distance between two point sets. In medical image segmentation, it is used to compare segmentations against ground truth images so that the segmentations can be ranked. However, we identified ranking errors in average Hausdorff distance that make it less suitable for assessing segmentation performance. To mitigate these errors, we present a modified calculation of this performance measure, which we have named "balanced average Hausdorff distance". To simulate segmentations for ranking, we manually created non-overlapping segmentation errors common in magnetic resonance angiography cerebral vessel segmentation, our use case. By adding the created errors consecutively and randomly to the ground truth, we created sets of simulated segmentations with an increasing number of errors. Each set of simulated segmentations was ranked using both performance measures. We calculated the Kendall rank correlation coefficient between the segmentation ranking and the number of errors in each simulated segmentation. The rankings produced by balanced average Hausdorff distance had a significantly higher median correlation (1.00) than those produced by average Hausdorff distance (0.89). Across 200 rankings in total, the former misranked 52 segmentations whilst the latter misranked 179. Balanced average Hausdorff distance is therefore more suitable than average Hausdorff distance for ranking segmentations and assessing their quality.
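The comparison described above can be illustrated with a minimal Python sketch. The abstract does not state the formulas, so the definitions here are assumptions: average Hausdorff distance is taken as the mean of the two directed average distances, and the balanced variant is assumed to normalise both directed sums by the number of ground truth points. The point sets, the function names, and the simple Kendall tau-a (no tie correction) are hypothetical and for illustration only; they are not the study's data or code.

```python
from math import dist


def directed_sum(a, b):
    """Sum, over every point in a, of the distance to its nearest point in b."""
    return sum(min(dist(p, q) for q in b) for p in a)


def average_hausdorff(gt, seg):
    # Each directed sum is divided by the size of its own source set.
    return (directed_sum(gt, seg) / len(gt) + directed_sum(seg, gt) / len(seg)) / 2


def balanced_average_hausdorff(gt, seg):
    # Assumption: both directed sums are divided by the ground truth size.
    return (directed_sum(gt, seg) + directed_sum(seg, gt)) / (2 * len(gt))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs; ties add zero."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical toy data: a ground truth and simulated segmentations
# that contain 0, 1, and 2 added errors respectively.
gt = [(0, 0), (1, 0), (2, 0), (3, 0)]
error_1 = [(3, 4)]                        # one false positive point far away
error_2 = [(0, 0.5), (1, 0.5), (2, 0.5)]  # several false positives close by
segmentations = [gt, gt + error_1, gt + error_1 + error_2]
n_errors = [0, 1, 2]

avd = [average_hausdorff(gt, s) for s in segmentations]
bavd = [balanced_average_hausdorff(gt, s) for s in segmentations]

print("AVD :", avd, "tau =", kendall_tau_a(n_errors, avd))
print("bAVD:", bavd, "tau =", kendall_tau_a(n_errors, bavd))
```

In this toy case the second error adds many points close to the ground truth. Under the assumed definitions this enlarges the segmentation and lowers its average Hausdorff distance even though the number of errors has grown, while the balanced variant keeps increasing with the number of errors.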