Kuo Hsien-Chi, Giger Maryellen L, Reiser Ingrid, Drukker Karen, Boone John M, Lindfors Karen K, Yang Kai, Edwards Alexandra
University of Chicago, Department of Radiology, 5841 S. Maryland Avenue, Chicago 60637, Illinois, United States.
University of California at Davis, Department of Radiology, 4860 Y Street, Suite 3100, Sacramento 95817, California, United States.
J Med Imaging (Bellingham). 2014 Oct;1(3):031012. doi: 10.1117/1.JMI.1.3.031012. Epub 2014 Dec 24.
Evaluation of segmentation algorithms usually involves comparing segmentations to gold-standard delineations without regard to the ultimate medical decision-making task. We compare two segmentation evaluation methods, a Dice similarity coefficient (DSC) evaluation and a diagnostic classification task-based evaluation, using lesions from breast computed tomography. In our investigation, we use results from two previously developed lesion-segmentation algorithms [a global active contour (GAC) model and a global active contour model with local aspects]. Although the two models obtained similar DSC values (0.80 versus 0.77), we show that the global + local active contour (GLAC) model, as compared with the GAC model, yields significantly improved classification performance, measured as the area under the receiver operating characteristic (ROC) curve, in the task of distinguishing malignant from benign lesions (area under the curve [Formula: see text] compared to 0.63, [Formula: see text]). This is mainly because the GLAC model better preserves the detailed information required to calculate morphological features. Based on our findings, we conclude that the DSC metric alone is not sufficient for evaluating lesion segmentations in computer-aided diagnosis tasks.
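The two evaluation measures contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dice_coefficient` and `roc_auc` are hypothetical, masks are assumed to be flattened binary (0/1) sequences, and the classifier scores are assumed to be one number per lesion, with label 1 for malignant and 0 for benign. The DSC is computed as 2|A ∩ B| / (|A| + |B|), and the area under the ROC curve is computed through its Mann-Whitney interpretation, which is one standard way to estimate it and may differ from the estimator used in the study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks (flat 0/1 sequences)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * intersection / (size_a + size_b)


def roc_auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic.

    Equals the fraction of (malignant, benign) pairs in which the malignant
    lesion receives the higher score; tied pairs count as one half.
    """
    malignant = [s for s, y in zip(scores, labels) if y == 1]
    benign = [s for s, y in zip(scores, labels) if y == 0]
    if not malignant or not benign:
        raise ValueError("both classes must be present")
    wins = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant) * len(benign))


# Toy data for illustration only; none of these values come from the study.
reference_mask = [0, 1, 1, 1, 1, 0, 0, 0]
segmented_mask = [0, 0, 1, 1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]

toy_dsc = dice_coefficient(reference_mask, segmented_mask)
toy_auc = roc_auc(toy_scores, toy_labels)
```

The sketch makes the distinction concrete: the DSC only measures volume overlap with the reference delineation, while the AUC depends on how well features derived from each segmentation separate malignant from benign lesions, so two segmentation methods with similar DSC values can still lead to different AUC values.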