Department of Radiology, Center for Magnetic Resonance Research, University of Minnesota, 2021 6th St SE, Minneapolis, MN, 55455, USA.
Division of Biostatistics, School of Public Health, University of Minnesota, 420 Delaware St SE, Minneapolis, MN, 55455, USA.
Med Phys. 2018 May;45(5):2076-2088. doi: 10.1002/mp.12861. Epub 2018 Apr 16.
Computer-aided detection/diagnosis (CAD) of prostate cancer (PCa) on multiparametric MRI (mpMRI) is an active area of research. In the literature, the performance of predictive models trained to detect PCa on mpMRI has typically been reported in terms of voxel-wise measures such as sensitivity and specificity and/or the area under the receiver operating characteristic curve (AUC). However, it is unclear whether models that score higher by these measures are actually superior. Here, we propose a novel method for lesion identification as well as novel measures that assess the quality of the detected lesions.
A total of 46 axial MRI slices of interest from 34 patients, together with the associated histopathologic ground truths, were used to develop and characterize the proposed measures. The proposed lesion-wise score s is based on the Jaccard similarity index, with modifications that emphasize the overlap and colocalization of predicted lesions with ground truth lesions. Thresholding the lesion-wise score allowed the sensitivity and specificity of lesion detection to be assessed, while the proposed lesion-summary score is a weighted average of the lesion-wise scores that provides a single summary statistic of lesion detection performance. The proposed measures were used to compare the lesion detection performance of a predictive model with that of a radiologist on the same data set. The measures were also used to evaluate the degree to which viewing the model's cancer prediction improved the radiologist's diagnostic accuracy.
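The lesion-wise and lesion-summary ideas described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the plain Jaccard index, not the modified score proposed in the paper (the modifications are not specified in this abstract); the weighting by ground-truth lesion size, the 0.5 threshold, and the function names `jaccard`, `detection_counts`, and `weighted_average` are all hypothetical choices made for the example.

```python
def jaccard(pred, truth):
    """Plain Jaccard index |A & B| / |A | B| for two sets of voxel coordinates."""
    union = pred | truth
    if not union:
        return 0.0
    return len(pred & truth) / len(union)


def detection_counts(scores, threshold):
    """Split lesions into (detected, missed) by thresholding their scores."""
    detected = sum(1 for s in scores if s >= threshold)
    return detected, len(scores) - detected


def weighted_average(scores, weights):
    """Weighted mean of lesion-wise scores; the weights are an assumption."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(s * w for s, w in zip(scores, weights)) / total


# Toy 2D slice: two ground-truth lesions, each paired with one predicted lesion.
truth_a = {(r, c) for r in range(0, 4) for c in range(0, 4)}      # 16 voxels
pred_a = {(r, c) for r in range(2, 6) for c in range(0, 4)}       # 16 voxels, 8 shared
truth_b = {(r, c) for r in range(10, 12) for c in range(10, 12)}  # 4 voxels
pred_b = set(truth_b)                                             # exact match

scores = [jaccard(pred_a, truth_a), jaccard(pred_b, truth_b)]

# Assumption: weight each lesion by its ground-truth size in voxels.
weights = [len(truth_a), len(truth_b)]
summary = weighted_average(scores, weights)

# Assumption: a lesion counts as detected when its score is at least 0.5.
detected, missed = detection_counts(scores, threshold=0.5)
lesion_sensitivity = detected / (detected + missed)
```

Lesion-level specificity would additionally require counting predicted lesions that match no ground-truth lesion, which this toy example omits.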
The lesion-wise score qualitatively reflected the goodness of predicted lesions over a wide range of values (s = 0.1 to s = 0.8) and was found to span a larger range of values than the Dice similarity coefficient (DSC) over the same range of prediction qualities (0-0.9 vs 0-0.75). The lesion-summary score varied linearly with voxel-wise sensitivity and quadratically with voxel-wise specificity, and it correlated well with voxel-wise AUC (ρ = 0.68) and the DSC (ρ = 0.88). Radiologist performance improved significantly after viewing the model-generated cancer prediction maps, as quantified by both the lesion-summary score (P = 0.01) and the DSC (P = 0.04), with improvements in both lesion detection sensitivity and specificity.
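For readers less familiar with the Dice coefficient used as the comparison measure here, a minimal sketch of its standard definition and its known relation to the plain Jaccard index, DSC = 2J / (1 + J), is given below. The function names are hypothetical, and the identity applies only to the unmodified indices; the paper's lesion-wise score is a modified Jaccard index, so the ranges reported above need not obey it.

```python
def dice(pred, truth):
    """Standard Dice coefficient 2|A & B| / (|A| + |B|) for sets of voxels."""
    total = len(pred) + len(truth)
    if total == 0:
        return 0.0
    return 2 * len(pred & truth) / total


def jaccard_index(pred, truth):
    """Plain (unmodified) Jaccard index |A & B| / |A | B|."""
    union = pred | truth
    if not union:
        return 0.0
    return len(pred & truth) / len(union)


# Toy masks: a 4x4 ground-truth lesion and a prediction shifted by two rows.
truth = {(r, c) for r in range(0, 4) for c in range(0, 4)}
pred = {(r, c) for r in range(2, 6) for c in range(0, 4)}

d = dice(pred, truth)
j = jaccard_index(pred, truth)

# For the plain indices, Dice can be recovered from Jaccard.
d_from_j = 2 * j / (1 + j)
```

Because 2J / (1 + J) is never smaller than J on the interval from 0 to 1, the plain Jaccard index is always at or below the Dice coefficient for the same pair of masks.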
The proposed measures allow lesion detection performance to be assessed, which is what matters most in a clinical setting and is not possible with voxel-wise measures alone.