Jesneck Jonathan L, Nolte Loren W, Baker Jay A, Floyd Carey E, Lo Joseph Y
Department of Biomedical Engineering, Duke University, Durham, North Carolina 27705, USA.
Med Phys. 2006 Aug;33(8):2945-54. doi: 10.1118/1.2208934.
As more diagnostic testing options become available to physicians, it becomes more difficult to combine the various types of medical information in order to optimize the overall diagnosis. To improve diagnostic performance, we introduce an approach for optimizing a decision-fusion technique that combines heterogeneous information, such as data from different modalities, feature categories, or institutions. For classifier comparison we used two performance metrics: the area under the receiver operating characteristic (ROC) curve (AUC) and the normalized partial area under the curve (pAUC). This study used four classifiers: linear discriminant analysis (LDA), an artificial neural network (ANN), and two variants of our decision-fusion technique, AUC-optimized (DF-A) and pAUC-optimized (DF-P) decision fusion. We applied each of these classifiers with 100-fold cross-validation to two heterogeneous breast cancer data sets: one of mass lesion features and a much more challenging one of microcalcification lesion features. For the calcification data set, DF-A outperformed the other classifiers in terms of AUC (p < 0.02) and achieved AUC = 0.85 ± 0.01. DF-P surpassed the other classifiers in terms of pAUC (p < 0.01) and reached pAUC = 0.38 ± 0.02. For the mass data set, DF-A outperformed both the ANN and the LDA (p < 0.04) and achieved AUC = 0.94 ± 0.01. Although there were no statistically significant differences among the classifiers' pAUC values for this data set (pAUC = 0.57 ± 0.07 to 0.67 ± 0.05, p > 0.10), DF-P did significantly improve specificity versus the LDA at both 98% and 100% sensitivity (p < 0.04). In conclusion, decision fusion directly optimized clinically significant performance measures, such as AUC and pAUC, and sometimes outperformed two well-known machine-learning techniques when applied to two different breast cancer data sets.
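The three performance measures named in the abstract (AUC, normalized pAUC, and specificity at a fixed sensitivity) can be illustrated with a minimal sketch. This is not the authors' code. The function names are hypothetical, and the abstract does not state the sensitivity cut-off used for pAUC, so `min_sensitivity=0.90` is an assumption. Here the pAUC is taken as the area between the empirical ROC curve and the horizontal line at the chosen sensitivity, divided by its maximum possible value so that a perfect classifier scores 1.

```python
import numpy as np


def roc_points(scores, labels):
    """Empirical ROC curve as (fpr, tpr) arrays running from (0, 0) to (1, 1)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(-scores, kind="mergesort")  # descending score
    scores, labels = scores[order], labels[order]
    # Keep one operating point per distinct score so ties are handled together.
    last_of_run = np.append(np.diff(scores) != 0, True)
    tp = np.cumsum(labels)[last_of_run]
    fp = np.cumsum(~labels)[last_of_run]
    tpr = np.concatenate(([0.0], tp / labels.sum()))
    fpr = np.concatenate(([0.0], fp / (~labels).sum()))
    return fpr, tpr


def auc(fpr, tpr):
    """Full area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


def normalized_pauc(fpr, tpr, min_sensitivity=0.90):
    """Area above the line tpr = min_sensitivity, scaled to the range [0, 1].

    min_sensitivity=0.90 is an assumed default, not a value from the abstract.
    """
    area = 0.0
    for f0, f1, t0, t1 in zip(fpr[:-1], fpr[1:], tpr[:-1], tpr[1:]):
        if t1 <= min_sensitivity or f1 == f0:
            continue  # segment lies below the cut-off or has zero width
        if t0 < min_sensitivity:
            # Start the segment where it crosses the sensitivity cut-off.
            f0 = f0 + (f1 - f0) * (min_sensitivity - t0) / (t1 - t0)
            t0 = min_sensitivity
        area += (f1 - f0) * ((t0 + t1) / 2.0 - min_sensitivity)
    return float(area / (1.0 - min_sensitivity))


def specificity_at_sensitivity(fpr, tpr, sensitivity):
    """Best specificity among operating points reaching the given sensitivity."""
    return float(1.0 - fpr[tpr >= sensitivity].min())


# Toy scores (higher = more suspicious) and labels (1 = malignant); made-up data.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
fpr, tpr = roc_points(scores, labels)
full_auc = auc(fpr, tpr)
partial_auc = normalized_pauc(fpr, tpr)
spec_at_100 = specificity_at_sensitivity(fpr, tpr, 1.0)
print(full_auc, partial_auc, spec_at_100)
```

Under this definition a chance-level classifier (diagonal ROC) has a small but nonzero normalized pAUC, which is why pAUC values are not directly comparable with AUC values on the same scale.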