Department of Imaging Sciences, University of Rochester, 430 Elmwood Avenue, Rochester, NY 14627, USA; Department of Biomedical Engineering, University of Rochester, 430 Elmwood Avenue, Rochester, NY 14627, USA.
Artif Intell Med. 2014 Jan;60(1):65-77. doi: 10.1016/j.artmed.2013.11.003. Epub 2013 Nov 23.
While dimension reduction has previously been explored in computer-aided diagnosis (CADx) as an alternative to feature selection, earlier implementations do not ensure the strict separation between training and test data that the machine learning task requires. This compromises the integrity of the independent test set, which serves as the basis for evaluating classifier performance.
We propose, implement and evaluate an improved CADx methodology in which strict separation is maintained. This is achieved by subjecting the training data alone to dimension reduction; the test data are subsequently processed with out-of-sample extension methods. Our approach is demonstrated in the research context of classifying small, diagnostically challenging lesions annotated on dynamic breast magnetic resonance imaging (MRI) studies. The lesions were dynamically characterized through topological feature vectors derived from Minkowski functionals. These feature vectors were then subjected to dimension reduction with different linear and non-linear algorithms, applied in conjunction with out-of-sample extension techniques. This was followed by classification through supervised learning with support vector regression. The area under the receiver-operating characteristic curve (AUC) served as the metric of classifier performance.
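The separation between training and test data described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes principal component analysis as the dimension reduction step (for which the out-of-sample extension is simply a projection with the stored training mean and directions), swaps in a ridge-regularized least-squares scorer as a stand-in for support vector regression, and runs on synthetic data. All function names and parameters are hypothetical.

```python
import numpy as np


def fit_pca(X_train, n_components):
    """Learn a PCA projection from the training data only."""
    mean = X_train.mean(axis=0)
    # Rows of vt are the principal directions of the centered training data.
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[:n_components]


def apply_pca(X, mean, components):
    """Out-of-sample extension for PCA: project points with the stored fit."""
    return (X - mean) @ components.T


def fit_linear_scorer(Z_train, y_train, ridge=1e-3):
    """Ridge-regularized least squares (assumed stand-in for SVR)."""
    A = np.hstack([Z_train, np.ones((len(Z_train), 1))])  # add intercept column
    return np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ y_train)


def score(Z, w):
    """Continuous classifier output for reduced feature vectors Z."""
    return np.hstack([Z, np.ones((len(Z), 1))]) @ w


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    # Ties between a positive and a negative score count as half a correct pair.
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def run_split(X, y, train_idx, test_idx, n_components=2):
    """Fit every stage on the training rows; the test rows are only transformed."""
    mean, comps = fit_pca(X[train_idx], n_components)
    w = fit_linear_scorer(apply_pca(X[train_idx], mean, comps), y[train_idx])
    test_scores = score(apply_pca(X[test_idx], mean, comps), w)
    return auc(y[test_idx], test_scores)


# Synthetic two-class data (illustrative only, not lesion features).
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 20)) + 1.5 * y[:, None]
perm = rng.permutation(60)
train_idx, test_idx = perm[:40], perm[40:]
test_auc = run_split(X, y, train_idx, test_idx)
print(f"test AUC on synthetic data: {test_auc:.2f}")
```

The point of the sketch is that `fit_pca` never sees the test rows. Fitting the reduction on the pooled data before splitting would let information from the test set leak into the learned projection.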
Of the feature vectors investigated, the Minkowski functional 'perimeter' performed best, while 'area' performed comparably. Of the dimension reduction algorithms tested with 'perimeter', Sammon's mapping performed best (AUC 0.84±0.10), while comparable performance was achieved with the exploratory observation machine (AUC 0.82±0.09) and principal component analysis (AUC 0.80±0.10).
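For context, Sammon's mapping places points in a low-dimensional space by minimizing a stress function over pairwise distances. The standard form is given below; the symbols are our notation rather than the paper's, with $d^{*}_{ij}$ denoting the distance between feature vectors $i$ and $j$ in the original space and $d_{ij}$ the distance between their low-dimensional images.

```latex
E = \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^{2}}{d^{*}_{ij}}
```

Because this optimization yields coordinates only for the points it was run on, and no explicit mapping function, unseen test points cannot be projected directly. This is why non-linear methods of this kind need an out-of-sample extension if the test data are to stay out of the fitting step.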
The results obtained with the proposed CADx methodology represent a significant improvement over previously reported results for such small lesions on dynamic breast MRI. In particular, non-linear dimension reduction algorithms exhibited better classification performance than linear approaches when integrated into our CADx methodology. We also note that while dimension reduction techniques may not necessarily improve classification performance over feature selection, they do allow for a higher degree of feature compaction.