Lee Michael C, Nelson Sarah J
Surbeck Laboratory of Advanced Imaging, Department of Radiology, University of California, UCSF Radiology Box 2532, 1700 4th Street, San Francisco, CA 94143-2532, USA.
Artif Intell Med. 2008 May;43(1):61-74. doi: 10.1016/j.artmed.2008.03.002. Epub 2008 Apr 29.
The purpose of this study was to develop a pattern classification algorithm for predicting the location of new contrast enhancement in brain tumor patients, using multivariate magnetic resonance (MR) imaging data from a prior scan. We also explored whether feature selection or feature weighting improves the accuracy of the pattern classifier.
Contrast-enhanced MR images, perfusion images, diffusion images, and proton spectroscopic imaging data were obtained from 26 patients with glioblastoma multiforme brain tumors, divided into a design set and an unseen test set for verification of results. A k-nearest-neighbor (k-NN) algorithm was implemented to classify unknown data based on a set of training data, with ground truth derived from post-treatment contrast-enhanced images; the quality of the k-NN results was evaluated using leave-one-out cross-validation. A genetic algorithm was implemented to select optimal features and feature weights for the k-NN algorithm. The binary representation of the weights was varied from 1 to 4 bits. As a simple baseline classifier, each individual parameter was also thresholded, and the results were compared with those of the k-NN.
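The procedure described above can be illustrated with a minimal sketch: a feature-weighted k-NN evaluated by leave-one-out cross-validation, with a small genetic algorithm searching over integer-coded weights of `bits` bits per feature (1 bit reduces to feature selection). This is not the authors' implementation. The function names, the fitness definition (mean of sensitivity and specificity), the truncation selection, the uniform crossover, and all default parameters are assumptions made for illustration only.

```python
import random

def loo_knn_predict(X, y, weights, k=3):
    """Leave-one-out k-NN: classify each sample using all the other samples."""
    preds = []
    for i, xi in enumerate(X):
        dists = []
        for j, xj in enumerate(X):
            if i == j:
                continue  # hold out the sample being classified
            # Weighted squared Euclidean distance; a zero weight drops the feature
            d2 = sum((w * (a - b)) ** 2 for w, a, b in zip(weights, xi, xj))
            dists.append((d2, y[j]))
        dists.sort(key=lambda t: t[0])
        votes = sum(label for _, label in dists[:k])
        preds.append(1 if 2 * votes > k else 0)  # majority vote
    return preds

def sens_spec(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return tp / max(pos, 1), tn / max(neg, 1)

def fitness(genes, X, y, bits, k=3):
    """Assumed fitness: mean of leave-one-out sensitivity and specificity."""
    if not any(genes):
        return 0.0  # no features selected
    weights = [g / (2 ** bits - 1) for g in genes]  # map integer genes to [0, 1]
    sens, spec = sens_spec(y, loo_knn_predict(X, y, weights, k))
    return (sens + spec) / 2

def evolve(X, y, bits=1, pop_size=12, generations=8, p_mut=0.1, seed=0):
    """Toy genetic algorithm over per-feature weights coded with `bits` bits."""
    rng = random.Random(seed)
    n_feat, top = len(X[0]), 2 ** bits - 1
    pop = [[rng.randint(0, top) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, X, y, bits), reverse=True)
        parents = pop[:pop_size // 2]  # truncation selection; the elite survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.choice(parents), rng.choice(parents)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [rng.randint(0, top) if rng.random() < p_mut else g
                     for g in child]  # random-reset mutation
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda g: fitness(g, X, y, bits))
    return [g / top for g in best], fitness(best, X, y, bits)
```

Calling `evolve(X, y, bits=1)` on a list of feature vectors `X` and binary labels `y` returns a 0/1 mask over the features together with its leave-one-out fitness; larger `bits` values allow graded weights at the cost of a larger search space.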
The feature selection k-NN achieved a sensitivity of 0.78 ± 0.18 and a specificity of 0.79 ± 0.06 on the holdout test data using only 7 of the 38 original features. Similar results were obtained with non-binary weights, but using a larger number of features. Overfitting was also observed in the higher bit representations. The best single-variable classifier, based on a choline-to-NAA abnormality index computed from spectroscopic data, achieved a sensitivity of 0.79 ± 0.20 and a specificity of 0.71 ± 0.11. The k-NN results had lower variation across patients than the single-variable classifiers.
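For readers who want to see how a single-variable threshold classifier and a mean ± spread summary of this kind can be computed, here is a minimal sketch. It assumes the reported ± values are standard deviations across patients, which the abstract does not state. The function names, the data layout (one pair of value and label lists per patient), and the toy numbers are hypothetical and are not data from the study.

```python
from statistics import mean, stdev

def threshold_classify(values, threshold):
    """Label a voxel positive when its parameter value exceeds the threshold."""
    return [1 if v > threshold else 0 for v in values]

def sens_spec(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return tp / max(pos, 1), tn / max(neg, 1)

def summarize(patients, threshold):
    """Per-patient sensitivity and specificity, summarized as (mean, SD) pairs.

    `patients` is a list of (values, labels) pairs, one per patient.
    """
    sens, spec = [], []
    for values, labels in patients:
        s, c = sens_spec(labels, threshold_classify(values, threshold))
        sens.append(s)
        spec.append(c)
    return (mean(sens), stdev(sens)), (mean(spec), stdev(spec))

# Toy, made-up values for two hypothetical patients
patients = [
    ([1.0, 3.0, 2.5, 0.5], [0, 1, 1, 0]),
    ([2.2, 1.8, 3.1, 0.9], [0, 1, 1, 0]),
]
(sens_mean, sens_sd), (spec_mean, spec_sd) = summarize(patients, threshold=2.0)
print(f"sensitivity {sens_mean:.2f} ± {sens_sd:.2f}, "
      f"specificity {spec_mean:.2f} ± {spec_sd:.2f}")
```

Summarizing per patient rather than pooling all voxels makes the across-patient variation visible, which is the quantity the comparison between the k-NN and the single-variable classifiers refers to.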
We have demonstrated that the optimized k-NN rule can be used for quantitative analysis of multivariate images and applied to a specific clinical research question. Feature selection was found to improve both the accuracy of feature weighting algorithms and the comprehensibility of the results. We believe that, in addition to lending insight into parameter relevance, such algorithms may be useful in aiding radiological interpretation of complex multimodality datasets.