Cho Baek Hwan, Yu Hwanjo, Lee Jongshill, Chee Young Joon, Kim In Young, Kim Sun I
Department of Biomedical Engineering, Hanyang University, Seoul 133-605, Korea.
IEEE Trans Inf Technol Biomed. 2008 Mar;12(2):247-56. doi: 10.1109/TITB.2007.902300.
Nonlinear classifiers, e.g., support vector machines (SVMs) with radial basis function (RBF) kernels, have been widely used for automatic disease diagnosis because of their high accuracy. However, such classifiers are difficult to visualize, which makes it hard to give physicians an intuitive interpretation of the results. We developed a new nonlinear kernel, the localized radial basis function (LRBF) kernel, and a new visualization system, Visualization for Risk Factor Analysis (VRIFA), which applies a nomogram and the LRBF kernel to visualize the results of nonlinear SVMs and improve their interpretability while maintaining high prediction accuracy. Three representative medical datasets from the University of California, Irvine repository and the Statlog collection (breast cancer, diabetes, and heart disease) were used to evaluate the system. The results showed that the classification performance of the LRBF kernel is comparable with that of the RBF kernel, and that the LRBF kernel is easy to visualize via a nomogram. Our study also showed that the LRBF kernel is less sensitive to noise features than the RBF kernel, but its prediction accuracy degrades more than that of the RBF kernel when important features are eliminated. We demonstrated the VRIFA system, which visualizes the results of linear SVMs and of nonlinear SVMs with LRBF kernels, on the three datasets.
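The abstract does not give the LRBF formula. As a minimal sketch, assume the kernel is additive over features, i.e. a sum of one-dimensional RBF kernels, K(x, z) = Σ_d exp(-γ (x_d - z_d)²). Under that assumption the SVM decision value splits into one term per feature, which is the property a nomogram needs. The function names, the single shared `gamma`, and the toy numbers are illustrative and are not taken from the paper.

```python
import math


def rbf_kernel(x, z, gamma):
    """Standard RBF kernel: one exponential over the full squared distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def lrbf_kernel(x, z, gamma):
    """Assumed LRBF form: a sum of one-dimensional RBF kernels, one per feature."""
    return sum(math.exp(-gamma * (a - b) ** 2) for a, b in zip(x, z))


def decision_value(x, support_vectors, dual_coefs, bias, gamma):
    """SVM decision value with the assumed LRBF kernel.

    dual_coefs[i] stands for alpha_i * y_i of support vector i.
    """
    total = sum(
        coef * lrbf_kernel(x, sv, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )
    return total + bias


def feature_contributions(x, support_vectors, dual_coefs, gamma):
    """Split the decision value (bias excluded) into one term per feature.

    Each entry depends on a single feature of x, so it could be drawn
    as one axis of a nomogram.
    """
    contrib = [0.0] * len(x)
    for sv, coef in zip(support_vectors, dual_coefs):
        for d in range(len(x)):
            contrib[d] += coef * math.exp(-gamma * (x[d] - sv[d]) ** 2)
    return contrib


if __name__ == "__main__":
    # Toy values, purely illustrative.
    svs = [[0.0, 1.0], [2.0, -1.0]]
    coefs = [0.7, -0.4]
    x = [0.5, 0.25]
    parts = feature_contributions(x, svs, coefs, gamma=0.5)
    print("per-feature contributions:", parts)
    print("decision value:", decision_value(x, svs, coefs, bias=0.1, gamma=0.5))
```

With the standard RBF kernel the exponential couples all features through the full distance, so no such per-feature split exists. That is consistent with the abstract's remark that ordinary nonlinear SVMs are hard to visualize.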