A comparison of non-symmetric entropy-based classification trees and support vector machine for cardiovascular risk stratification.

Authors

Singh Anima, Guttag John V

Affiliation

Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.

Publication

Annu Int Conf IEEE Eng Med Biol Soc. 2011;2011:79-82. doi: 10.1109/IEMBS.2011.6089901.

Abstract

Classification tree-based risk stratification models generate easily interpretable classification rules. This feature makes classification tree-based models appealing for use in a clinical setting, provided that they have comparable accuracy to other methods. In this paper, we present and evaluate the performance of a non-symmetric entropy-based classification tree algorithm. The algorithm is designed to accommodate the class imbalance found in many medical datasets. We evaluate the performance of this algorithm, and compare it to that of SVM-based classifiers, when applied to 4219 non-ST elevation acute coronary syndrome patients. We generated SVM-based classifiers using three different strategies for handling class imbalance: cost-sensitive SVM learning, synthetic minority oversampling (SMOTE), and random majority undersampling. We used both linear and radial basis function (RBF) kernel-based SVMs. Our classification tree models outperformed SVM-based classifiers generated using each of the three techniques. On average, the classification tree models yielded a 14% improvement in G-score and a 21% improvement in F-score relative to the linear SVM classifiers with the best performance. Similarly, our classification tree models yielded a 12% improvement in G-score and a 21% improvement in F-score over the best RBF kernel-based SVM classifiers.
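The abstract reports results as G-score and F-score but does not define either. The sketch below shows one common way such metrics are computed from confusion-matrix counts, and why they are preferred to plain accuracy on imbalanced data. It assumes the F-score is the harmonic mean of precision and recall and the G-score is the geometric mean of sensitivity and specificity; the paper may use different definitions. The function name `f_and_g_scores` and all counts are hypothetical and are not taken from the study.

```python
import math


def f_and_g_scores(tp, fp, tn, fn):
    """Return (F-score, G-score) from confusion-matrix counts.

    Assumed definitions (not confirmed by the abstract):
      F-score = harmonic mean of precision and recall
      G-score = geometric mean of sensitivity and specificity
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on the positive class
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on the negative class
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    denom = precision + sensitivity
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    g_score = math.sqrt(sensitivity * specificity)
    return f_score, g_score


# Hypothetical imbalanced cohort: 50 positives, 950 negatives.
f, g = f_and_g_scores(tp=30, fp=70, tn=880, fn=20)

# A trivial classifier that always predicts "negative" on the same cohort.
# Its accuracy is 950/1000 = 0.95, yet both scores collapse to zero.
f_trivial, g_trivial = f_and_g_scores(tp=0, fp=0, tn=950, fn=50)

print(f"model:   F={f:.3f}  G={g:.3f}")
print(f"trivial: F={f_trivial:.3f}  G={g_trivial:.3f}")
```

Under these assumed definitions, a classifier that ignores the minority class scores zero on both metrics despite its high accuracy, which is why metrics of this kind suit class-imbalanced datasets like the one described.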
