Bashir Saba, Qamar Usman, Khan Farhan Hassan
Computer Engineering Department, College of Electrical and Mechanical Engineering, National University of Sciences and Technology (NUST), Islamabad, Pakistan.
Australas Phys Eng Sci Med. 2015 Jun;38(2):305-23. doi: 10.1007/s13246-015-0337-6. Epub 2015 Mar 10.
Conventional clinical decision support systems are based on individual classifiers, or on simple combinations of these classifiers, which tend to show moderate performance. This paper presents a novel classifier ensemble framework, based on an enhanced bagging approach with a multi-objective weighted voting scheme, for the prediction and analysis of heart disease. The proposed model overcomes the limitations of conventional approaches by using an ensemble of five heterogeneous classifiers: Naïve Bayes, linear regression, quadratic discriminant analysis, an instance-based learner, and support vector machines. Five different datasets, obtained from publicly available data repositories, are used for experimentation, evaluation, and validation. The effectiveness of the proposed ensemble is investigated by comparing its results with those of several other classifiers. Its predictions are assessed using ten-fold cross-validation and ANOVA statistics. The experimental evaluation shows that the proposed framework handles all types of attributes and achieves a diagnosis accuracy of 84.16%, a sensitivity of 93.29%, a specificity of 96.70%, and an F-measure of 82.15%. For most of the datasets, the F-ratio exceeds the F-critical value and the p-value is below 0.05 (95% confidence level), indicating that the results are statistically significant.
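To make the weighted voting step and the reported metrics concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the multi-objective weights are derived, so the weights passed to the hypothetical `weighted_vote` helper are placeholders, and `binary_metrics` uses only the standard confusion-matrix definitions of accuracy, sensitivity, specificity, and F-measure.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine one label per classifier into a single label.

    Each classifier's vote counts as much as its weight; the label with
    the largest total weight wins. On an exact tie, the label that was
    seen first is returned. The weights are assumed inputs here; the
    abstract does not specify how they are computed.
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight per classifier is required")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, specificity and F-measure
    from true and predicted labels of a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on the positive class
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on the negative class
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical votes from five base classifiers for one patient
    # (1 = disease present, 0 = absent) and illustrative weights.
    votes = [1, 0, 0, 1, 1]
    weights = [0.1, 0.4, 0.4, 0.1, 0.1]
    print(weighted_vote(votes, weights))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

In this sketch, two heavily weighted classifiers can outvote three lightly weighted ones, which is the main difference from simple majority voting. In a ten-fold cross-validation setting, the metrics would be computed on each held-out fold and then averaged.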