Spencer Robinson, Thabtah Fadi, Abdelhamid Neda, Thompson Michael
Digital Technologies, Manukau Institute of Technology, New Zealand.
Computing, Auckland Institute of Studies, New Zealand.
Digit Health. 2020 Mar 29;6:2055207620914777. doi: 10.1177/2055207620914777. eCollection 2020 Jan-Dec.
Machine learning has been used successfully to improve the accuracy of computer-aided diagnosis systems. This paper experimentally assesses the performance of machine learning models built on relevant features chosen by several feature-selection methods. Four commonly used heart disease datasets were processed with principal component analysis, Chi-squared testing, ReliefF and symmetrical uncertainty to create distinct feature sets. A variety of classification algorithms were then used to build models, which were compared to find the feature combinations that best improve the correct prediction of heart conditions. We found that the benefit of feature selection varies with the machine learning technique used on the heart datasets we considered. However, the best model we created combined Chi-squared feature selection with the BayesNet algorithm and achieved an accuracy of 85.00% on the considered datasets.
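As a rough illustration of the Chi-squared feature selection step named in the abstract, the sketch below scores categorical features against a class label with Pearson's Chi-squared statistic and ranks them. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not say which toolkit or settings were used, the function names `chi_squared_score` and `rank_features` are hypothetical, and the feature names and records are made-up toy values rather than rows from any of the four heart disease datasets. No classifier is included; in the paper the selected features feed a model such as BayesNet.

```python
from collections import Counter


def chi_squared_score(feature, labels):
    """Pearson Chi-squared statistic between one categorical feature and the class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # counts per (feature value, class) cell
    feature_totals = Counter(feature)         # row totals of the contingency table
    label_totals = Counter(labels)            # column totals of the contingency table
    score = 0.0
    for value, value_count in feature_totals.items():
        for label, label_count in label_totals.items():
            # Expected cell count if the feature and the class were independent.
            expected = value_count * label_count / n
            score += (observed[(value, label)] - expected) ** 2 / expected
    return score


def rank_features(rows, labels, names):
    """Return (name, score) pairs sorted from most to least class-dependent."""
    columns = list(zip(*rows))
    scores = {name: chi_squared_score(col, labels) for name, col in zip(names, columns)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records (illustrative only, not taken from the paper's datasets).
names = ["chest_pain", "sex", "fasting_sugar_high"]
rows = [
    ("yes", "M", "yes"),
    ("yes", "F", "no"),
    ("yes", "M", "yes"),
    ("yes", "F", "yes"),
    ("no", "M", "no"),
    ("no", "F", "no"),
    ("no", "M", "yes"),
    ("no", "F", "no"),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = heart condition present, 0 = absent

ranked = rank_features(rows, labels, names)
top_k = [name for name, _ in ranked[:2]]  # keep the two highest-scoring features

for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

A higher score means the feature's distribution differs more between classes, so a selector would keep the top-ranked features and pass only those columns to the classifier. How many features to keep is a tuning choice that the abstract does not specify.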