Department of Advanced Biomedical Sciences, University Hospital of Naples 'Federico II', Naples, Italy.
Department of Public Health, University Hospital of Naples 'Federico II', Naples, Italy.
Comput Methods Programs Biomed. 2020 Jun;189:105343. doi: 10.1016/j.cmpb.2020.105343. Epub 2020 Jan 16.
Coronary artery disease (CAD) is still one of the primary causes of death in developed countries. Stress single-photon emission computed tomography is used to evaluate myocardial perfusion and ventricular function in patients with suspected or known CAD. This study sought to test data mining and machine learning tools and to compare several supervised learning algorithms in a large cohort of Italian subjects with suspected or known CAD who underwent stress myocardial perfusion imaging.
The dataset consisted of 10,265 patients with suspected or known CAD. The analysis was conducted on the KNIME Analytics Platform, where Random Forests, C4.5, Gradient Boosted Tree, Naïve Bayes, and K-nearest neighbor (KNN) were implemented after a feature-filtering procedure. K-fold cross-validation was employed.
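As an illustration of the validation scheme named above, the following is a minimal sketch of K-fold cross-validation wrapped around a KNN classifier, written in plain Python. It is not the KNIME workflow used in the study: the function names, the number of folds, the value of k, the Euclidean distance, and the synthetic two-feature data are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle sample indices and split them into n_folds nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def cross_validate_knn(X, y, n_folds=5, k=3, seed=0):
    """Return the mean accuracy of KNN over n_folds held-out folds."""
    folds = k_fold_indices(len(X), n_folds, seed)
    accuracies = []
    for held_out in folds:
        held_out_set = set(held_out)
        # Train on every sample that is not in the current held-out fold.
        train_idx = [i for i in range(len(X)) if i not in held_out_set]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


# Synthetic, well-separated two-class data (hypothetical, not patient data).
rng = random.Random(42)
X = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(50)] + \
    [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(50)]
y = [0] * 50 + [1] * 50
mean_accuracy = cross_validate_knn(X, y, n_folds=5, k=3)
```

Each sample is held out exactly once, so the averaged score reflects performance on data the classifier did not see during training. The same loop could wrap any of the other classifiers in place of `knn_predict`.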
Accuracy, error, precision, recall, and specificity were computed for each of the above-mentioned algorithms. Random Forests and Gradient Boosted Tree obtained the highest accuracy (>95%), while accuracy for the other algorithms ranged between 83% and 88%. The highest sensitivity was obtained by C4.5 (99.3%) and the highest specificity by Gradient Boosted Tree (96.9%). Naïve Bayes had the lowest precision (70.9%) and specificity (72.0%), and KNN had the lowest recall, i.e. sensitivity (79.2%).
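The five evaluation measures listed above follow from the counts of a binary confusion matrix using the standard definitions. The sketch below shows that arithmetic; the function name and the example counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute evaluation measures from binary confusion-matrix counts.

    tp/fn: positive cases predicted correctly/incorrectly.
    tn/fp: negative cases predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error": 1.0 - accuracy,         # misclassification rate
        "precision": tp / (tp + fp),     # share of predicted positives that are true
        "recall": tp / (tp + fn),        # same quantity as sensitivity
        "specificity": tn / (tn + fp),   # share of negatives correctly identified
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=80, fp=10, tn=90, fn=20)
```

Recall and sensitivity are the same quantity, which is why a single value is reported for both.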
The high scores obtained by the algorithms suggest that health facilities should consider including advanced data analysis services to help clinicians in decision-making. Similar applications of this kind of study in other contexts could support this idea.