Health Informatics Laboratory, Faculty of Nursing, National and Kapodistrian University of Athens, Greece.
Department of Digital Systems, University of Piraeus, Greece.
Med Arch. 2020 Feb;74(1):39-41. doi: 10.5455/medarh.2020.74.39-41.
The World Health Organization has estimated that 12 million deaths occur worldwide every year due to heart disease. Half of the deaths in developed countries are due to cardiovascular diseases. Early prognosis of cardiovascular disease can support decisions on lifestyle changes in high-risk patients.
The aim of this paper is to build and compare classification techniques for cardiovascular diseases.
The dataset contains 4270 patients and 14 attributes and is available in the UCI data repository. The prediction target is a binary outcome (event or no event). Each attribute is a potential risk factor, and the attributes cover demographic, behavioral, and medical risk factors. The classification goal is to predict whether a patient has a 10-year risk of future coronary heart disease (CHD).
Different classifiers were tested. The SMOTE technique was used to address the class imbalance. Cross-validation was used to estimate how accurately the predictive models would perform. The classifiers were evaluated with the following metrics: precision, recall, F1-score, accuracy, and AUC (area under the curve).
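The oversampling and evaluation steps described above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names `smote`, `binary_metrics`, and `auc`, the neighbour count `k=5`, and the toy data in the usage note are all assumptions made for illustration. The sketch generates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours (the core idea of SMOTE), and computes the metrics listed in the abstract from binary labels and scores.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples built from the minority class.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Assumes numeric features and at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def binary_metrics(y_true, y_pred):
    """Precision, recall, F1-score and accuracy for 0/1 labels (1 = event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


def auc(y_true, scores):
    """AUC as the fraction of (event, no-event) pairs ranked correctly.

    Ties between an event score and a no-event score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

As a usage example on made-up data, `smote([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], n_new=6, k=2)` returns six new points inside the unit square, and `binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])` gives a precision, recall, and F1-score of 2/3 with an accuracy of 0.6. When oversampling is combined with cross-validation, it is generally advisable to apply it only to the training folds so that synthetic samples do not leak into the held-out data; the abstract does not state how this was handled.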
Based on the results, the Random Forest and Decision Tree classifiers achieved the best scores.