Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul;2022:1066-1069. doi: 10.1109/EMBC48229.2022.9871121.
Cardiovascular diseases (CVDs) are among the most serious disorders leading to high mortality rates worldwide. CVDs can be diagnosed and prevented early by identifying risk biomarkers using statistical and machine learning (ML) models. In this work, we apply machine learning models, namely Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Naïve Bayes (NB), Extreme Gradient Boosting (XGB) and Adaptive Boosting (AdaBoost), to clinical CVD risk factors and biochemical data to predict death caused by CVD within ten years of follow-up. We used the cohort of the Ludwigshafen Risk and Cardiovascular Health (LURIC) study; 2943 patients were included in the analysis, of whom 484 were annotated as dead due to CVD. We calculated the Accuracy (ACC), Precision, Recall, F1-Score, Specificity (SPE) and area under the receiver operating characteristic curve (AUC) of each model. The comparative analysis shows that Logistic Regression was the most reliable algorithm, with an accuracy of 72.20%. These results will be used in the TIMELY study to estimate patients' 10-year CVD risk score and mortality.
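As an illustration of the threshold-based metrics named above, the following minimal sketch derives accuracy, precision, recall, F1-score and specificity from binary labels. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study; the encoding 1 = CVD death, 0 = no CVD death is an assumption. AUC is omitted because it requires predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Compute threshold-based metrics for binary labels.

    Assumed encoding (hypothetical): 1 = CVD death, 0 = no CVD death.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # also called sensitivity
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),
    }


# Toy labels for illustration only (not LURIC data).
toy_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
toy_metrics = binary_metrics(toy_true, toy_pred)
```

With these toy labels there are 2 true positives, 1 false negative, 1 false positive and 6 true negatives, so accuracy is 0.8, precision, recall and F1 are each 2/3, and specificity is 6/7.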