Liu Tianyi, Krentz Andrew, Lu Lei, Curcin Vasa
School of Life Course & Population Sciences, King's College London, SE1 1UL London, UK.
Metadvice, 45 Pall Mall, St. James's SW1Y 5JG London, UK.
Eur Heart J Digit Health. 2024 Oct 27;6(1):7-22. doi: 10.1093/ehjdh/ztae080. eCollection 2025 Jan.
Cardiovascular disease (CVD) remains a major cause of mortality in the UK, prompting the need for improved risk prediction models for primary prevention. Machine learning (ML) models utilizing electronic health records (EHRs) offer potential enhancements over traditional risk scores such as QRISK3 and ASCVD. This study aimed to systematically evaluate and compare the efficacy of ML models against conventional CVD risk prediction algorithms using EHR data for medium- to long-term (5-10 years) CVD risk prediction. A systematic review and random-effects meta-analysis were conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, assessing studies from 2010 to 2024. We retrieved 32 ML models and 26 conventional statistical models from 20 selected studies, focusing on performance metrics such as area under the curve (AUC) and heterogeneity across models. ML models, particularly random forest and deep learning, demonstrated superior performance, with the highest recorded pooled AUCs of 0.865 (95% CI: 0.812-0.917) and 0.847 (95% CI: 0.766-0.927), respectively. These significantly outperformed the conventional risk scores, whose AUC was 0.765 (95% CI: 0.734-0.796). However, significant heterogeneity (I² > 99%) and potential publication bias were noted across the studies. While ML models show enhanced calibration for CVD risk, substantial variability and methodological concerns limit their current clinical applicability. Future research should address these issues by enhancing methodological transparency and standardization to improve the reliability and utility of these models in clinical settings. This study highlights the advanced capabilities of ML models in CVD risk prediction and emphasizes the need for rigorous validation to facilitate their integration into clinical practice.
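To make the pooled AUCs and the I² statistic reported in the abstract more concrete, the following is a minimal sketch of random-effects pooling. It assumes the DerSimonian-Laird estimator applied to AUCs on their raw scale; the abstract does not state which estimator or transformation the authors used. The function name `pool_random_effects` and the three input AUCs and standard errors are hypothetical illustrations, not data from the review.

```python
import math


def pool_random_effects(estimates, std_errors):
    """Pool study-level estimates with the DerSimonian-Laird method.

    Assumption: this estimator is chosen for illustration only; the
    abstract does not name the estimator used in the review.
    """
    k = len(estimates)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q measures the dispersion of studies around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1

    # Between-study variance tau^2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se_pooled = math.sqrt(1.0 / sum_w_star)

    # I^2: percentage of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se_pooled,
        "ci_high": pooled + 1.96 * se_pooled,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical AUCs and standard errors (NOT values from the review).
result = pool_random_effects([0.70, 0.80, 0.90], [0.01, 0.01, 0.01])
print(result)
```

In this toy input the studies are individually precise but disagree strongly with one another, which is the situation that produces a very high I² and a pooled confidence interval much wider than any single study's interval.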