

Improving cardiovascular risk prediction through machine learning modelling of irregularly repeated electronic health records.

Author information

Li Chaiquan, Liu Xiaofei, Shen Peng, Sun Yexiang, Zhou Tianjing, Chen Weiye, Chen Qi, Lin Hongbo, Tang Xun, Gao Pei

Affiliations

Department of Epidemiology and Biostatistics, School of Public Health, Peking University Health Science Center, No. 38 Xueyuan Road, Haidian District, 100191 Beijing, China.

Yinzhou District Center for Disease Control and Prevention, No. 1221 Xueshi Road, Yinzhou District, 315199 Ningbo, China.

Publication information

Eur Heart J Digit Health. 2023 Oct 17;5(1):30-40. doi: 10.1093/ehjdh/ztad058. eCollection 2024 Jan.

Abstract

AIMS

Existing electronic health records (EHRs) often consist of abundant but irregular longitudinal measurements of risk factors. In this study, we aim to leverage such data to improve the risk prediction of atherosclerotic cardiovascular disease (ASCVD) by applying machine learning (ML) algorithms, which can allow automatic screening of the population.

METHODS AND RESULTS

A total of 215 744 Chinese adults aged between 40 and 79 without a history of cardiovascular disease were included (6081 cases) from an EHR-based longitudinal cohort study. To allow interpretability of the model, the predictors were demographic characteristics, medication treatment, and repeatedly measured records of lipids, glycaemia, obesity, blood pressure, and renal function. The primary outcome was ASCVD, defined as non-fatal acute myocardial infarction, coronary heart disease death, or fatal and non-fatal stroke. The eXtreme Gradient Boosting (XGBoost) algorithm and Least Absolute Shrinkage and Selection Operator (LASSO) regression models were derived to predict the 5-year ASCVD risk. In the validation set, compared with the refitted Chinese guideline-recommended Cox model (i.e. the China-PAR), the XGBoost model had a significantly higher C-statistic of 0.792 (difference in C-statistics: 0.011, 0.006-0.017, P < 0.001), with similar results reported for LASSO regression (difference in C-statistics: 0.008, 0.005-0.011, P < 0.001). The XGBoost model demonstrated the best calibration performance (men: χ² = 0.598, P = 0.75; women: χ² = 1.867, P = 0.08). Moreover, the risk distribution of the ML algorithms differed from that of the conventional model. The net reclassification improvement rates of XGBoost and LASSO over the Cox model were 3.9% (1.4-6.4%) and 2.8% (0.7-4.9%), respectively.
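As an illustration of how a net reclassification improvement of this kind can be computed, the sketch below implements the standard two-category NRI for a binary outcome. It is not the authors' code: the function name `categorical_nri`, the toy data, and the 5% high-risk threshold are all assumptions made for the example, and the sketch ignores censoring, which an analysis of 5-year follow-up data would normally need to handle.

```python
from typing import Sequence


def categorical_nri(
    events: Sequence[int],
    risk_old: Sequence[float],
    risk_new: Sequence[float],
    threshold: float,
) -> float:
    """Two-category net reclassification improvement (NRI).

    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)],
    where "up" means moving from below the threshold to at or above it.
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = n_nonevent = 0
    for y, p_old, p_new in zip(events, risk_old, risk_new):
        high_old = p_old >= threshold
        high_new = p_new >= threshold
        moved_up = high_new and not high_old
        moved_down = high_old and not high_new
        if y == 1:
            n_event += 1
            up_event += moved_up
            down_event += moved_down
        else:
            n_nonevent += 1
            up_nonevent += moved_up
            down_nonevent += moved_down
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one event and one non-event")
    event_part = (up_event - down_event) / n_event
    nonevent_part = (down_nonevent - up_nonevent) / n_nonevent
    return event_part + nonevent_part


# Toy data (hypothetical): 1 = ASCVD event, 0 = no event.
events = [1, 1, 0, 0, 0]
risk_old = [0.03, 0.08, 0.06, 0.02, 0.04]  # e.g. a conventional model
risk_new = [0.07, 0.09, 0.03, 0.02, 0.06]  # e.g. an ML model
nri = categorical_nri(events, risk_old, risk_new, threshold=0.05)
print(f"NRI = {nri:.3f}")
```

In the toy data, one of the two events moves up into the high-risk category, while one non-event moves down and another moves up, so the event component is 0.5 and the non-event component is 0. A positive NRI means the new model shifts people towards the category that matches their outcome more often than away from it.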

CONCLUSION

Machine learning algorithms with irregular, repeated real-world data could improve cardiovascular risk prediction. They demonstrated significantly better performance for reclassification to identify the high-risk population correctly.


Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4bf8/10802828/a6fc406f58d6/ztad058_ga1.jpg
