Sarcopenia prediction model based on machine learning and SHAP values for community-based older adults with cardiovascular disease in China.

Author information

Yu Peil, Zhang Xinxin, Sun Guoxuan, Zeng Ping, Zheng Chu, Wang Ke

Affiliations

Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China.

Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, Jiangsu, China.

Publication information

Front Public Health. 2025 May 21;13:1527304. doi: 10.3389/fpubh.2025.1527304. eCollection 2025.

Abstract

BACKGROUND

Sarcopenia (SP) is recognized as a complication of cardiovascular disease (CVD), but few relevant diagnostic models have been developed. This study aims to establish an interpretable diagnostic model for the occurrence of SP in community-dwelling (CD) older adults with CVD in China.

METHODS

We randomly selected participants with CVD recruited from the China Health and Retirement Longitudinal Study (CHARLS) from 2011 to 2015 and divided them into a training set and a test set. In the training set, we processed and screened the predictor variables and addressed the class imbalance with the synthetic minority oversampling technique (SMOTE). We then built four machine learning (ML) models to predict SP. After 100 iterations, we compared the models' discrimination and calibration and selected the best-performing model for risk stratification. Next, we analyzed the relationship between ML risk and SP using scatterplots and logistic regression (LR). Finally, Shapley additive explanations (SHAP) plots were used to illustrate how each feature level affects the predicted probability of SP.
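The oversampling step can be sketched as follows. This is a minimal, standard-library illustration of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), applied to the training set only, as the Methods describe. It is not the authors' code: the function names `smote` and `balance_training_set`, the choice of `k`, the 1:1 target ratio, and the toy feature values are all assumptions made for illustration.

```python
import math
import random

def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (Euclidean distance)."""
    rng = random.Random(seed)
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: math.dist(base, m))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

def balance_training_set(X_train, y_train, k=5, seed=0):
    """Oversample class 1 in the training data until both classes are equal.
    The 1:1 target is an assumption; the abstract does not state the ratio."""
    minority = [x for x, y in zip(X_train, y_train) if y == 1]
    n_major = sum(1 for y in y_train if y == 0)
    n_new = max(0, n_major - len(minority))
    new_rows = smote(minority, n_new, k=k, seed=seed)
    return X_train + new_rows, y_train + [1] * len(new_rows)

# Toy training data (hypothetical values, not from the study):
# 8 participants without SP and 3 with SP, two features each.
X_train = [[22.0, 160.0], [24.5, 165.0], [26.0, 170.0], [23.1, 158.0],
           [25.2, 172.0], [27.3, 168.0], [24.0, 162.0], [26.8, 175.0],
           [18.5, 150.0], [19.0, 152.0], [17.8, 149.0]]
y_train = [0] * 8 + [1] * 3

X_bal, y_bal = balance_training_set(X_train, y_train, k=2)
print(sum(y_bal), len(y_bal) - sum(y_bal))  # class counts after balancing
```

Keeping the test set out of this step matters: synthetic rows in the test data would make discrimination and calibration estimates look better than they are on real participants.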

RESULTS

We ultimately included 1,088 CD older adults, 18.61% of whom reported SP. The optimal model, XGBoost, was selected for prediction and risk stratification. In both univariate (odds ratio [OR]: 12.45, P = 4.74 × 10⁻⁵) and multivariate analyses (OR: 6.98, P = 3.96 × 10⁻⁴), participants with higher ML scores had a higher risk of SP. In sex-specific subanalyses, body mass index (BMI), height, age, diastolic blood pressure (DBP), and high-density lipoprotein (HDL), among others, were all significant predictors.
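For readers less familiar with odds ratios, the sketch below shows how an OR and its Wald 95% confidence interval are computed from a 2×2 table, and how an OR relates to a logistic regression coefficient. The abstract does not state how the ML score entered the regression, so the high/low-risk split, the function name `odds_ratio_2x2`, and the counts are assumptions for illustration only; they are not the study's data.

```python
import math

def odds_ratio_2x2(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table.
    a: high-risk with SP,  b: high-risk without SP,
    c: low-risk with SP,   d: low-risk without SP."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for the Wald interval")
    or_ = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se_log_or)
    hi = math.exp(math.log(or_) + z * se_log_or)
    return or_, lo, hi

# Hypothetical counts for illustration only; not taken from the study.
or_, lo, hi = odds_ratio_2x2(a=30, b=20, c=10, d=40)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")

# In logistic regression, the OR for a predictor is exp(coefficient),
# so the reported univariate OR of 12.45 corresponds to a log-odds
# coefficient of ln(12.45).
beta = math.log(12.45)
print(f"log-odds coefficient for OR 12.45: {beta:.3f}")
```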

CONCLUSION

This study developed a novel, clinically integrated tool for predicting SP in older adults with CVD, providing a basis for personalized therapeutic measures.

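Abstract (Chinese version, translated)

BACKGROUND

Sarcopenia (SP) is considered a complication of cardiovascular disease (CVD), but few related diagnostic models exist. This study aimed to build an interpretable diagnostic model for the occurrence of SP in older CVD patients living in Chinese communities.

METHODS

We randomly selected participants with CVD from the China Health and Retirement Longitudinal Study (CHARLS), 2011 to 2015, and divided them into a training set and a test set. In the training set, we processed and screened the predictor variables and addressed the data imbalance with the synthetic minority oversampling technique (SMOTE). We then built four machine learning (ML) models to predict SP. After 100 iterations, we selected the best-performing model for risk stratification by comparing model discrimination and calibration. Next, we used scatterplots and logistic regression (LR) to analyze the relationship between ML risk and SP. Finally, Shapley additive explanations (SHAP) plots illustrated how each feature level affects the predicted probability of SP.

RESULTS

We ultimately included 1,088 community-dwelling older adults, 18.61% of whom reported SP. The optimal model, XGBoost, was selected for prediction and risk stratification. In both univariate analysis (odds ratio [OR]: 12.45, P = 4.74 × 10⁻⁵) and multivariate analysis (OR: 6.98, P = 3.96 × 10⁻⁴), participants with higher ML scores had a higher risk of SP. In sex-specific subgroup analyses, body mass index, height, age, diastolic blood pressure, high-density lipoprotein, and other variables were all significant predictors.

CONCLUSION

This study developed a novel, clinically integrated tool for predicting SP in the older population with CVD, providing a basis for personalized therapeutic measures.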

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1baa/12133505/f35ae4e98791/fpubh-13-1527304-g001.jpg
