Machine learning models for predicting multimorbidity trajectories in middle-aged and elderly adults.

Authors

Yao Li, Li Qiaoxing, Zhou Zihan, Yin Jiajia, Wang Tingrui, Liu Yan, Li Qinqin, Xiao Lu, Yang Dongliang

Affiliations

School of Management and Collaborative Innovation Laboratory of Digital Transformation and Governance, Guizhou University, Guiyang, 550025, Guizhou, China.

Department of Respiratory and Critical Care Medicine, The Affiliated Hospital of Guizhou Medical University, Guiyang, 550004, Guizhou, China.

Publication information

Sci Rep. 2025 Jul 9;15(1):24711. doi: 10.1038/s41598-025-07060-z.

DOI:10.1038/s41598-025-07060-z

PMID:40634428

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12241554/

Abstract

Multimorbidity has emerged as a significant public health issue in the context of global population aging. Predicting and managing the progression of multimorbidity in the elderly population is crucial. This study aims to develop predictive models for multimorbidity trajectories in middle-aged and elderly populations and to identify the key factors influencing the progression of multimorbidity. First, a time-series clustering method was used to construct the multimorbidity trajectories. Then, predictive models based on machine learning techniques were developed to forecast the progression of different trajectories and identify key risk factors. This study utilized data from the China Health and Retirement Longitudinal Study (CHARLS) database, including 12,198 middle-aged and elderly individuals (aged 45 and above). Four distinct multimorbidity progression patterns were identified: Stable Low-Risk Group (45.26%), Progressively Worsening Group (14.35%), Moderate Stability Group (31.90%), and Consistently Deteriorating Group (8.49%). Among the predictive models, the XGBoost model achieved the best performance, with an accuracy of 0.664 (95% CI: 0.648-0.681), a macro ROC-AUC of 0.825 (95% CI: 0.816-0.834), a micro ROC-AUC of 0.884 (95% CI: 0.876-0.892), and a log loss of 0.806 (95% CI: 0.781-0.831). Other models, including Random Forest, Support Vector Machine, Logistic Regression, and Artificial Neural Networks, showed similar accuracy and ROC-AUC values. The study identified three key factors as critical predictors of multimorbidity trajectories: baseline disease counts, self-rated Activities of Daily Living (ADL), and self-rated health status.
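The abstract reports four metrics for a four-class problem without defining them. The sketch below shows one common way to compute them in plain Python. It is illustrative only: the one-vs-rest averaging scheme, the function names, and the six-row toy data are assumptions, not the authors' pipeline. Only the four group shares are taken from the abstract.

```python
import math

def binary_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_auc(y_true, proba, n_classes):
    """Unweighted mean of the one-vs-rest AUC of each class."""
    per_class = [
        binary_auc([1 if y == k else 0 for y in y_true], [row[k] for row in proba])
        for k in range(n_classes)
    ]
    return sum(per_class) / n_classes

def micro_auc(y_true, proba, n_classes):
    """One AUC over all (sample, class) pairs pooled together."""
    labels = [1 if y == k else 0 for y in y_true for k in range(n_classes)]
    scores = [row[k] for row in proba for k in range(n_classes)]
    return binary_auc(labels, scores)

def log_loss(y_true, proba):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(row[y]) for y, row in zip(y_true, proba)) / len(y_true)

def accuracy(y_true, proba):
    """Share of samples whose highest-probability class is the true class."""
    hits = sum(
        1 for y, row in zip(y_true, proba)
        if max(range(len(row)), key=row.__getitem__) == y
    )
    return hits / len(y_true)

# Hypothetical toy predictions for 4 trajectory groups (not study data).
y_true = [0, 0, 1, 2, 2, 3]
proba = [
    [0.7, 0.1, 0.1, 0.1],
    [0.5, 0.1, 0.3, 0.1],
    [0.3, 0.4, 0.2, 0.1],
    [0.2, 0.1, 0.6, 0.1],
    [0.5, 0.1, 0.3, 0.1],  # true class 2, predicted class 0
    [0.1, 0.2, 0.2, 0.5],
]

# Group shares reported in the abstract, used for naive reference points.
prevalence = [0.4526, 0.1435, 0.3190, 0.0849]
majority_accuracy = max(prevalence)                               # always predict the largest group
prevalence_log_loss = -sum(p * math.log(p) for p in prevalence)   # always predict the group shares
uniform_log_loss = math.log(len(prevalence))                      # always predict 0.25 per group

print("accuracy :", accuracy(y_true, proba))
print("macro AUC:", macro_auc(y_true, proba, 4))
print("micro AUC:", micro_auc(y_true, proba, 4))
print("log loss :", log_loss(y_true, proba))
print("baselines:", majority_accuracy, prevalence_log_loss, uniform_log_loss)
```

If the evaluation set had the same group mix as the full cohort (an assumption), a model that always predicted the largest group would reach an accuracy of about 0.45, and one that always output the group shares would have a log loss of about 1.21. These give some context for the reported accuracy of 0.664 and log loss of 0.806. Micro-averaging pools every sample-class pair, so it is weighted toward the larger groups, which may explain why the micro ROC-AUC (0.884) exceeds the macro ROC-AUC (0.825).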
