The State Key Laboratory of Molecular Vaccine and Molecular Diagnostics, School of Public Health, Xiamen University, Xiamen, 361102, Fujian, China.
National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361102, Fujian, China.
BMC Geriatr. 2022 Jul 28;22(1):627. doi: 10.1186/s12877-022-03295-x.
To explore heterogeneous disability trajectories and to construct explainable machine learning models that effectively predict long-term disability trajectories and clarify the mechanisms behind those predictions among elderly Chinese at the community level.
This retrospective study used data from the Chinese Longitudinal Healthy Longevity and Happy Family Study collected between 2002 and 2018. A total of 4149 subjects aged 65+ in 2002 with complete activities of daily living (ADL) information for at least three waves were included. A mixed growth model was used to identify disability trajectories, and five machine learning models were then built to predict those trajectories from epidemiological variables. An explainable approach was used to interpret the models' decisions.
Three distinct disability trajectories were identified for three-class prediction: a normal class (77.3%), a progressive class (15.5%), and a high-onset class (7.2%). The latter two were then merged into an abnormal class and contrasted with the normal class for two-class prediction. Machine learning models, especially random forest and extreme gradient boosting, achieved good performance in both tasks. ADL, age, leisure activity, cognitive function, and blood pressure were key predictors.
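The merging of trajectory classes for the two-class task can be illustrated with a minimal sketch. It uses only the class shares reported above; the names `TO_BINARY`, `merge_labels`, and `merged_shares` are hypothetical and are not taken from the study's code.

```python
# Trajectory class shares reported in the abstract (percent of subjects).
THREE_CLASS_SHARE = {"normal": 77.3, "progressive": 15.5, "high-onset": 7.2}

# Assumed mapping for the two-class task: both non-normal
# trajectories are relabeled as "abnormal".
TO_BINARY = {
    "normal": "normal",
    "progressive": "abnormal",
    "high-onset": "abnormal",
}


def merge_labels(labels):
    """Map three-class trajectory labels to the two-class scheme."""
    return [TO_BINARY[label] for label in labels]


def merged_shares(shares):
    """Sum three-class percentages into two-class percentages."""
    merged = {}
    for label, pct in shares.items():
        target = TO_BINARY[label]
        merged[target] = merged.get(target, 0.0) + pct
    # Round to one decimal place to match the precision of the reported shares.
    return {name: round(pct, 1) for name, pct in merged.items()}


if __name__ == "__main__":
    print(merge_labels(["normal", "progressive", "high-onset"]))
    print(merged_shares(THREE_CLASS_SHARE))
```

Under this mapping the abnormal class is the sum of the progressive and high-onset shares, so the two-class task remains imbalanced toward the normal class.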
The findings suggest that machine learning performed well and may be of additional value in analyzing quality indicators for predicting disability trajectories, thereby providing a basis for personalizing intervention measures.