Du Chenlin, Zhang Zeyu, Liu Baoqin, Cao Zijian, Jiang Nan, Zhang Zongjiu
School of Biomedical Engineering, Tsinghua University, Beijing, China.
Tsinghua Medicine, Tsinghua University, Beijing, China.
Health Care Sci. 2024 Dec 10;3(6):426-437. doi: 10.1002/hcs2.120. eCollection 2024 Dec.
Frailty in older adults is associated with a higher risk of adverse health outcomes and a lower quality of life. Pre-frailty, the state that precedes frailty, is amenable to intervention, but its determinants are hard to identify and it is difficult to assess. This study aimed to develop and validate an explainable machine learning model for assessing pre-frailty risk among community-dwelling older adults.
The study included 3141 adults aged 60 or above from the China Health and Retirement Longitudinal Study. Pre-frailty was defined as meeting one or two criteria of the physical frailty phenotype scale. We extracted 80 distinct features across seven dimensions to evaluate pre-frailty risk. The model was built with recursive feature elimination and a stacking-CatBoost distillation module on 80% of the sample and validated on a separate 20% holdout data set.
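The outcome definition above can be illustrated with a minimal sketch. It assumes the scale has five yes/no criteria, that meeting none counts as robust, and that meeting three or more counts as frail; only the "one or two criteria" rule for pre-frailty comes from the text. The names `FrailtyStatus` and `classify_frailty` are hypothetical.

```python
from enum import Enum
from typing import Sequence


class FrailtyStatus(Enum):
    ROBUST = "robust"
    PRE_FRAIL = "pre-frail"
    FRAIL = "frail"


def classify_frailty(criteria_met: Sequence[bool]) -> FrailtyStatus:
    """Map phenotype criteria (True = criterion met) to a frailty status.

    Assumption: five criteria; 0 met = robust, 3 or more met = frail.
    The 1-2 criteria rule for pre-frailty is the one stated in the abstract.
    """
    if len(criteria_met) != 5:
        raise ValueError("expected exactly five phenotype criteria")
    count = sum(1 for met in criteria_met if met)
    if count == 0:
        return FrailtyStatus.ROBUST
    if count <= 2:
        return FrailtyStatus.PRE_FRAIL
    return FrailtyStatus.FRAIL
```

For example, `classify_frailty([True, False, False, True, False])` returns `FrailtyStatus.PRE_FRAIL`, which would be the positive class for a pre-frailty risk model.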
Data from 2508 community-dwelling older adults (mean age, 67.24 years [range, 60-96]; 1215 [48.44%] females) were used to develop the pre-frailty risk assessment model. We selected 57 predictive features and built a distilled CatBoost model, which achieved the highest discrimination (AUROC: 0.7560 [95% CI: 0.7169, 0.7928]) on the 20% holdout data set. City of residence, BMI, and peak expiratory flow (PEF) were the three largest contributors to pre-frailty risk. Physical and environmental factors were the two most influential feature dimensions.
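As a rough illustration of how a discrimination estimate such as an AUROC with a 95% CI can be obtained on a holdout set, the sketch below computes the rank-based AUROC and a percentile bootstrap interval. The abstract does not say how the interval was derived, so the bootstrap is an assumption, and the function names `auroc` and `bootstrap_auroc_ci` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_resamples: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for AUROC (assumed method, not taken from the paper)."""
    auroc(labels, scores)  # fails early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that lost one of the classes
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper
```

With `labels = [0, 0, 1, 1]` and `scores = [0.1, 0.4, 0.35, 0.8]`, `auroc` returns 0.75, because three of the four positive-negative pairs are ranked correctly. The pairwise loop is quadratic in sample size, which is acceptable for a holdout set of a few hundred participants.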
We developed an accurate and interpretable pre-frailty risk assessment framework using state-of-the-art machine learning and explanation methods. The framework incorporates a wide range of features and determinants, supporting a comprehensive and nuanced understanding of pre-frailty risk.