Explainable machine learning models predicting the risk of social isolation in older adults: a prospective cohort study.

Author information

Jiang Mingfei, Li Xiaoran

Affiliations

School of Public Health, Southeast University, Hunan Road, Nanjing, Jiangsu, 210009, China.

Department of Radiology, Nanjing Gaochun People's Hospital, No.53, Maoshan Road, Nanjing, 211300, China.

Publication information

BMC Public Health. 2025 May 30;25(1):1999. doi: 10.1186/s12889-025-23108-1.

Abstract

INTRODUCTION

This study aimed to develop a machine learning system to predict social isolation risk in older adults.

METHODS

We analyzed data on 6588 older adults in China from the China Health and Retirement Longitudinal Study, covering 2015 to 2018. We used the light gradient boosting machine (Lightgbm) algorithm to identify the most common predictors of social isolation among older adults. With these predictors, we trained and optimized 7 models to predict the risk of social isolation: Lightgbm, logistic regression, decision tree, support vector machine, random forest, gradient boosting decision tree (Gbdt), and Xgboost. The Shapley additive explanations (SHAP) method was then used to show how much each predictor contributed to the prediction. Statistical analysis was conducted from December 2023 to April 2024.
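The attribution idea behind SHAP can be illustrated on a toy model. The sketch below computes exact Shapley values for one individual by enumerating every feature coalition. It is an illustration only, not the authors' pipeline: the risk function `toy_risk`, its coefficients, the feature values, and the single-point `baseline` are all hypothetical, and the SHAP library uses faster tree-specific algorithms with a background dataset rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance by enumerating feature coalitions."""
    names = list(x)
    n = len(names)

    def value(coalition):
        # Features in the coalition take the instance's values;
        # all other features stay at the baseline values.
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_risk(f):
    # Hypothetical risk score, NOT the fitted Gbdt model from the study.
    score = 0.01 * f["age"] - 0.03 * f["child_visits"] - 0.005 * f["grip_strength"]
    # Hypothetical interaction: older adults with no child visits get extra risk.
    if f["age"] > 75 and f["child_visits"] == 0:
        score += 0.1
    return score


# Hypothetical individual and reference point (illustrative values only).
x = {"age": 80, "child_visits": 0, "grip_strength": 20}
baseline = {"age": 68, "child_visits": 4, "grip_strength": 30}

phi = shapley_values(toy_risk, x, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions add up to the difference between the individual's score and the baseline score, which is the additivity property that lets SHAP split one prediction into per-feature parts. In this toy case the interaction bonus is shared equally between `age` and `child_visits`.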

RESULTS

The Gbdt model performed best, with an accuracy of 0.7247, sensitivity of 0.9207, specificity of 0.6273, F1 score of 0.6894, and area under the curve (AUC) of 0.84. The SHAP analysis identified intergenerational financial support, child visits, age, left-hand grip strength, and loneliness as the most important features.
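The reported metrics are linked by simple arithmetic, which the sketch below works through as a back-of-the-envelope check. It assumes, although the abstract does not say so, that accuracy, sensitivity, specificity, and F1 were all computed on the same evaluation set with "socially isolated" as the positive class. The function names are illustrative.

```python
def implied_prevalence(accuracy, sensitivity, specificity):
    # accuracy = sensitivity * p + specificity * (1 - p), solved for p,
    # where p is the share of positive (isolated) cases in the evaluation set.
    return (accuracy - specificity) / (sensitivity - specificity)


def f1_from_rates(sensitivity, specificity, prevalence):
    # Express confusion-matrix cells as fractions of the evaluation set.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    return 2 * tp / (2 * tp + fp + fn)


# Values reported in the abstract for the Gbdt model.
accuracy, sensitivity, specificity = 0.7247, 0.9207, 0.6273

prevalence = implied_prevalence(accuracy, sensitivity, specificity)
f1 = f1_from_rates(sensitivity, specificity, prevalence)

print(f"implied prevalence: {prevalence:.3f}")
print(f"implied F1 score:   {f1:.3f}")
```

Under these assumptions, about one third of the evaluation sample would be positive cases, and the implied F1 score agrees with the reported 0.6894 to within rounding. The gap between high sensitivity and moderate specificity suggests the model flags most isolated individuals at the cost of a sizeable number of false positives.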

CONCLUSIONS

Combining Gbdt with SHAP gives a clear account of which factors drive each individual's predicted risk of social isolation, along with an intuitive view of how the key features affect that prediction.
