Kang Cheng-Wei, Yan Zhao-Kui, Tian Jia-Liang, Pu Xiao-Bing, Wu Li-Xue
Department of Orthopaedics, West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, Sichuan, 610041, China.
Department of Pathology, West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, Sichuan, 610041, China.
BMC Public Health. 2025 Jan 20;25(1):242. doi: 10.1186/s12889-025-21284-8.
This study aimed to identify the risk factors associated with falls in hospitalized patients, develop a predictive risk model using machine learning algorithms, and evaluate the validity of the model's predictions.
A cross-sectional design was employed using data from the DRYAD public database.
The study used data from the Fukushima Medical University Hospital Cohort Study, obtained from the DRYAD public database. Twenty percent of the dataset was held out as an independent test set, and the remaining 80% was used for training and validation. To address data imbalance in binary variables, the Synthetic Minority Oversampling Technique combined with Edited Nearest Neighbors (SMOTE-ENN) was applied. Univariate analysis and least absolute shrinkage and selection operator (LASSO) regression were used to analyze and screen variables. Predictive models were constructed from the key clinical features retained, and eight machine learning algorithms were compared to identify the most effective model. Shapley Additive Explanations (SHAP) were then used to interpret the predictive models and rank the importance of risk factors.
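The first two steps above, the 80/20 hold-out and the oversampling of the minority class, can be sketched in plain Python. This is a minimal illustration and not the authors' code: the toy data, the function names `stratified_split` and `smote`, and the parameters `k` and `seed` are all assumptions made for the example. It implements only the SMOTE interpolation step; the Edited Nearest Neighbors cleaning step is omitted, and the study's actual predictors are categorical rather than continuous. Resampling only the training portion is also an assumption, since the abstract does not state where SMOTE-ENN was applied.

```python
import math
import random

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (the SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Hypothetical toy data: 90 non-fallers (0) and 10 fallers (1), two features.
data_rng = random.Random(42)
X = ([[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(90)]
     + [[data_rng.gauss(2, 1), data_rng.gauss(2, 1)] for _ in range(10)])
y = [0] * 90 + [1] * 10

train_idx, test_idx = stratified_split(y, test_fraction=0.2)
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]

# Oversample the training portion only (an assumption of this sketch),
# so the held-out test set keeps its original class ratio.
minority = [x for x, label in zip(X_train, y_train) if label == 1]
n_needed = y_train.count(0) - y_train.count(1)
X_balanced = X_train + smote(minority, n_needed)
y_balanced = y_train + [1] * n_needed
```

In practice a maintained implementation such as the SMOTE-ENN resampler in the imbalanced-learn package would replace the hand-written `smote` function, which is shown here only to make the interpolation step concrete.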
The final model included the following variables: Adl_standing, Adl_evacuation, Age_group, Planned_surgery, Wheelchair, History_of_falls, Hypnotic_drugs, Psychotropic_drugs, and Remote_caring_system. Among the evaluated models, the Random Forest algorithm demonstrated superior performance, achieving an AUC of 0.814 (95% CI: 0.802-0.827) in the training set, 0.781 (95% CI: 0.740-0.821) in the validation set, and 0.795 (95% CI: 0.770-0.820) in the test set.
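To make the reported figures concrete, the sketch below computes an AUC and a 95% confidence interval on toy data. The abstract does not say how the intervals were obtained, so the percentile bootstrap used here is an assumption, as are the function names `auc` and `bootstrap_auc_ci`, the parameters `n_boot`, `alpha` and `seed`, and the simulated scores.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes to define an AUC
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Small check: three of the four positive-negative pairs are ranked correctly.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75

# Hypothetical predicted risks: fallers (1) tend to score higher.
sim = random.Random(1)
labels = [0] * 40 + [1] * 20
scores = ([sim.gauss(0.3, 0.15) for _ in range(40)]
          + [sim.gauss(0.5, 0.15) for _ in range(20)])
point = auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.3f} (95% CI: {ci_low:.3f}-{ci_high:.3f})")
```

The width of such an interval shrinks as the sample grows, which is consistent with the training-set interval reported above being narrower than the validation-set and test-set intervals.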
Machine learning algorithms, particularly Random Forest (test-set AUC 0.795), can predict fall risk among hospitalized patients. These findings may help strengthen fall prevention strategies in healthcare settings.