IRCCS Fondazione Don Carlo Gnocchi onlus, Firenze, Italy.
Department of Experimental and Clinical Medicine, University of Florence, Firenze, Italy.
Sci Rep. 2024 Oct 24;14(1):25188. doi: 10.1038/s41598-024-74537-8.
Good data quality is vital for personalising rehabilitation plans. Machine learning (ML) improves prognostics, but integrating it with Multiple Imputation (MImp) to deal with missing data is a largely unexplored field. This work aims to provide a post-stroke ambulation prognosis by integrating MImp with ML, and to identify the influential prognostic factors. Stroke survivors in intensive rehabilitation were enrolled. Data on demographics, events, clinical, physiotherapy, and psycho-social assessments were collected. The outcome was independent ambulation at discharge, assessed with the Functional Ambulation Category scale. After missingness was handled using MImp, ML models were optimised, cross-validated, and tested. Interpretability techniques were used to analyse predictor contributions. Pre-MImp, the dataset included 54.1% women and 79.2% ischaemic patients, with a median age of 80.0 (interquartile range: 15.0). Post-MImp, 368 non-ambulatory patients on 10 imputed datasets were used for training and 80 for testing. The random forest (the best-performing algorithm in validation) obtained 75.5% aggregated balanced accuracy on the test set. The main predictors included the modified Barthel index, Fugl-Meyer assessment/motricity index, short physical performance battery, age, Charlson comorbidity index/cumulative illness rating scale, and trunk control test. This is among the first studies applying ML together with MImp to predict ambulation recovery in post-stroke rehabilitation. This pipeline reliably exploits the potential of incomplete datasets for healthcare prognosis, identifying relevant predictors.
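The abstract reports an aggregated balanced accuracy over 10 imputed datasets but does not say how the aggregation was done. The following minimal sketch shows one plausible reading, assuming that balanced accuracy (the mean of sensitivity and specificity) is computed separately on each imputed test set and then averaged. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from statistics import mean


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Labels are assumed to be 1 (independent ambulation) or 0 (no
    independent ambulation); this coding is an illustrative assumption.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    return 0.5 * (tp / pos + tn / neg)


def aggregate_over_imputations(y_true, preds_per_imputation):
    """Average the per-imputation balanced accuracies.

    ASSUMPTION: simple averaging across imputed datasets; the paper may
    have pooled results differently (for example by majority vote).
    """
    return mean(balanced_accuracy(y_true, preds) for preds in preds_per_imputation)


# Toy data: 8 test patients, predictions from 3 imputed datasets.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
preds_per_imputation = [
    [1, 1, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 0],
]

aggregated = aggregate_over_imputations(y_true, preds_per_imputation)
print(f"aggregated balanced accuracy: {aggregated:.3f}")
```

Balanced accuracy is a natural choice when the two outcome classes are unequal in size, because a model that always predicts the majority class scores only 0.5.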