Istituti Clinici Scientifici Maugeri IRCCS, Pavia, Italy.
Department of Advanced Biomedical Sciences, University Hospital of Naples "Federico II", Naples, Italy.
Sci Rep. 2020 Nov 18;10(1):20127. doi: 10.1038/s41598-020-77243-3.
Stroke is among the leading causes of death and disability worldwide. Approximately 20-25% of stroke survivors present severe disability, which is associated with increased mortality risk. Prognostication is inherent in the process of clinical decision-making. Machine learning (ML) methods have gained increasing popularity in biomedical research. The aim of this study was twofold: to assess the performance of tree-based ML algorithms for predicting three-year mortality in 1207 stroke patients with severe disability who completed rehabilitation, and to compare the performance of the ML algorithms with that of a standard logistic regression model. The logistic regression model achieved an area under the receiver operating characteristic curve (AUC) of 0.745 and was well calibrated. At the optimal risk threshold, the model had an accuracy of 75.7%, a positive predictive value (PPV) of 33.9%, and a negative predictive value (NPV) of 91.0%. A Random Forests model trained with the synthetic minority oversampling technique (SMOTE) outperformed the logistic regression model, achieving an AUC of 0.928, an accuracy of 86.3%, a PPV of 84.6%, and an NPV of 87.5%. This study represents a step forward in the creation of standardisable tools for predicting health outcomes in individuals affected by stroke.
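The abstract reports accuracy, PPV, and NPV at a risk threshold and names synthetic minority oversampling as part of the ML pipeline. As a rough illustration only, the sketch below shows how those three metrics follow from confusion-matrix counts and what the core interpolation step of a SMOTE-style oversampler looks like. It is not the authors' code: the function names `classification_metrics` and `smote_oversample` are hypothetical, the counts and feature values are invented, and the oversampler is a simplified brute-force version of the technique, not a library implementation.

```python
import math
import random


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, PPV and NPV from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "ppv": tp / (tp + fp),  # share of predicted deaths that occurred
        "npv": tn / (tn + fn),  # share of predicted survivors who survived
    }


def smote_oversample(minority, n_new, k=3, seed=0):
    """Simplified SMOTE-style oversampling (illustrative assumption).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Brute-force neighbour search over the other minority samples.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Invented counts, not taken from the study.
print(classification_metrics(tp=30, fp=10, fn=20, tn=140))

# Invented two-feature minority-class samples.
minority_samples = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
print(smote_oversample(minority_samples, n_new=3))
```

In a pipeline of this kind, oversampling would normally be applied to the training split only, so that synthetic points do not leak into the data used to estimate AUC, PPV, and NPV. The abstract does not state how this was handled.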