Prediction of Long-Term Stroke Recurrence Using Machine Learning Models.

Authors

Abedi Vida, Avula Venkatesh, Chaudhary Durgesh, Shahjouei Shima, Khan Ayesha, Griessenauer Christoph J, Li Jiang, Zand Ramin

Affiliations

Department of Molecular and Functional Genomics, Geisinger Health System, Danville, PA 17822, USA.

Biocomplexity Institute, Virginia Tech, Blacksburg, VA 24061, USA.

Publication

J Clin Med. 2021 Mar 20;10(6):1286. doi: 10.3390/jcm10061286.

Abstract

BACKGROUND

The long-term risk of recurrent ischemic stroke, estimated at 17% to 30%, cannot be reliably assessed at the individual level. Our goal was to study whether machine learning models can be trained to predict stroke recurrence, to identify key clinical variables, and to assess whether performance metrics can be optimized.

METHODS

We used patient-level data from electronic health records, six interpretable algorithms (Logistic Regression, Extreme Gradient Boosting, Gradient Boosting Machine, Random Forest, Support Vector Machine, Decision Tree), four feature selection strategies, five prediction windows, and two sampling strategies to develop 288 models for predicting stroke recurrence up to 5 years after the index event. We further identified important clinical features and explored different optimization strategies.
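The design described above can be pictured as a grid of model configurations. The following is a minimal sketch, not the authors' pipeline: the abstract gives only the counts of feature selection and sampling strategies, so the labels `fs_1` to `fs_4` and `sampling_1`, `sampling_2` are hypothetical placeholders. A plain cross of the four listed dimensions yields 240 configurations, so the 288 models reported presumably include variants that the abstract does not spell out.

```python
from itertools import product

# Algorithm names are taken from the abstract.
ALGORITHMS = [
    "Logistic Regression",
    "Extreme Gradient Boosting",
    "Gradient Boosting Machine",
    "Random Forest",
    "Support Vector Machine",
    "Decision Tree",
]
# Hypothetical placeholder labels: the abstract states counts, not names.
FEATURE_SELECTION = ["fs_1", "fs_2", "fs_3", "fs_4"]
SAMPLING = ["sampling_1", "sampling_2"]
# Prediction windows in years, as listed in the abstract.
PREDICTION_WINDOWS_YEARS = [1, 2, 3, 4, 5]


def build_grid():
    """Return one dict per combination of the four design dimensions."""
    return [
        {
            "algorithm": algorithm,
            "feature_selection": selection,
            "window_years": window,
            "sampling": sampling,
        }
        for algorithm, selection, window, sampling in product(
            ALGORITHMS, FEATURE_SELECTION, PREDICTION_WINDOWS_YEARS, SAMPLING
        )
    ]


grid = build_grid()
print(len(grid))  # 6 * 4 * 5 * 2 = 240
```

Each entry in `grid` would then be trained and scored separately, which is what makes per-window and per-algorithm comparisons possible.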

RESULTS

We included 2091 ischemic stroke patients. The area under the receiver operating characteristic curve (AUROC) was stable across prediction windows of 1, 2, 3, 4, and 5 years, with the highest score for the 1-year window (0.79) and the lowest for the 5-year window (0.69). A total of 21 (7%) models reached an AUROC above 0.73, while 110 (38%) models reached an AUROC above 0.7. Among the 53 features analyzed, age, body mass index, and laboratory-based features (such as high-density lipoprotein, hemoglobin A1c, and creatinine) had the highest overall importance scores. Sampling strategies improved the balance between specificity and sensitivity.
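To make the reported metrics concrete, here is a small sketch of how AUROC, sensitivity, and specificity are computed. It uses the rank-based definition of AUROC (the probability that a randomly chosen recurrence case scores higher than a randomly chosen non-recurrence case). The labels and scores are toy values invented for illustration and are not data from the study; the function names are likewise illustrative.

```python
def auroc(labels, scores):
    """AUROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Classify score >= threshold as positive and return (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, invented for illustration only (1 = recurrence, 0 = no recurrence).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]

area = auroc(labels, scores)
strict = sensitivity_specificity(labels, scores, threshold=0.5)
lenient = sensitivity_specificity(labels, scores, threshold=0.3)

# The percentages in the abstract follow from the 288-model total.
share_above_073 = round(100 * 21 / 288)
share_above_070 = round(100 * 110 / 288)
```

Lowering the threshold from 0.5 to 0.3 on the toy data raises sensitivity at the cost of specificity. AUROC does not depend on the threshold, which is why the abstract reports the sensitivity and specificity balance separately.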

CONCLUSION

All six selected algorithms could be trained to predict long-term stroke recurrence, and laboratory-based variables were highly associated with stroke recurrence. These variables could be targeted for personalized interventions. Model performance metrics could be optimized, and the models could be implemented within the same healthcare system as intelligent decision support for targeted intervention.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3262/8003970/8a831c0317b9/jcm-10-01286-g001.jpg