Data processing pipeline for cardiogenic shock prediction using machine learning.

Authors

Jajcay Nikola, Bezak Branislav, Segev Amitai, Matetzky Shlomi, Jankova Jana, Spartalis Michael, El Tahlawi Mohammad, Guerra Federico, Friebel Julian, Thevathasan Tharusan, Berta Imrich, Pölzl Leo, Nägele Felix, Pogran Edita, Cader F Aaysha, Jarakovic Milana, Gollmann-Tepeköylü Can, Kollarova Marta, Petrikova Katarina, Tica Otilia, Krychtiuk Konstantin A, Tavazzi Guido, Skurk Carsten, Huber Kurt, Böhm Allan

Affiliations

Premedix Academy, Bratislava, Slovakia.

Department of Complex Systems, Institute of Computer Science, Czech Academy of Sciences, Prague, Czech Republic.

Publication information

Front Cardiovasc Med. 2023 Mar 23;10:1132680. doi: 10.3389/fcvm.2023.1132680. eCollection 2023.

Abstract

INTRODUCTION

Recent advances in machine learning offer new ways to process and analyse observational patient data in order to predict patient outcomes. In this paper, we introduce a data processing pipeline for predicting cardiogenic shock (CS) in intensive cardiac care unit patients with acute coronary syndrome, using data from the MIMIC III database. Identifying high-risk patients could allow pre-emptive measures to be taken and thus help prevent the development of CS.

METHODS

We focus mainly on the imputation of missing data: we build an imputation pipeline and compare the performance of several multivariate imputation algorithms, including k-nearest neighbours, two singular value decomposition (SVD)-based methods, and Multiple Imputation by Chained Equations. After imputation, we select the final subjects and variables from the imputed dataset and demonstrate the performance of a gradient-boosted framework that uses a tree-based classifier for cardiogenic shock prediction.
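To make the idea of comparing imputation algorithms concrete, the following is a minimal sketch of one common evaluation scheme: hide some known values, impute them, and measure the error on the hidden cells. It is not the authors' pipeline. The synthetic data, the helper names (`knn_impute`, `mean_impute`, `rmse_on_masked`), and the choice of a column-mean baseline are all assumptions made for illustration, and only the standard library is used so that the example stays dependency-free.

```python
import math
import random


def knn_impute(rows, k=3):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is the root mean squared difference over the columns
    that are observed in both rows.
    """
    def distance(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which column j is observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


def mean_impute(rows):
    """Baseline: fill None entries with the observed column mean."""
    means = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]


def rmse_on_masked(complete, imputed, masked):
    """Root mean squared error restricted to the artificially hidden cells."""
    total = sum((complete[i][j] - imputed[i][j]) ** 2 for i, j in masked)
    return math.sqrt(total / len(masked))


# Synthetic, hypothetical data: two patient groups with correlated variables.
rng = random.Random(0)
complete = []
for _ in range(40):
    centre = rng.choice([0.0, 10.0])
    complete.append([centre + rng.gauss(0, 1) for _ in range(3)])

# Hide one known value in each of 15 randomly chosen rows.
masked = [(i, rng.randrange(3)) for i in rng.sample(range(40), 15)]
with_gaps = [list(r) for r in complete]
for i, j in masked:
    with_gaps[i][j] = None

knn_rmse = rmse_on_masked(complete, knn_impute(with_gaps), masked)
mean_rmse = rmse_on_masked(complete, mean_impute(with_gaps), masked)
print(f"kNN RMSE: {knn_rmse:.3f}  column-mean RMSE: {mean_rmse:.3f}")
```

The same masking scheme could be reused to score any other imputer, for example an SVD-based or chained-equations method from an established library, since the evaluation only needs the complete data, the imputed data, and the list of hidden cells.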

RESULTS

Thanks to data cleaning and imputation, we achieved good classification performance (cross-validated mean area under the curve of 0.805) without hyperparameter optimization.
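As a rough illustration of what a cross-validated mean area under the curve (AUC) measures, the sketch below computes the AUC on each held-out fold and averages the results. It rests on several assumptions: the data are synthetic, the folds are stratified, and a simple centroid-difference scorer (`fit_centroid_scorer`, a hypothetical stand-in) replaces the gradient-boosted tree classifier so that the example runs with the standard library alone. It does not reproduce the reported value of 0.805.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, n_folds, rng):
    """Split indices into folds that keep roughly the same class ratio."""
    folds = [[] for _ in range(n_folds)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % n_folds].append(i)
    return folds


def fit_centroid_scorer(X, y):
    """Hypothetical stand-in model: project onto (positive - negative) centroid."""
    dims = range(len(X[0]))

    def centroid(cls):
        rows = [x for x, label in zip(X, y) if label == cls]
        return [sum(r[d] for r in rows) / len(rows) for d in dims]

    direction = [p - n for p, n in zip(centroid(1), centroid(0))]
    return lambda x: sum(w * v for w, v in zip(direction, x))


def cross_validated_auc(X, y, n_folds=5, seed=0):
    """Train on all but one fold, score the held-out fold, average the AUCs."""
    rng = random.Random(seed)
    aucs = []
    for held_out in stratified_folds(y, n_folds, rng):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        score = fit_centroid_scorer([X[i] for i in train],
                                    [y[i] for i in train])
        aucs.append(roc_auc([y[i] for i in held_out],
                            [score(X[i]) for i in held_out]))
    return sum(aucs) / len(aucs), aucs


# Synthetic, hypothetical data: 30 positive and 70 negative cases.
data_rng = random.Random(1)
y = [1] * 30 + [0] * 70
X = [[data_rng.gauss(float(label), 1.0) for _ in range(2)] for label in y]

mean_auc, fold_aucs = cross_validated_auc(X, y)
print(f"mean AUC over {len(fold_aucs)} folds: {mean_auc:.3f}")
```

Averaging per-fold AUCs, rather than pooling all held-out predictions, is one of several reasonable conventions; the abstract does not state which one was used.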

CONCLUSION

We believe our pre-processing pipeline would also prove helpful for other classification and regression experiments.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/524b/10077147/9cee39c84dcb/fcvm-10-1132680-g001.jpg
