Data processing pipeline for cardiogenic shock prediction using machine learning.

Authors

Jajcay Nikola, Bezak Branislav, Segev Amitai, Matetzky Shlomi, Jankova Jana, Spartalis Michael, El Tahlawi Mohammad, Guerra Federico, Friebel Julian, Thevathasan Tharusan, Berta Imrich, Pölzl Leo, Nägele Felix, Pogran Edita, Cader F Aaysha, Jarakovic Milana, Gollmann-Tepeköylü Can, Kollarova Marta, Petrikova Katarina, Tica Otilia, Krychtiuk Konstantin A, Tavazzi Guido, Skurk Carsten, Huber Kurt, Böhm Allan

Affiliations

Premedix Academy, Bratislava, Slovakia.

Department of Complex Systems, Institute of Computer Science, Czech Academy of Sciences, Prague, Czech Republic.

Publication information

Front Cardiovasc Med. 2023 Mar 23;10:1132680. doi: 10.3389/fcvm.2023.1132680. eCollection 2023.

DOI:10.3389/fcvm.2023.1132680

PMID:37034352

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10077147/

Abstract

INTRODUCTION

Recent advances in machine learning provide new possibilities to process and analyse observational patient data to predict patient outcomes. In this paper, we introduce a data processing pipeline for cardiogenic shock (CS) prediction from the MIMIC III database of intensive cardiac care unit patients with acute coronary syndrome. The ability to identify high-risk patients could possibly allow taking pre-emptive measures and thus prevent the development of CS.

METHODS

We mainly focus on techniques for the imputation of missing data by generating a pipeline for imputation and comparing the performance of various multivariate imputation algorithms, including k-nearest neighbours, two singular value decomposition (SVD)-based methods, and Multiple Imputation by Chained Equations. After imputation, we select the final subjects and variables from the imputed dataset and showcase the performance of the gradient-boosted framework that uses a tree-based classifier for cardiogenic shock prediction.
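The idea of comparing imputation algorithms can be illustrated with a small masking experiment: hide some known values, impute them, and measure the error on the hidden cells. The sketch below is not the authors' pipeline. It uses only the Python standard library, synthetic toy data, and two imputers (column mean and a simple k-nearest-neighbours variant); the SVD-based methods and Multiple Imputation by Chained Equations are not shown. All function names, the toy data, and the 10% masking rate are assumptions made for illustration.

```python
import math
import random

def mean_impute(rows):
    """Replace each missing value (None) with the mean of its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def knn_impute(rows, k=5):
    """Fill each missing value with the mean of that column over the k nearest
    donor rows. Distance is Euclidean over the columns both rows observed,
    normalised by the number of shared columns."""
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # nothing in common: treat as infinitely far
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows in which column j was observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for _, v in donors[:k]]
            filled[i][j] = sum(nearest) / len(nearest)
    return filled

def rmse_on_masked(truth, imputed, masked):
    """Root-mean-square error restricted to the artificially hidden cells."""
    errors = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in masked]
    return math.sqrt(sum(errors) / len(errors))

# Hypothetical toy data: three noisy measurements driven by one latent value.
rng = random.Random(0)
truth = []
for _ in range(200):
    x = rng.uniform(0, 10)
    truth.append([x, 2 * x + rng.gauss(0, 0.5), 10 - x + rng.gauss(0, 0.5)])

# Hide roughly 10% of the cells at random (assumed masking rate).
masked = [(i, j) for i in range(len(truth)) for j in range(3) if rng.random() < 0.1]
masked_set = set(masked)
observed = [
    [None if (i, j) in masked_set else v for j, v in enumerate(row)]
    for i, row in enumerate(truth)
]

results = {
    "mean": rmse_on_masked(truth, mean_impute(observed), masked),
    "knn": rmse_on_masked(truth, knn_impute(observed, k=5), masked),
}
for name, rmse in results.items():
    print(f"{name}: RMSE on masked cells = {rmse:.3f}")
```

On real clinical data the ranking of imputers depends on the missingness pattern, so a comparison of this kind would need to be repeated on the dataset of interest.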

RESULTS

We achieved good classification performance thanks to data cleaning and imputation (cross-validated mean area under the curve 0.805) without hyperparameter optimization.
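For readers unfamiliar with the metric, the following sketch shows one way a cross-validated mean area under the ROC curve can be computed. It is an illustration under stated assumptions, not the authors' code: the data are synthetic, the fold count of 5 is assumed, and a simple class-centroid scorer (`fit_centroid_scorer`, a hypothetical helper) stands in for the gradient-boosted tree classifier so that the example needs only the Python standard library.

```python
import random

def roc_auc(labels, scores):
    """AUC as the probability that a random positive case receives a higher
    score than a random negative case (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k, seed=0):
    """Split indices into k folds, keeping the class ratio similar in each."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    return folds

def fit_centroid_scorer(X, y):
    """Hypothetical stand-in model (NOT gradient boosting): score a sample by
    projecting it onto the difference between the two class means."""
    n_features = len(X[0])

    def centroid(cls):
        rows = [x for x, label in zip(X, y) if label == cls]
        return [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]

    c0, c1 = centroid(0), centroid(1)
    direction = [b - a for a, b in zip(c0, c1)]
    return lambda x: sum(d * v for d, v in zip(direction, x))

def cross_validated_auc(X, y, fit, k=5):
    """Fit on k-1 folds, score the held-out fold, and average the fold AUCs."""
    aucs = []
    for held_out in stratified_folds(y, k):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        score = fit([X[i] for i in train], [y[i] for i in train])
        aucs.append(
            roc_auc([y[i] for i in held_out], [score(X[i]) for i in held_out])
        )
    return sum(aucs) / len(aucs), aucs

# Hypothetical imbalanced toy data: about 20% positives, two noisy features.
rng = random.Random(1)
y = [1 if rng.random() < 0.2 else 0 for _ in range(300)]
X = [[rng.gauss(1.0 * label, 1.0), rng.gauss(0.5 * label, 1.0)] for label in y]

mean_auc, fold_aucs = cross_validated_auc(X, y, fit_centroid_scorer, k=5)
print(f"mean AUC over {len(fold_aucs)} folds: {mean_auc:.3f}")
```

The model is always fitted on the training folds only, so each fold's AUC reflects performance on data the model has not seen; the reported figure is the average over folds.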

CONCLUSION

We believe our pre-processing pipeline would also prove helpful for other classification and regression experiments.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/524b/10077147/9cee39c84dcb/fcvm-10-1132680-g001.jpg
