Mortality Prediction from Hospital-Acquired Infections in Trauma Patients Using an Unbalanced Dataset.

Author information

Karajizadeh Mehrdad, Nasiri Mahdi, Yadollahi Mahnaz, Zolfaghari Amir Hussain, Pakdam Ali

Affiliations

School of Management & Information Sciences, Shiraz University of Medical Sciences, Shiraz, Iran.

Trauma Research Center, Shahid Rajaee (Emtiaz) Trauma Hospital, Shiraz University of Medical Sciences, Shiraz, Iran.

Publication information

Healthc Inform Res. 2020 Oct;26(4):284-294. doi: 10.4258/hir.2020.26.4.284. Epub 2020 Oct 31.

Abstract

OBJECTIVES

Machine learning has been widely used to predict diseases and to derive knowledge in the healthcare domain. Our objective was to predict in-hospital mortality from hospital-acquired infections in trauma patients using an unbalanced dataset.

METHODS

Our study was a cross-sectional analysis of trauma patients with hospital-acquired infections who were admitted to Shiraz Trauma Hospital from March 20, 2017, to March 21, 2018. The study data were obtained from the hospital infection surveillance database. The data included sex, age, mechanism of injury, body region injured, severity score, type of intervention, infection day after admission, and the microorganisms causing the infections. We developed our mortality prediction models for trauma patients with hospital-acquired infections using random under-sampling, random over-sampling, clustering (k-means)-C5.0, SMOTE-C5.0, ADASYN-C5.0, SMOTE-SVM, ADASYN-SVM, SMOTE-ANN, and ADASYN-ANN. All mortality predictions were conducted with IBM SPSS Modeler 18.
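The resampling ideas named in the methods can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used IBM SPSS Modeler 18), and every function name here is hypothetical. It assumes numeric feature tuples, and the SMOTE-style function only shows the core interpolation step between a minority sample and one of its nearest minority neighbours, not a complete SMOTE or ADASYN.

```python
import math
import random


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows until every class has the minority-class count."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def random_over_sample(rows, labels, seed=0):
    """Randomly duplicate rows until every class has the majority-class count."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (Euclidean distance).
    Assumes at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic
```

Under-sampling discards majority-class rows, over-sampling repeats minority-class rows, and the interpolation step creates new minority-class points instead of exact copies. In practice the resampling would normally be applied to the training split only, so that the test data keep the original class distribution.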

RESULTS

We studied 549 individuals with hospital-acquired infections in a trauma hospital in Shiraz during 2017 and 2018. Prediction accuracy before balancing of the dataset was 86.16%. In contrast, the prediction accuracy for the balanced dataset achieved by random under-sampling, random over-sampling, clustering (k-means)-C5.0, SMOTE-C5.0, ADASYN-C5.0, and SMOTE-SVM was 70.69%, 94.74%, 93.02%, 93.66%, 90.93%, and 100%, respectively.
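Accuracy on an unbalanced dataset can be hard to interpret on its own, which is one motivation for balancing. The sketch below uses made-up labels (100 hypothetical patients, 14 deaths); these counts are assumptions for illustration and are not taken from the study. It shows that a model predicting "survived" for everyone can score high accuracy while detecting no deaths.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positive-class cases that were predicted positive."""
    positives = [(t, p) for t, p in zip(y_true, y_pred) if t == positive]
    if not positives:
        return 0.0
    hits = sum(1 for t, p in positives if p == positive)
    return hits / len(positives)


# Hypothetical labels: 1 = died, 0 = survived (not the study's data).
y_true = [1] * 14 + [0] * 86
# A trivial model that always predicts the majority class.
y_pred = [0] * 100

print(accuracy(y_true, y_pred))  # 0.86
print(recall(y_true, y_pred))    # 0.0
```

For this reason, class-specific measures such as recall for the minority class are often reported alongside accuracy when classes are unbalanced.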

CONCLUSIONS

Our findings demonstrate that cleaning an unbalanced dataset increases the accuracy of the classification model. Also, mortality prediction with a clustered under-sampling approach was more precise than with the random under-sampling and random over-sampling methods.
