
Quantifying the impact of addressing data challenges in prediction of length of stay.

Affiliations

Center for Health Informatics and Technology, The Maersk Mc-Kinney Institute, University of Southern Denmark, Odense, Denmark.

Department of Mathematics and Computer Science (IMADA), University of Southern Denmark, Odense, Denmark.

Publication information

BMC Med Inform Decis Mak. 2021 Oct 30;21(1):298. doi: 10.1186/s12911-021-01660-1.

DOI:10.1186/s12911-021-01660-1
PMID:34749708
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8576901/
Abstract

BACKGROUND

Predicting length of stay (LOS) at admission time can give physicians and nurses insight into the illness severity of patients and help them avoid adverse events and clinical deterioration. It also helps hospitals manage their resources and staff more effectively.

METHODS

This field of research faces some important challenges, such as missing values and skewness in LOS data. Moreover, various studies use binary classification, which places a wide range of patients with different conditions into a single category. To address these shortcomings, multivariate imputation techniques are first applied to fill in incomplete records. Two resampling techniques, Borderline-SMOTE and SMOGN, are then applied to address data skewness in the classification and regression domains, respectively. Finally, machine learning (ML) techniques, including neural networks, extreme gradient boosting, random forest, support vector machine, and decision tree, are implemented for both approaches to predict the LOS of patients admitted to the Emergency Department of Odense University Hospital between June 2018 and April 2019. The ML models are developed from data obtained from patients at admission time, including pulse rate, arterial blood oxygen saturation, respiratory rate, systolic blood pressure, triage category, arrival ICD-10 codes, age, and gender.
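To make the resampling step more concrete, the following is a minimal pure-Python sketch of the Borderline-SMOTE idea for the classification setting. It is an illustration under stated assumptions, not the authors' implementation: the function names `k_nearest` and `borderline_smote`, the toy one-feature data, and the choice of `k=3` are all hypothetical, and the abstract does not say which library or parameters were used.

```python
import math
import random


def k_nearest(point_index, points, k):
    """Return indices of the k points closest to points[point_index], excluding itself."""
    dists = [
        (math.dist(points[point_index], p), j)
        for j, p in enumerate(points)
        if j != point_index
    ]
    dists.sort()
    return [j for _, j in dists[:k]]


def borderline_smote(X, y, minority_label, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples near the class boundary.

    X is a list of numeric feature tuples and y the matching class labels.
    Returns a list of synthetic feature tuples (empty if no borderline point exists).
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority_label]
    if len(minority_idx) < 2:
        return []

    # Step 1: a minority point is "in danger" when at least half, but not all,
    # of its k nearest neighbours in the whole data set belong to another class.
    danger = []
    for i in minority_idx:
        neighbours = k_nearest(i, X, k)
        n_other = sum(1 for j in neighbours if y[j] != minority_label)
        if k / 2 <= n_other < k:
            danger.append(i)
    if not danger:
        return []

    # Step 2: interpolate between a danger point and one of its nearest
    # minority-class neighbours to create each synthetic sample.
    minority_points = [X[i] for i in minority_idx]
    k_min = min(k, len(minority_points) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(danger)
        pos = minority_idx.index(i)
        mate = minority_points[rng.choice(k_nearest(pos, minority_points, k_min))]
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(X[i], mate)))
    return synthetic


# Hypothetical toy data: label 0 = short stay (majority), label 1 = long stay (minority).
X = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.5, 0.0), (4.0, 0.0),
     (5.0, 0.0), (6.0, 0.0), (7.0, 0.0), (8.0, 0.0)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1]
synthetic = borderline_smote(X, y, minority_label=1, n_new=4, k=3)
print(synthetic)
```

In this toy data only the minority point at 5.0 sits next to the majority class, so every synthetic sample is interpolated between it and another minority point. SMOGN, the regression counterpart named above, follows a different procedure and is not covered by this sketch.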

RESULTS

The performance of the predictive models before and after addressing missing values and data skewness is evaluated using four evaluation metrics, namely receiver operating characteristic, area under the curve (AUC), R-squared score (R²), and normalized root mean square error (NRMSE). Results show that the performance of the predictive models improved on average by 15.75% for AUC, 32.19% for R² score, and 11.32% for NRMSE after addressing these challenges. Moreover, our results indicate a relationship between the missing-value rate, data skewness, and the illness severity of patients, so it is clinically essential to take patients' incomplete records into account and apply proper solutions for interpolation of missing values.
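For readers unfamiliar with the metrics named above, here is a small pure-Python sketch of how AUC, R-squared, and NRMSE can be computed. The numbers are made-up illustrative values, not data from the study, and dividing the RMSE by the observed range is an assumption, since the abstract does not state which normalization the authors used.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def nrmse(y_true, y_pred):
    """Root mean square error divided by the observed range (assumed normalization)."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse) / (max(y_true) - min(y_true))


def auc(labels, scores):
    """Probability that a random positive case outscores a random negative one.

    Ties count as half a win; labels are 1 (positive) or 0 (negative).
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical LOS values in days (regression) and class scores (classification).
los_true = [1.0, 2.0, 3.0, 4.0]
los_pred = [1.0, 2.0, 3.0, 5.0]
r2_value = r_squared(los_true, los_pred)
nrmse_value = nrmse(los_true, los_pred)
auc_value = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(r2_value, nrmse_value, auc_value)
```

Higher AUC and R-squared indicate better models, whereas a lower NRMSE is better, so an "improvement" in NRMSE corresponds to a decrease in its value.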

CONCLUSION

We propose a new method comprising three stages: missing-value imputation, data skewness handling, and building predictive models based on classification and regression approaches. Our results indicate that addressing these challenges properly significantly enhanced model performance, leading to a more valid prediction of LOS.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1764/8576901/de788fd0d9c2/12911_2021_1660_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1764/8576901/4ec89efb4eaf/12911_2021_1660_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1764/8576901/dcfa6f556429/12911_2021_1660_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1764/8576901/e25c83f67a10/12911_2021_1660_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1764/8576901/8ff2c1ebbb18/12911_2021_1660_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1764/8576901/7080081e83fb/12911_2021_1660_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1764/8576901/f3343bb0a7df/12911_2021_1660_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1764/8576901/db70423407e5/12911_2021_1660_Fig8_HTML.jpg
