Predicting in-hospital length of stay: a two-stage modeling approach to account for highly skewed data.

Author information

Xu Zhenhui, Zhao Congwen, Scales Charles D, Henao Ricardo, Goldstein Benjamin A

Affiliations

Department of Biostatistics and Bioinformatics, Duke University, 2424 Erwin Road, Suite 1104, Durham, NC, 27705, USA.

Duke Clinical Research Institute, Duke University, Durham, NC, USA.

Publication information

BMC Med Inform Decis Mak. 2022 Apr 24;22(1):110. doi: 10.1186/s12911-022-01855-0.

DOI:10.1186/s12911-022-01855-0

PMID:35462534

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC9035272/

Abstract

BACKGROUND

In the early stages of the COVID-19 pandemic, our institution was interested in forecasting how long surgical patients undergoing elective procedures would spend in the hospital. Initial examination of our models indicated that, because length of stay is highly skewed, accurate prediction was challenging, and we instead opted for a simpler classification model. In this work we examine the prediction of in-hospital length of stay in more depth.

METHODS

We used electronic health record data on length of stay from 42,209 elective surgeries. We compare different loss functions (mean squared error, mean absolute error, mean relative error), algorithms (LASSO, Random Forests, multilayer perceptron) and data transformations (log and truncation). We also assess the performance of a two-stage hybrid classification-regression approach.
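As an illustration of the quantities named above, the following is a minimal sketch of the three loss functions and the two outcome transformations. The abstract does not give formulas, so the definitions here are assumptions: mean relative error is taken as the mean of |y - ŷ| / y, the log transform as the natural log of a strictly positive length of stay, and truncation as capping values at a chosen threshold. The function names, the 7-day cap, and the example data are hypothetical.

```python
import math

def mean_squared_error(y_true, y_pred):
    # Average squared difference; dominated by the few very long stays.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    # Average absolute difference; less sensitive to extreme stays than MSE.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_relative_error(y_true, y_pred):
    # Assumed definition: absolute error scaled by the true value.
    # Requires every true length of stay to be strictly positive.
    return sum(abs(t - p) / t for t, p in zip(y_true, y_pred)) / len(y_true)

def log_transform(y):
    # Natural log of each value (assumes strictly positive lengths of stay).
    return [math.log(v) for v in y]

def truncate(y, cap):
    # Cap each value at `cap` days (hypothetical threshold).
    return [min(v, cap) for v in y]

# Hypothetical skewed outcome: four short stays and one 30-day stay,
# scored against a constant 2-day prediction.
los_days = [1.0, 2.0, 2.0, 4.0, 30.0]
predicted = [2.0, 2.0, 2.0, 2.0, 2.0]

mse = mean_squared_error(los_days, predicted)
mae = mean_absolute_error(los_days, predicted)
mre = mean_relative_error(los_days, predicted)
capped = truncate(los_days, 7.0)
```

In this toy example the single 30-day stay contributes 784 of the 789 total squared error, which is one way to see why the choice of loss function matters for a skewed outcome.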

RESULTS

Our results show that while it is possible to accurately predict short lengths of stay, predicting longer lengths of stay is extremely challenging. As such, we opt for a two-stage model: the first stage classifies patients into long versus short lengths of stay, and the second stage fits a regressor among those predicted to have a short length of stay.
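The two-stage idea can be sketched as follows. This is not the authors' implementation: the stage-one classifier is a hypothetical stand-in rule on a single made-up risk score, and the stage-two regressor is a one-feature ordinary least squares fit rather than any of the algorithms compared in the paper. The sketch only shows the control flow, in which the regressor is fitted on, and applied to, patients that stage one predicts to have a short stay.

```python
from statistics import mean
from typing import Callable, Optional, Sequence, Tuple

def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    # One-feature ordinary least squares; returns (slope, intercept).
    # Needs at least two distinct x values.
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def fit_two_stage(
    xs: Sequence[float],
    ys: Sequence[float],
    classify_long: Callable[[float], bool],
) -> Callable[[float], Optional[float]]:
    # Stage 1: keep only patients the classifier predicts as short stay.
    short = [(x, y) for x, y in zip(xs, ys) if not classify_long(x)]
    # Stage 2: fit the regressor on that predicted-short subset only.
    slope, intercept = fit_line([x for x, _ in short], [y for _, y in short])

    def predict(x: float) -> Optional[float]:
        # Return None for predicted long stays: no point estimate is offered.
        if classify_long(x):
            return None
        return slope * x + intercept

    return predict

# Hypothetical data: a made-up risk score and length of stay in days.
risk_score = [1.0, 2.0, 3.0, 4.0, 8.0, 9.0]
los_days = [1.0, 2.0, 3.0, 4.0, 20.0, 45.0]

# Stand-in for a trained stage-one classifier (assumed rule: score >= 5 is long).
predict_los = fit_two_stage(risk_score, los_days, lambda score: score >= 5.0)

short_estimate = predict_los(2.5)  # numeric estimate in days
long_estimate = predict_los(8.0)   # None: flagged as a long stay
```

Returning `None` for predicted long stays is one way to encode the design choice of declining to give a point estimate where the model is not expected to be accurate.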

DISCUSSION

The results indicate both the challenges and the considerations involved in applying machine-learning methods to skewed outcomes.

CONCLUSIONS

Two-stage models allow those developing clinical decision support tools to explicitly acknowledge where they can and cannot make accurate predictions.
