The prediction of hospital length of stay using unstructured data.

Affiliations

Pôle Territorial Santé Publique et Performance, Centre Hospitalier de Troyes, 101 Avenue Anatole France CS 10718, 10003, Troyes Cedex, France.

Research and Consulting, CODOC SAS, 75008, Paris, France.

Publication information

BMC Med Inform Decis Mak. 2021 Dec 18;21(1):351. doi: 10.1186/s12911-021-01722-4.

DOI:10.1186/s12911-021-01722-4

PMID:34922532

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8684269/

Abstract

OBJECTIVE

This study aimed to assess whether accounting for clinical signs written in free text improves machine learning-based predictions of hospital length of stay (LOS), compared with the traditional approach of considering only structured information such as age, gender and major ICD diagnosis.

METHODS

This observational retrospective cohort study analyzed patient stays with admission dates between 1 January and 24 September 2019. Each included stay began with admission through the Emergency Department (ED) and lasted more than two days in the subsequent service. LOS was predicted using two random forest models. The first included unstructured text extracted from electronic health records (EHRs); a word-embedding algorithm based on UMLS terminology, with exact matching restricted to patient-centric affirmation sentences, was used to process the EHR data. The second model was primarily based on structured data: diagnoses coded with the International Classification of Diseases, 10th Revision (ICD-10) and triage codes (CCMU/GEMSA classifications). Variables common to both models were age, gender, zip/postal code, LOS in the ED, recent visit flag, assigned patient ward after the ED stay and short-term ED activity. Models were trained on 80% of the data, and performance was evaluated by accuracy on the remaining 20% held out as test data.
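The idea of exact term matching restricted to patient-centric affirmation sentences can be illustrated with a minimal sketch. This is not the authors' pipeline: the mini-terminology, the placeholder concept IDs, the skip-cue list and all function names below are assumptions made for illustration only. A real system would load its terms from UMLS and use a far more careful negation and subject detector.

```python
import re

# Hypothetical mini-terminology: surface term -> placeholder concept ID.
# These entries are invented; a real system would load terms from UMLS.
TERMS = {
    "fever": "CONCEPT_FEVER",
    "chest pain": "CONCEPT_CHEST_PAIN",
    "dyspnea": "CONCEPT_DYSPNEA",
}

# Assumed cues marking a sentence as negated or not about the patient.
SKIP_PATTERN = re.compile(
    r"\b(no|not|denies|without|mother|father|family history)\b"
)


def split_sentences(text):
    """Split on sentence-ending punctuation; crude but enough for a sketch."""
    return [s.strip().lower() for s in re.split(r"[.!?]+", text) if s.strip()]


def extract_concepts(text):
    """Return concept IDs whose terms appear verbatim in kept sentences."""
    found = set()
    for sentence in split_sentences(text):
        if SKIP_PATTERN.search(sentence):
            continue  # drop negated or non-patient sentences entirely
        for term, concept in TERMS.items():
            # Exact matching on whole words only.
            if re.search(r"\b" + re.escape(term) + r"\b", sentence):
                found.add(concept)
    return found


def to_feature_vector(concepts):
    """Binary vector with one slot per concept, in sorted concept-ID order."""
    return [1 if c in concepts else 0 for c in sorted(TERMS.values())]


note = "Patient reports chest pain since yesterday. No fever. Mother had dyspnea."
concepts = extract_concepts(note)
vector = to_feature_vector(concepts)
```

In this toy note, only the first sentence survives the filter, so the resulting vector flags chest pain alone. Binary vectors of this kind are one plausible way to feed text-derived concepts into a tree-based classifier alongside the shared structured variables.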

RESULTS

The model using unstructured data achieved 75.0% accuracy, compared with 74.1% for the model using structured data. The two models produced a similar prediction in 86.6% of cases. In a secondary analysis restricted to intensive care patients, the accuracy of the two models was also similar (76.3% vs 75.0%).
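The two quantities reported here, per-model accuracy and the share of cases where both models give the same prediction, can be computed as in the sketch below. The labels and predictions are invented toy data, and the two-class "short"/"long" coding is an assumption, since the abstract does not state how LOS was discretized.

```python
def accuracy(predicted, actual):
    """Fraction of cases where the prediction equals the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def agreement(pred_a, pred_b):
    """Fraction of cases where two models output the same prediction."""
    if len(pred_a) != len(pred_b) or not pred_a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == b for a, b in zip(pred_a, pred_b)) / len(pred_a)


# Toy held-out test set (invented values, not study data).
actual = ["short", "long", "long", "short", "long"]
text_model = ["short", "long", "short", "short", "long"]
structured_model = ["short", "long", "short", "long", "long"]

acc_text = accuracy(text_model, actual)
acc_structured = accuracy(structured_model, actual)
agree = agreement(text_model, structured_model)
```

Agreement is computed without reference to the true labels, so two models can agree often while both being wrong on the same cases. That is why it is reported separately from accuracy.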

CONCLUSIONS

LOS prediction using unstructured data achieved accuracy similar to that obtained with structured data, so unstructured data can be considered useful for modeling LOS accurately.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/534e/8684269/bb5c738120c2/12911_2021_1722_Fig1_HTML.jpg
