Espinosa-Moreno Juan Carlos, García-García Fernando, Mas-Bilbao Naia, García-Gutiérrez Susana, Legarreta-Olabarrieta María José, Lee Dae-Jin
Basque Center for Applied Mathematics (BCAM), Bilbao, Basque Country, Spain.
Department of Mathematics, University of the Basque Country (UPV/EHU), Leioa, Basque Country, Spain.
PLoS One. 2025 Aug 26;20(8):e0322101. doi: 10.1371/journal.pone.0322101. eCollection 2025.
Length of Stay (LoS) for in-hospital patients is a relevant indicator of efficiency in healthcare. Moreover, it is often related to the occurrence of hospital-acquired complications. In this work, we aim to explore time-to-event analysis for modelling LoS. We employed competing risk (CR) models, as we considered two mutually exclusive outcomes: favorable discharge and deterioration. The explanatory variables included the patient's sex, age, and longitudinal vital signs collected from a dataset comprising [Formula: see text] admissions. To address sparse measurements, we transformed longitudinal vital signs into cross-sectional statistics. Our approach involved data pre-processing, imputation of missing data, and variable selection. We proposed four types of CR models: Cause-specific Cox, Sub-distribution hazard, and two variants of Random Survival Forests, one using the generalised Log-Rank test (cause-specific hazard estimates) and the other using Gray's test (cumulative incidence estimates) as the node splitting rule. The performance of the LoS CR models was evaluated over a time frame from 2 to 15 days. Additionally, we considered as baselines two well-established clinical early warning scores: the National Early Warning Score (NEWS) and the Modified Early Warning Score (MEWS). The best model was the Random Survival Forest using the Gray's test split, with an Integrated Brier Score[×100] of 0.386, a C-Index above 99%, and a Brier Score below 0.006 across the entire time frame. Employing cross-sectional statistics derived from vital signs, along with rigorous data pre-processing, modelled LoS more accurately than NEWS and MEWS.
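The step of turning sparse longitudinal vital signs into cross-sectional statistics can be illustrated with a minimal sketch. The abstract does not state which statistics were computed, so the choice below (minimum, maximum, mean, standard deviation, and count per vital sign) is an assumption, and the function name `summarise_vitals`, the record layout, and the toy values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarise_vitals(records):
    """Collapse (admission_id, vital, value) rows into per-admission statistics.

    Missing values (None) are skipped; imputation is out of scope here.
    """
    grouped = defaultdict(list)
    for admission_id, vital, value in records:
        if value is not None:
            grouped[(admission_id, vital)].append(value)

    features = defaultdict(dict)
    for (admission_id, vital), values in grouped.items():
        # Assumed summary statistics; the original study may use others.
        features[admission_id][f"{vital}_min"] = min(values)
        features[admission_id][f"{vital}_max"] = max(values)
        features[admission_id][f"{vital}_mean"] = mean(values)
        features[admission_id][f"{vital}_sd"] = pstdev(values)
        features[admission_id][f"{vital}_n"] = len(values)
    return dict(features)


# Hypothetical toy records: (admission_id, vital sign, measured value)
records = [
    ("A1", "heart_rate", 80), ("A1", "heart_rate", 90), ("A1", "heart_rate", 100),
    ("A1", "spo2", 97), ("A1", "spo2", None),
    ("A2", "heart_rate", 70),
]
features = summarise_vitals(records)
```

Each admission ends up as a single row of fixed-length features regardless of how many measurements were taken, which is what lets a standard survival model consume irregularly sampled vital signs.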
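The competing-risks setting with two mutually exclusive outcomes can be illustrated with the nonparametric Aalen-Johansen estimator of the cumulative incidence function. This is a sketch for intuition only, not one of the four models evaluated in the study: it uses no covariates, and the function name `cumulative_incidence`, the event coding (0 = censored, 1 = favorable discharge, 2 = deterioration), and the toy data are assumptions.

```python
def cumulative_incidence(times, events, cause):
    """Aalen-Johansen cumulative incidence for one cause.

    times  : observed length of stay for each admission
    events : 0 = censored, otherwise an integer cause code
    cause  : the cause code whose cumulative incidence is returned
    Returns a list of (time, CIF) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    overall_surv = 1.0  # probability of no event of any cause so far
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = d_any = n_at_t = 0
        # Count events and censorings tied at time t.
        while i < len(data) and data[i][0] == t:
            n_at_t += 1
            if data[i][1] != 0:
                d_any += 1
                if data[i][1] == cause:
                    d_cause += 1
            i += 1
        if d_any > 0:
            # Increment uses event-free survival just before t.
            cif += overall_surv * d_cause / at_risk
            overall_surv *= 1.0 - d_any / at_risk
            curve.append((t, cif))
        at_risk -= n_at_t
    return curve


# Hypothetical toy data: LoS in days and outcome codes.
times = [2, 3, 3, 5, 7, 8]
events = [1, 1, 2, 0, 1, 2]
cif_discharge = cumulative_incidence(times, events, cause=1)
cif_deterioration = cumulative_incidence(times, events, cause=2)
```

Because an admission that ends in one outcome can no longer experience the other, each increment is weighted by the overall event-free survival rather than by a cause-specific Kaplan-Meier curve; treating the competing outcome as plain censoring would overestimate the incidence.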