A comparison of statistical methods for modeling count data with an application to hospital length of stay.

Affiliation

School of Mathematical and Statistical Sciences, University of Texas Rio Grande Valley, One West University Boulevard, Brownsville Campus, Brownsville, TX 78520, USA.

Publication information

BMC Med Res Methodol. 2022 Aug 4;22(1):211. doi: 10.1186/s12874-022-01685-8.

DOI:10.1186/s12874-022-01685-8

PMID:35927612

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9351158/

Abstract

BACKGROUND

Hospital length of stay (LOS) is a key indicator of hospital care management efficiency, cost of care, and hospital planning. Hospital LOS is often used as a measure of a post-medical procedure outcome, as a guide to the benefit of a treatment of interest, or as an important risk factor for adverse events. Therefore, understanding hospital LOS variability is always an important healthcare focus. Hospital LOS data can be treated as count data, with discrete and non-negative values, typically right skewed, and often exhibiting excessive zeros. In this study, we compared the performance of the Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), and zero-inflated negative binomial (ZINB) regression models using simulated and empirical data.
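For reference, the four count distributions being compared can be written as below. This is a sketch in one common parameterization (the NB2 form with dispersion parameter \alpha and a zero-inflation probability \pi); the symbols are assumptions chosen for illustration and may not match the paper's notation.

```latex
\begin{align*}
% Poisson with mean \mu: variance equals the mean (equidispersion)
\text{Poisson:}\quad & P(Y=y) = \frac{e^{-\mu}\mu^{y}}{y!},
  \qquad \operatorname{Var}(Y) = \mu \\[4pt]
% Negative binomial (NB2) with mean \mu and dispersion \alpha > 0
\text{NB:}\quad & P(Y=y) = \frac{\Gamma(y + 1/\alpha)}{\Gamma(1/\alpha)\, y!}
  \left(\frac{1}{1+\alpha\mu}\right)^{1/\alpha}
  \left(\frac{\alpha\mu}{1+\alpha\mu}\right)^{y},
  \qquad \operatorname{Var}(Y) = \mu + \alpha\mu^{2} \\[4pt]
% Zero-inflated mixture: f is the Poisson pmf (ZIP) or the NB pmf (ZINB)
\text{ZIP / ZINB:}\quad & P(Y=0) = \pi + (1-\pi)\, f(0), \qquad
  P(Y=y) = (1-\pi)\, f(y) \quad (y \ge 1) \\[4pt]
% Even with a Poisson count component, zero-inflation alone inflates the variance
\text{ZIP moments:}\quad & E(Y) = (1-\pi)\mu, \qquad
  \operatorname{Var}(Y) = (1-\pi)\,\mu\,(1 + \pi\mu)
\end{align*}
```

The last line shows why zero-inflation by itself produces overdispersion relative to a Poisson model: the variance exceeds the mean whenever \pi > 0.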

METHODS

Data were generated under different simulation scenarios with varying sample sizes, proportions of zeros, and levels of overdispersion. Analysis of hospital LOS was conducted using empirical data from the Medical Information Mart for Intensive Care database.
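A simulation design of this kind can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' simulation code: the function names, the mean of 3, the sample size, and the scenario values are all assumptions. Counts are drawn from a gamma-Poisson mixture (which yields a negative binomial), with an optional probability of a structural zero.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method (suitable for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_count(mu, size, pi_zero, rng):
    """Structural zero with probability pi_zero; otherwise a Poisson or NB count with mean mu.

    size=None gives a Poisson count; a finite size gives NB with variance mu + mu**2 / size.
    """
    if rng.random() < pi_zero:
        return 0
    # Gamma(shape=size, scale=mu/size) has mean mu; mixing Poisson over it gives NB.
    lam = mu if size is None else rng.gammavariate(size, mu / size)
    return sample_poisson(lam, rng)


def simulate(n, mu, size, pi_zero, seed=0):
    """Generate n counts for one scenario (all arguments are illustrative assumptions)."""
    rng = random.Random(seed)
    return [sample_count(mu, size, pi_zero, rng) for _ in range(n)]


def summarize(y):
    """Sample mean, variance-to-mean ratio (dispersion index), and fraction of zeros."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / (n - 1)
    return {"mean": mean, "dispersion_index": var / mean, "zero_fraction": y.count(0) / n}


# Hypothetical scenarios: (NB size, structural-zero probability)
scenarios = {
    "Poisson": (None, 0.0),
    "NB": (1.0, 0.0),
    "ZIP": (None, 0.3),
    "ZINB": (1.0, 0.3),
}

for name, (size, pi_zero) in scenarios.items():
    stats = summarize(simulate(n=2000, mu=3.0, size=size, pi_zero=pi_zero))
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

A dispersion index near 1 indicates equidispersion; values well above 1 indicate overdispersion, whether it comes from the NB component, from the extra zeros, or from both.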

RESULTS

Poisson and ZIP models performed poorly on overdispersed data. ZIP outperformed the other regression models when the overdispersion was due to zero-inflation only. NB and ZINB regression models faced substantial convergence issues when incorrectly used to model equidispersed data. The NB model provided the best fit for overdispersed data and outperformed the ZINB model in many simulation scenarios with combinations of zero-inflation and overdispersion, regardless of the sample size. In the empirical data analysis, we demonstrated that fitting incorrect models to overdispersed data led to incorrect regression coefficient estimates and overstated significance of some of the predictors.
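To illustrate what "best fit" can mean in practice, the sketch below compares intercept-only Poisson and NB fits by AIC on simulated overdispersed counts. It is an assumption-laden illustration, not the paper's procedure: the abstract does not state which fit criteria were used, the helper names are hypothetical, there are no covariates, and the NB size parameter is found by a coarse grid search rather than a proper optimizer.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_nb(n, mu, size, seed=0):
    """NB counts with mean mu and variance mu + mu**2 / size, via a gamma-Poisson mixture."""
    rng = random.Random(seed)
    return [sample_poisson(rng.gammavariate(size, mu / size), rng) for _ in range(n)]


def poisson_loglik(y, mu):
    """Poisson log-likelihood at mean mu."""
    return sum(v * math.log(mu) - mu - math.lgamma(v + 1) for v in y)


def nb_loglik(y, mu, size):
    """NB log-likelihood at mean mu and size parameter (size = 1 / dispersion)."""
    log_p = math.log(size / (size + mu))
    log_q = math.log(mu / (size + mu))
    return sum(
        math.lgamma(v + size) - math.lgamma(size) - math.lgamma(v + 1)
        + size * log_p + v * log_q
        for v in y
    )


def compare_aic(y):
    """Fit intercept-only Poisson and NB models and return their AIC values."""
    # For both models the maximum-likelihood estimate of the mean is the sample mean.
    mu_hat = sum(y) / len(y)
    # Coarse log-spaced grid for the NB size parameter, from 0.01 to 1000.
    grid = [10 ** (k / 20) for k in range(-40, 61)]
    size_hat = max(grid, key=lambda r: nb_loglik(y, mu_hat, r))
    return {
        "mu_hat": mu_hat,
        "size_hat": size_hat,
        "aic_poisson": 2 * 1 - 2 * poisson_loglik(y, mu_hat),  # one parameter
        "aic_nb": 2 * 2 - 2 * nb_loglik(y, mu_hat, size_hat),  # two parameters
    }


result = compare_aic(simulate_nb(n=1000, mu=4.0, size=1.0, seed=42))
print({key: round(value, 2) for key, value in result.items()})
```

On overdispersed data like this, the NB model should have the lower AIC despite its extra parameter. A Poisson fit to the same data understates the variance, which is the mechanism behind the overstated significance described above.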

CONCLUSIONS

Based on this study, we recommend that researchers consider ZIP models for count data with zero-inflation only and NB models for overdispersed data or data with combinations of zero-inflation and overdispersion. If the researcher believes there are two different data-generating mechanisms producing zeros, then the ZINB regression model may provide greater flexibility when modeling the zero-inflation and overdispersion.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7b3/9351158/5f7e609fb513/12874_2022_1685_Fig1_HTML.jpg

Similar articles

On performance of parametric and distribution-free models for zero-inflated and over-dispersed count responses.

Stat Med. 2015 Oct 30;34(24):3235-45. doi: 10.1002/sim.6560. Epub 2015 Jun 15.

Multilevel modeling in single-case studies with zero-inflated and overdispersed count data.

Behav Res Methods. 2024 Apr;56(4):2765-2781. doi: 10.3758/s13428-024-02359-7. Epub 2024 Feb 21.

Models for analyzing zero-inflated and overdispersed count data: an application to cigarette and marijuana use.

Nicotine Tob Res. 2018 Apr 18;22(8):1390-8. doi: 10.1093/ntr/nty072.

Evaluation of negative binomial and zero-inflated negative binomial models for the analysis of zero-inflated count data: application to the telemedicine for children with medical complexity trial.

Trials. 2023 Sep 27;24(1):613. doi: 10.1186/s13063-023-07648-8.

Statistical modelling of falls count data with excess zeros.

Inj Prev. 2011 Aug;17(4):266-70. doi: 10.1136/ip.2011.031740. Epub 2011 Jun 8.

Analyzing hospitalization data: potential limitations of Poisson regression.

Nephrol Dial Transplant. 2015 Aug;30(8):1244-9. doi: 10.1093/ndt/gfv071. Epub 2015 Mar 25.

A simulation study of the performance of statistical models for count outcomes with excessive zeros.

Stat Med. 2024 Oct 30;43(24):4752-4767. doi: 10.1002/sim.10198. Epub 2024 Aug 28.

Marginalized zero-inflated negative binomial regression with application to dental caries.

Stat Med. 2016 May 10;35(10):1722-35. doi: 10.1002/sim.6804. Epub 2015 Nov 15.

A score test for overdispersion in zero-inflated poisson mixed regression model.

Stat Med. 2007 Mar 30;26(7):1608-22. doi: 10.1002/sim.2616.

Cited by

Relationship between Household Tuberculosis and Socioeconomic and Bioenvironmental Factors: A Statistical Model Approach Using NFHS-5 Data.

Indian J Community Med. 2025 Jul-Aug;50(4):689-693. doi: 10.4103/ijcm.ijcm_191_24. Epub 2025 Feb 21.

Does a take-home dose program result in better patient adherence to methadone? Evidence from Vietnam.

Harm Reduct J. 2025 Jul 28;22(1):131. doi: 10.1186/s12954-025-01279-9.

Does change in area-level deprivation, change health outcomes? A latent class growth analysis of population data.

SSM Popul Health. 2025 Jun 11;31:101826. doi: 10.1016/j.ssmph.2025.101826. eCollection 2025 Sep.

Zero-inflated models for the evaluation of colorectal polyps in colon cancer screening studies-a value-based biostatistics practice.

PeerJ. 2025 May 26;13:e19504. doi: 10.7717/peerj.19504. eCollection 2025.

Prolonged Length of Stay at Out-Of-State Trauma Centers: Potential Role for Repatriation.

J Am Coll Surg. 2025 May 16. doi: 10.1097/XCS.0000000000001449.

Modeling the Microsurgical Learning Curve Using a Poisson-Based Statistical Approach for Skill Assessment.

Cureus. 2025 Apr 25;17(4):e83009. doi: 10.7759/cureus.83009. eCollection 2025 Apr.

The Prevalence, Risk Factors, and Clinical Outcomes of Vitamin C Deficiency in Adult Hospitalised Patients: A Retrospective Observational Study.

Nutrients. 2025 Mar 25;17(7):1131. doi: 10.3390/nu17071131.

Development of Multiservice Machine Learning Models to Predict Postsurgical Length of Stay and Discharge Disposition at the Time of Case Posting.

Ann Surg Open. 2025 Jan 31;6(1):e547. doi: 10.1097/AS9.0000000000000547. eCollection 2025 Mar.

Acute pain trajectories in elderly patients with fragility hip fractures.

Bone. 2025 Apr;193:117428. doi: 10.1016/j.bone.2025.117428. Epub 2025 Feb 22.

Low falls and inpatient complications increase risk for longer length of stay in older persons admitted following trauma.

BMC Geriatr. 2025 Feb 14;25(1):98. doi: 10.1186/s12877-025-05755-6.
