Untangle the Structural and Random Zeros in Statistical Modelings.

Authors

Tang W, He H, Wang W J, Chen D G

Affiliations

Department of Global Biostatistics & Data Science, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA 70122, USA.

Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA 70122, USA.

Publication information

J Appl Stat. 2018;45(9):1714-1733. doi: 10.1080/02664763.2017.1391180. Epub 2017 Oct 24.

Abstract

Count data with structural zeros are common in public health applications. Considerable research has focused on zero-inflated models, such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models, for such data when they serve as the response variable. However, when such variables are used as predictors, the difference between structural and random zeros is often ignored, which may result in biased estimates. One remedy is to include an indicator of the structural zero as a predictor in the model, if it is observed. However, structural zeros are often not observed in practice, in which case no statistical method is available to address the bias. This paper aims to fill this methodological gap by developing parametric, maximum-likelihood-based methods for modeling zero-inflated count data used as predictors. The response variable can be of any type, including continuous, binary, count, or even zero-inflated count responses. Simulation studies assess the numerical performance of the new approach when the sample size is small to moderate. A real data example demonstrates the application of the method.
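The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's maximum-likelihood method for unobserved structural zeros. It only demonstrates the simpler remedy the abstract mentions, where the structural-zero indicator is observed and entered as a predictor. The data-generating model, the parameter values (`p_structural`, `lam`, `beta0`, `beta`, `gamma`), and the helper names `simulate` and `ols` are all assumptions made for illustration.

```python
import numpy as np

def simulate(n=5000, p_structural=0.3, lam=2.0, seed=0):
    """Draw a zero-inflated Poisson predictor and a continuous response.

    All parameter values here are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    # r = 1 marks a structural zero; otherwise the count is Poisson(lam),
    # which can still produce random zeros.
    r = rng.binomial(1, p_structural, size=n)
    x = np.where(r == 1, 0, rng.poisson(lam, size=n))
    # Assumed response model: the count acts linearly through beta,
    # and structural zeros shift the mean by gamma.
    beta0, beta, gamma = 1.0, 0.5, 2.0
    y = beta0 + beta * x + gamma * r + rng.normal(0.0, 1.0, size=n)
    return x, r, y

def ols(columns, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

x, r, y = simulate()
naive = ols([x], y)        # treats all zeros alike
adjusted = ols([x, r], y)  # adds the observed structural-zero indicator

print("slope on x, ignoring structural zeros:", naive[1])
print("slope on x, with structural-zero indicator:", adjusted[1])
```

Under these assumptions the naive fit mixes the effect of the structural-zero group into the slope on the count, while the adjusted fit recovers a slope close to the simulated `beta`. When the indicator `r` is not observed, this adjustment is unavailable, which is the gap the paper addresses.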
