Feng Cindy
School of Epidemiology and Public Health, Faculty of Medicine, University of Ottawa, Ottawa, Canada.
School of Public Health, University of Saskatchewan, Saskatoon, Canada.
J Appl Stat. 2020 Jul 25;49(1):1-23. doi: 10.1080/02664763.2020.1796943. eCollection 2022.
Zero-inflated count data are frequently encountered in public health and epidemiology research. Two-part models are often used to handle the excess zeros; they treat the data as a mixture of two components: a point mass at zero and a count distribution, such as a Poisson distribution. When the rate of events per unit exposure is of interest, an offset is commonly used to account for the varying extent of exposure; an offset is essentially a predictor whose regression coefficient is fixed at one. Such an assumption about the exposure effect is, however, quite restrictive for many practical problems. Further, in zero-inflated models the offset is often included only in the count component. However, the probability of an excess zero could also be affected by the amount of 'exposure'. We therefore propose incorporating the varying exposure as a covariate, rather than an offset term, in both the excess-zero probability and the conditional count components of the zero-inflated model. A real example illustrates the use of the proposed methods, and simulation studies assess their performance in a broad variety of situations.
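The difference between an offset and an exposure covariate can be made concrete with a small sketch. This is an illustration only, not the authors' implementation: it assumes a zero-inflated Poisson with a log link for the count mean and a logit link for the excess-zero probability, and the names `zip_pmf`, `zip_params`, `beta_e`, and `gamma_e` are hypothetical.

```python
import math


def zip_pmf(y, pi, mu):
    """P(Y = y) for a zero-inflated Poisson: a point mass at zero with
    probability pi, and a Poisson(mu) count otherwise."""
    poisson = math.exp(-mu) * mu ** y / math.factorial(y)
    return pi * (y == 0) + (1.0 - pi) * poisson


def zip_params(x, exposure, beta, gamma, beta_e=1.0, gamma_e=0.0):
    """Map one covariate x and an exposure to (pi, mu).

    Assumed links: log for the count mean, logit for the excess-zero
    probability. beta_e = 1 and gamma_e = 0 reproduce the usual offset
    in the count part only; leaving them free treats log(exposure) as
    a covariate in both parts.
    """
    log_e = math.log(exposure)
    mu = math.exp(beta[0] + beta[1] * x + beta_e * log_e)
    eta = gamma[0] + gamma[1] * x + gamma_e * log_e
    pi = 1.0 / (1.0 + math.exp(-eta))
    return pi, mu
```

With the defaults, doubling `exposure` doubles `mu` and leaves `pi` unchanged, which is exactly the restriction the abstract describes. Setting `beta_e` to a value other than one, or `gamma_e` to a nonzero value, lets the exposure effect be estimated in either component instead of being fixed in advance.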