Todem David, Kim KyungMann, Hsu Wei-Wen
Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan 48824, U.S.A.
Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin 53792, U.S.A.
Biometrics. 2016 Sep;72(3):986-94. doi: 10.1111/biom.12492. Epub 2016 Feb 17.
Zero-inflated regression models have emerged as a popular tool within the parametric framework to characterize count data with excess zeros. Despite their increasing popularity, much of the literature on real applications of these models has centered on the latent class formulation, where the mean response of the so-called at-risk or susceptible population and the susceptibility probability are both related to covariates. While this formulation in some instances provides an interesting representation of the data, it often fails to produce easily interpretable covariate effects on the overall mean response. In this article, we propose two approaches that circumvent this limitation. The first approach consists of estimating the effect of covariates on the overall mean from the assumed latent class models, while the second approach formulates a model that directly relates the overall mean to covariates. Our results are illustrated by extensive numerical simulations and an application to an oral health study on low-income African-American children, where the overall mean model is used to evaluate the effect of sugar consumption on caries indices.
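The contrast between the two approaches can be illustrated with a minimal sketch. It assumes a zero-inflated Poisson model with a single covariate, a log link for the at-risk mean, and a logit link for the susceptibility probability. The abstract does not state these choices, and the function names and coefficient values below are hypothetical. Under those assumptions the overall mean is the product of the susceptibility probability and the at-risk mean, so the effect of a one-unit covariate increase on the overall mean varies with the covariate value, whereas a model placed directly on the overall mean gives a constant ratio.

```python
import math


def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))


def overall_mean_latent_class(x, beta, gamma):
    """Overall mean implied by an assumed latent class ZIP model.

    beta  = (intercept, slope) for log(at-risk mean)              [assumed log link]
    gamma = (intercept, slope) for logit(susceptibility prob.)    [assumed logit link]
    """
    mu = math.exp(beta[0] + beta[1] * x)   # mean among the susceptible
    pi = expit(gamma[0] + gamma[1] * x)    # probability of being susceptible
    return pi * mu                         # E[Y | x] = pi(x) * mu(x)


def latent_class_mean_ratio(x, beta, gamma):
    """Ratio of overall means for x + 1 versus x under the latent class model."""
    return (overall_mean_latent_class(x + 1, beta, gamma)
            / overall_mean_latent_class(x, beta, gamma))


def direct_overall_mean(x, alpha):
    """Overall mean under an assumed direct model: log E[Y | x] = alpha0 + alpha1 * x."""
    return math.exp(alpha[0] + alpha[1] * x)


def direct_mean_ratio(x, alpha):
    """Ratio of overall means for x + 1 versus x under the direct model."""
    return direct_overall_mean(x + 1, alpha) / direct_overall_mean(x, alpha)


if __name__ == "__main__":
    beta = (0.5, 0.3)     # hypothetical coefficients
    gamma = (-0.2, 0.8)   # hypothetical coefficients
    alpha = (0.1, 0.4)    # hypothetical coefficients
    for x in (0.0, 1.0, 2.0):
        print(x, latent_class_mean_ratio(x, beta, gamma), direct_mean_ratio(x, alpha))
```

In this sketch the latent class ratio changes with x whenever the covariate also enters the susceptibility model, while the direct model's ratio is exp(alpha1) at every x, which is the kind of single-number interpretation the overall mean formulation is meant to provide.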