Li Chin-Shang, Lu Minggen
School of Nursing, The State University of New York, University at Buffalo, Buffalo, NY, USA.
School of Community Health Sciences, University of Nevada, Reno, NV, USA.
J Appl Stat. 2021 May 6;49(11):2845-2869. doi: 10.1080/02664763.2021.1925228. eCollection 2022.
When the observed proportion of zeros in a binary outcome data set is larger than expected under a standard logistic regression model, a zero-inflated Bernoulli (ZIB) regression model is frequently suggested. A spline-based ZIB regression model is proposed to describe the potentially nonlinear effect of a continuous covariate, with a spline used to approximate the unknown smooth function. Under the smoothness condition, the spline estimator of the unknown smooth function is uniformly consistent, and the regression parameter estimators are asymptotically normally distributed. We propose an easily implemented and consistent method for estimating the variances of the regression parameter estimators. Extensive simulations are conducted to investigate the finite-sample performance of the proposed method. A real-life data set is used to illustrate the practical use of the proposed methodology. In this real-life data analysis, the proposed semiparametric ZIB regression model predicts better than the parametric ZIB regression model.
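To make the model structure concrete, the following is a minimal sketch of a ZIB log-likelihood with a spline term, not the authors' implementation. It assumes one common parameterization, P(Y = 1) = (1 - omega) * expit(beta0 + z'beta + B(x)'gamma), with a constant zero-inflation probability `omega` and a truncated power basis `B(x)`. The abstract does not specify the basis, the knot placement, or whether the zero-inflation probability depends on covariates, so all function and parameter names here are hypothetical.

```python
import math


def expit(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def truncated_power_basis(x, knots, degree=3):
    """Basis values for scalar x: x, ..., x^degree, then (x - k)_+^degree per knot.

    Assumption: a truncated power basis stands in for whatever spline
    basis the paper actually uses.
    """
    poly = [x ** d for d in range(1, degree + 1)]
    trunc = [max(x - k, 0.0) ** degree for k in knots]
    return poly + trunc


def zib_prob_one(x, z, beta0, beta, gamma, omega, knots, degree=3):
    """P(Y = 1) under the assumed ZIB form with a constant inflation probability omega."""
    basis = truncated_power_basis(x, knots, degree)
    eta = beta0
    eta += sum(b * zj for b, zj in zip(beta, z))        # parametric (linear) part
    eta += sum(g * bj for g, bj in zip(gamma, basis))   # spline approximation of the smooth function
    return (1.0 - omega) * expit(eta)


def zib_loglik(ys, xs, zs, beta0, beta, gamma, omega, knots, degree=3):
    """Log-likelihood of binary outcomes ys given covariates xs (smooth) and zs (linear)."""
    eps = 1e-12  # guard against log(0) for extreme linear predictors
    total = 0.0
    for y, x, z in zip(ys, xs, zs):
        p1 = zib_prob_one(x, z, beta0, beta, gamma, omega, knots, degree)
        p1 = min(max(p1, eps), 1.0 - eps)
        total += math.log(p1) if y == 1 else math.log(1.0 - p1)
    return total
```

In this form, an observed zero can come either from the inflation component (probability `omega`) or from the Bernoulli component, which is why P(Y = 0) = omega + (1 - omega) * (1 - expit(eta)). The log-likelihood could be passed to a generic numerical optimizer over `beta0`, `beta`, `gamma`, and `omega`; the paper's own estimation and variance procedures are not reproduced here.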