Song Kai-Sheng
Department of Mathematics, The University of North Texas, Denton, TX, USA.
J Appl Stat. 2020 May 26;48(9):1603-1627. doi: 10.1080/02664763.2020.1769577. eCollection 2021.
We propose zero-inflated statistical models based on the generalized Hermite distribution for simultaneously modelling excess zeros, over/underdispersion, and multimodality. These new models are parsimonious yet flexible, allowing covariates to be introduced directly through the mean, dispersion, and zero-inflation parameters. To accommodate the interval inequality constraint on the dispersion parameter, we present a new link function for the covariate-dependent dispersion regression model. We derive score tests for zero inflation in both covariate-free and covariate-dependent models. We use both the score test and the likelihood-ratio test to assess whether zero inflation is present; the score test is a useful alternative when the likelihood-ratio statistic is difficult to compute. We analyse several hotel booking cancellation datasets extracted from two recently published real datasets, one from a resort hotel and one from a city hotel. These extracted cancellation datasets exhibit excess zeros, over/underdispersion, and multimodality simultaneously, which makes them difficult to analyse with existing approaches. Applying the proposed methods to the cancellation datasets illustrates the usefulness and flexibility of the models.
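As a rough illustration of the kind of model described above, the following Python sketch evaluates a zero-inflated generalized Hermite probability mass function. It is not the paper's implementation. It assumes the standard representation of a generalized Hermite variable of order m as X1 + m*X2, with X1 and X2 independent Poisson variables, and the parameter names `a`, `b`, `m`, and `pi0` are illustrative choices rather than the paper's notation. The paper works with mean and dispersion parameters and a new link function, neither of which is reproduced here.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability P(X = k) for rate lam >= 0, computed on the log scale."""
    if k < 0:
        return 0.0
    if lam == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def gen_hermite_pmf(k, a, b, m):
    """P(Y = k) for Y = X1 + m * X2, X1 ~ Poisson(a), X2 ~ Poisson(b) (assumed form)."""
    # Sum over the number j of "jumps of size m" contributed by X2.
    return sum(
        poisson_pmf(k - m * j, a) * poisson_pmf(j, b)
        for j in range(k // m + 1)
    )


def zi_gen_hermite_pmf(k, a, b, m, pi0):
    """Zero-inflated version: extra point mass pi0 at zero, weight 1 - pi0 on the base pmf."""
    base = gen_hermite_pmf(k, a, b, m)
    return (pi0 if k == 0 else 0.0) + (1.0 - pi0) * base


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    a, b, m, pi0 = 1.0, 0.5, 3, 0.2
    for k in range(8):
        print(k, round(zi_gen_hermite_pmf(k, a, b, m, pi0), 4))
```

Under this assumed representation the base distribution has mean a + m*b and variance a + m^2*b, so its variance-to-mean ratio lies between 1 and m. This is one way to see why a dispersion parameter would be confined to an interval, although the exact constraint and link function used in the paper are not stated in the abstract.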