Department of Statistics, Federal University of Technology - Paraná, Curitiba, Brazil.
Department of Applied Mathematics and Statistics, Institute of Mathematical and Computer Sciences, University of São Paulo, São Carlos, Brazil.
Biom J. 2021 Jan;63(1):81-104. doi: 10.1002/bimj.202000046. Epub 2020 Oct 19.
Count data sets are traditionally analyzed using the ordinary Poisson distribution. However, this model can be too restrictive for certain data structures, which limits its applicability. The need therefore arises for alternative models that accommodate, for example, (a) zero-modification (inflation or deflation in the frequency of zeros), (b) overdispersion, and (c) individual heterogeneity arising from clustering or repeated (correlated) measurements made on the same subject. Cases (a)-(b) and (b)-(c) are often treated together in the statistical literature, with several practical applications, but models supporting all three at once are less common. Hence, the primary goal of this paper was to address these issues jointly by deriving a mixed-effects regression model based on the hurdle version of the Poisson-Lindley distribution. In this framework, zero-modification is incorporated by assuming that a binary probability model determines which outcomes are zero-valued, while a zero-truncated process generates the positive observations. Approximate posterior inferences for the model parameters were obtained from a fully Bayesian approach based on the Adaptive Metropolis algorithm. Intensive Monte Carlo simulation studies were performed to assess the empirical properties of the Bayesian estimators. The proposed model was applied to a real data set, and its competitiveness against some well-established mixed-effects models for count data was evaluated. A sensitivity analysis to detect observations that may impact parameter estimates was performed based on standard divergence measures. The Bayesian p-value and the randomized quantile residuals were considered for model diagnostics.
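The hurdle construction described in the abstract can be illustrated with a minimal Python sketch. It assumes the standard one-parameter Poisson-Lindley probability mass function, P(Y = y) = θ²(y + θ + 2)/(θ + 1)^(y+3), and a fixed zero probability. The names `poisson_lindley_pmf`, `hurdle_pl_pmf`, `theta`, and `p_zero` are illustrative and not taken from the paper, whose parameterization, regression structure, and random effects may differ.

```python
def poisson_lindley_pmf(y, theta):
    """Poisson-Lindley pmf at count y >= 0, for theta > 0 (standard form, assumed here)."""
    return theta**2 * (y + theta + 2) / (theta + 1) ** (y + 3)


def hurdle_pl_pmf(y, theta, p_zero):
    """Hurdle Poisson-Lindley pmf.

    p_zero is the probability of a zero from the binary part (hypothetical name).
    Positive counts follow the zero-truncated Poisson-Lindley distribution,
    scaled by the probability of crossing the hurdle.
    """
    if y == 0:
        return p_zero
    p0 = poisson_lindley_pmf(0, theta)  # zero mass of the untruncated distribution
    return (1.0 - p_zero) * poisson_lindley_pmf(y, theta) / (1.0 - p0)


if __name__ == "__main__":
    theta, p_zero = 1.5, 0.3
    # The pmf should sum to (approximately) one over a long enough range.
    total = sum(hurdle_pl_pmf(y, theta, p_zero) for y in range(200))
    print(round(total, 6))
```

In this sketch, a `p_zero` above the untruncated zero mass `poisson_lindley_pmf(0, theta)` corresponds to zero inflation, and a value below it to zero deflation. In a regression setting both parts would typically be linked to covariates and random effects, which is omitted here.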