Zhan Dongying, Young Derek S
Dr. Bing Zhang Department of Statistics, University of Kentucky, 725 Rose Street, Lexington, KY 40536-0082 USA.
Stat Pap (Berl). 2023 May 19:1-24. doi: 10.1007/s00362-023-01452-x.
For modeling count data, the Conway-Maxwell-Poisson (CMP) distribution is a popular generalization of the Poisson distribution due to its ability to characterize over- or under-dispersion in the data. While the classic parameterization of the CMP has been well studied, its main drawback is that it does not directly model the mean of the counts. This is mitigated by using a mean-parameterized version of the CMP distribution. In this work, we are concerned with the setting where count data may be composed of subpopulations, each possibly having a different degree of dispersion. Thus, we propose a finite mixture of mean-parameterized CMP distributions. An EM algorithm is constructed to perform maximum likelihood estimation of the model, while bootstrapping is employed to obtain estimated standard errors. A simulation study is used to demonstrate the flexibility of the proposed mixture model relative to mixtures of Poissons and mixtures of negative binomials. An analysis of dog mortality data is presented.
The online version contains supplementary material available at 10.1007/s00362-023-01452-x.
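To make the model in the abstract concrete, the following is a minimal sketch of a finite mixture of mean-parameterized CMP distributions. It is not the authors' implementation. It assumes the standard CMP form, with probability of count y proportional to lam^y / (y!)^nu, and assumes that the mean parameterization is obtained by solving numerically for the rate lam that yields a requested mean mu. The function names, the truncation point `max_count`, and the bisection solver are all illustrative choices.

```python
import math


def cmp_pmf(lam, nu, max_count=200):
    """CMP probabilities for y = 0..max_count under the classic (lam, nu) form.

    The infinite normalizing sum is truncated at max_count (an assumption);
    this is adequate only when the mass beyond max_count is negligible.
    """
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1)
             for y in range(max_count + 1)]
    shift = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def cmp_mean(lam, nu, max_count=200):
    """Mean of the (truncated) CMP distribution."""
    return sum(y * p for y, p in enumerate(cmp_pmf(lam, nu, max_count)))


def rate_for_mean(mu, nu, max_count=200, tol=1e-10):
    """Find lam such that the CMP mean equals mu, by bisection on log(lam).

    Relies on the mean being increasing in lam for fixed nu.
    """
    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cmp_mean(math.exp(mid), nu, max_count) < mu:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return math.exp(0.5 * (lo + hi))


def mean_cmp_pmf(mu, nu, max_count=200):
    """CMP probabilities indexed by mean mu and dispersion nu."""
    return cmp_pmf(rate_for_mean(mu, nu, max_count), nu, max_count)


def mixture_pmf(weights, means, dispersions, max_count=200):
    """Pmf of a finite mixture of mean-parameterized CMP components."""
    comps = [mean_cmp_pmf(mu, nu, max_count)
             for mu, nu in zip(means, dispersions)]
    return [sum(w * c[y] for w, c in zip(weights, comps))
            for y in range(max_count + 1)]


# Hypothetical two-component example: one over-dispersed component (nu < 1)
# and one under-dispersed component (nu > 1).
pmf = mixture_pmf([0.4, 0.6], [2.0, 10.0], [0.8, 1.5])
mixture_mean = sum(y * p for y, p in enumerate(pmf))
```

With nu = 1 each component reduces to a Poisson distribution, so the sketch also covers the Poisson mixture as a special case. The mixture mean is the weighted average of the component means, which is what makes the mean parameterization convenient to interpret.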