Viwatwongkasem Chukiat, Kuhnert Ronny, Satitvipawee Pratana
Department of Biostatistics, Faculty of Public Health, Mahidol University, 420/1 Rajvithi Road, Phayathai District, Bangkok, Thailand.
Biom J. 2008 Dec;50(6):1006-21. doi: 10.1002/bimj.200810484.
The purpose of this study is to estimate population size from a count capture-recapture experiment using the Horvitz-Thompson approach, both under a homogeneous truncated count model and under model contamination. The proposed estimator is based on a mixture of zero-truncated Poisson distributions. The mixture model accommodates long-tailed or skewed count distributions, and its likelihood function is concave, so strong results are available for the nonparametric maximum likelihood estimator (NPMLE). A simulation study compares the proposed estimator with McKendrick's, Mantel-Haenszel's, Zelterman's, Chao's, and the maximum likelihood estimators. Under model contamination, the proposed estimator has the smallest bias and the smallest mean square error when the population size is sufficiently large, and further results show that it also performs well in the homogeneous case. Two empirical examples, the cholera epidemic in India (homogeneous case) and the heroin user data from Bangkok in 2002 (heterogeneous case), are fitted with excellent goodness-of-fit, and the accompanying confidence interval estimates may also be of considerable interest.
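As a rough illustration of the homogeneous case only, the sketch below computes three of the comparison estimators named above from a table of capture frequencies f_k (the number of units observed exactly k times): Chao's lower bound, Zelterman's estimator, and the Horvitz-Thompson estimator n / (1 - exp(-lambda)) with lambda fitted by maximum likelihood under a single zero-truncated Poisson. It does not implement the mixture-based NPMLE estimator proposed in the paper. The function names and the frequency table are hypothetical and are not taken from the article.

```python
import math


def chao(freq):
    """Chao's lower bound: n + f1^2 / (2 * f2)."""
    n = sum(freq.values())
    f1, f2 = freq.get(1, 0), freq.get(2, 0)
    if f2 == 0:
        raise ValueError("Chao's estimator needs f2 > 0")
    return n + f1 * f1 / (2.0 * f2)


def zelterman(freq):
    """Zelterman's estimator: lambda = 2 * f2 / f1, then n / (1 - exp(-lambda))."""
    n = sum(freq.values())
    f1, f2 = freq.get(1, 0), freq.get(2, 0)
    if f1 == 0 or f2 == 0:
        raise ValueError("Zelterman's estimator needs f1 > 0 and f2 > 0")
    lam = 2.0 * f2 / f1
    return n / (1.0 - math.exp(-lam))


def ztp_mle_lambda(freq, tol=1e-12):
    """MLE of lambda for a zero-truncated Poisson.

    Solves lambda / (1 - exp(-lambda)) = mean observed count by bisection.
    The left-hand side is increasing in lambda and always exceeds lambda,
    so the root lies in (0, mean) whenever mean > 1.
    """
    n = sum(freq.values())
    mean = sum(k * fk for k, fk in freq.items()) / n
    if mean <= 1.0:
        raise ValueError("mean observed count must exceed 1")
    lo, hi = 1e-12, mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / (1.0 - math.exp(-mid)) < mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def horvitz_thompson_ztp(freq):
    """Horvitz-Thompson estimate n / (1 - p0), with p0 = exp(-lambda_hat)."""
    n = sum(freq.values())
    lam = ztp_mle_lambda(freq)
    return n / (1.0 - math.exp(-lam))


# Hypothetical frequency table (illustrative only, not data from the paper):
# 120 units seen once, 40 seen twice, 12 seen three times, 3 seen four times.
example = {1: 120, 2: 40, 3: 12, 4: 3}
print("Chao:", chao(example))
print("Zelterman:", zelterman(example))
print("Horvitz-Thompson (ZTP MLE):", horvitz_thompson_ztp(example))
```

Chao's and Zelterman's estimators use only f1 and f2, which is why they are often regarded as more robust to heterogeneity in the larger counts, whereas the maximum likelihood version uses the whole frequency table and relies on the homogeneous Poisson assumption.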