Chao Anne, Bunge John
Institute of Statistics, National Tsing Hua University, Hsin-Chu, Taiwan.
Biometrics. 2002 Sep;58(3):531-9. doi: 10.1111/j.0006-341x.2002.00531.x.
Consider a stochastic abundance model in which the species arrive in the sample according to independent Poisson processes, where the abundance parameters of the processes follow a gamma distribution. We propose a new estimator of the number of species for this model. The estimator takes the form of the number of duplicated species (i.e., species represented by two or more individuals) divided by an estimated duplication fraction. The duplication fraction is estimated from all frequencies including singleton information. The new estimator is closely related to the sample coverage estimator presented by Chao and Lee (1992, Journal of the American Statistical Association 87, 210-217). We illustrate the procedure using the Malayan butterfly data discussed by Fisher, Corbet, and Williams (1943, Journal of Animal Ecology 12, 42-58) and a 1989 Christmas Bird Count dataset collected in Florida, U.S.A. Simulation studies show that this estimator compares well with maximum likelihood estimators (i.e., empirical Bayes estimators from the Bayesian viewpoint), which require an iterative numerical procedure that may be infeasible.
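The form described above (duplicated species divided by an estimated duplication fraction) can be illustrated with a minimal Python sketch. The abstract does not state how the duplication fraction is estimated, so the expression used here, 1 - f1 * sum(k^2 * f_k) / (sum(k * f_k))^2, is an assumption: it is one moment-based choice that converges to P(X >= 2) under a gamma-mixed Poisson model, and it may differ in detail from the published estimator. The function name `duplication_estimate`, the error handling, and the toy counts are hypothetical and are not taken from the paper or its datasets.

```python
from collections import Counter


def duplication_estimate(abundances):
    """Estimate the number of species from per-species sample counts.

    Assumed form (not confirmed by the abstract):
        N_hat = (number of species seen >= 2 times) / theta_hat
        theta_hat = 1 - f1 * sum(k^2 f_k) / (sum(k f_k))^2
    where f_k is the number of species observed exactly k times.
    """
    # Keep only species that were actually observed.
    counts = [n for n in abundances if n > 0]
    if not counts:
        raise ValueError("no observed species")

    # freq[k] = number of species represented by exactly k individuals.
    freq = Counter(counts)
    f1 = freq.get(1, 0)  # singletons
    duplicated = sum(f for k, f in freq.items() if k >= 2)

    total = sum(k * f for k, f in freq.items())         # sum of k * f_k
    total_sq = sum(k * k * f for k, f in freq.items())  # sum of k^2 * f_k

    # Estimated duplication fraction; uses every frequency, singletons included.
    theta_hat = 1.0 - f1 * total_sq / total**2
    if theta_hat <= 0:
        # Hypothetical guard: very skewed counts can push the fraction
        # to zero or below, in which case this sketch gives no estimate.
        raise ValueError("estimated duplication fraction is not positive")

    return duplicated / theta_hat


if __name__ == "__main__":
    # Made-up toy counts for eight observed species (not real survey data).
    toy_counts = [1, 1, 1, 2, 2, 3, 5, 8]
    print(round(duplication_estimate(toy_counts), 3))
```

With no singletons in the sample, the estimated duplication fraction in this sketch equals 1 and the estimate reduces to the number of observed species; more singletons lower the fraction and raise the estimate.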