Department of Wildlife, Fish, and Conservation Biology, University of California Davis, 1088 Academic Surge, One Shields Ave, Davis, California, 95616, USA.
Southeast Climate Adaptation Science Center, U.S. Geological Survey, North Carolina State University, 127 David Clark Labs, Campus Box 7617, Raleigh, North Carolina, 27695, USA.
Ecol Appl. 2021 Mar;31(2):e02249. doi: 10.1002/eap.2249. Epub 2021 Jan 6.
Community occupancy models estimate species-specific parameters while sharing information across species by treating parameters as sampled from a common distribution. When communities consist of discrete groups, shrinkage of estimates toward the community mean can mask differences among groups. Infinite-mixture models using a Dirichlet process (DP) distribution, in which the number of latent groups is estimated from the data, have been proposed as a solution. In addition to community structure, these models estimate species similarity, which allows testing hypotheses about whether traits drive species response to environmental conditions. We developed a community occupancy model (COM) using a DP distribution to model species-level parameters. Because clustering algorithms are sensitive to dimensionality and distinctiveness of clusters, we conducted a simulation study to explore performance of the DP-COM with different dimensions (i.e., different numbers of model parameters with species-level DP random effects) and under varying cluster differences. Because the DP-COM is computationally expensive, we compared its estimates to those of a COM with a normal random species effect. We further applied the DP-COM to a bird data set from Uganda. Estimates of the number of clusters and species cluster identity improved with increasing difference among clusters and increasing dimensions of the DP, but the number of clusters was always overestimated. Estimates of number of sites occupied and of species-level and community-level covariate coefficients on occupancy probability were generally unbiased, with nominal or near-nominal 95% Bayesian credible interval coverage. Accuracy of estimates from the normal COM and the DP-COM was similar. The DP-COM grouped 166 bird species into 27 clusters based on their affiliation with open or woodland habitat and distance to oil wells. Estimates of covariate coefficients were similar between the normal COM and the DP-COM. Except for sunbirds, species within a family were not more similar in their response to these covariates than the overall community. Given that estimates were consistent between the normal COM and the DP-COM, and considering the computational burden of the DP models, we recommend using the DP-COM only when the analysis focuses on community structure and species similarity, as these quantities can only be obtained under the DP-COM.
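To illustrate how a DP distribution can group species, the following minimal Python sketch simulates detection histories for a community in which species-level occupancy intercepts are drawn from a truncated stick-breaking approximation to a DP. It is not the authors' model: the truncation level, the normal base distribution, the constant detection probability, the absence of covariates, and all function and parameter names are assumptions made for illustration only.

```python
import math
import random


def stick_breaking_weights(alpha, n_components, rng):
    """Truncated stick-breaking weights for a DP with concentration alpha."""
    weights, remaining = [], 1.0
    for _ in range(n_components - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # last component takes whatever is left
    return weights


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def simulate_community(n_species=30, n_sites=50, n_visits=4, alpha=1.0,
                       n_components=10, p_detect=0.4, seed=1):
    """Simulate species x site x visit detections (all settings hypothetical)."""
    rng = random.Random(seed)
    weights = stick_breaking_weights(alpha, n_components, rng)
    # Cluster-level occupancy intercepts on the logit scale (assumed N(0, 1.5) base)
    atoms = [rng.gauss(0.0, 1.5) for _ in range(n_components)]
    # Each species is assigned to one latent cluster and shares its intercept
    cluster = rng.choices(range(n_components), weights=weights, k=n_species)
    detections = []
    for i in range(n_species):
        psi = inv_logit(atoms[cluster[i]])  # occupancy probability of species i
        species_rows = []
        for _ in range(n_sites):
            z = 1 if rng.random() < psi else 0  # latent occupancy state
            # Detection is only possible at occupied sites
            species_rows.append(
                [1 if z == 1 and rng.random() < p_detect else 0
                 for _ in range(n_visits)]
            )
        detections.append(species_rows)
    return weights, cluster, detections


if __name__ == "__main__":
    weights, cluster, detections = simulate_community()
    print("occupied clusters:", len(set(cluster)))
```

Species that share a cluster share the same intercept, so the number of distinct values in `cluster` corresponds to the number of latent groups that a DP-based model would attempt to recover from the detection data.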