Department of Environmental Science, Policy, and Management, University of California, Berkeley, Berkeley, CA, 94720, USA.
Sci Rep. 2022 Jul 19;12(1):12276. doi: 10.1038/s41598-022-16368-z.
To analyze species count data when detection is imperfect, ecologists need models that estimate relative abundance in the presence of unknown sources of heterogeneity. Two candidate models are generalized linear mixed models (GLMMs) and hierarchical N-mixture models. GLMMs are computationally robust but do not explicitly separate detection from abundance patterns. N-mixture models estimate detection and abundance separately via a latent state, but they are sensitive to violations of their assumptions and subject to practical estimation issues. When one can assume that detection is not systematically confounded with ecological patterns of interest, these two models can be viewed as sharing a heuristic framework for relative abundance estimation. Model selection, for example by AIC, can then determine which model predicts observed counts best. We compared four N-mixture model variants and two GLMM variants for predicting bird counts in local subsets of a citizen science dataset, eBird, based on model selection and goodness-of-fit measures. We found that both GLMMs and N-mixture models (especially N-mixture models with beta-binomial detection submodels) were supported in a moderate number of datasets, suggesting that both tools are useful and that relative fit is context-dependent. We provide faster software implementations of N-mixture likelihood calculations and a reparameterization to interpret unstable estimates for N-mixture models.
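To make the latent-state idea concrete, the following is a minimal sketch of the simplest N-mixture likelihood for a single site, assuming Poisson abundance and binomial detection. It is not the authors' software: the function names, the truncation bound `n_max`, and the Poisson-binomial form are illustrative assumptions, and the beta-binomial variants mentioned in the abstract are not shown.

```python
import math


def log_poisson_pmf(n, lam):
    """Log of the Poisson(lam) probability of latent abundance n."""
    return n * math.log(lam) - lam - math.lgamma(n + 1)


def log_binom_pmf(y, n, p):
    """Log of the Binomial(n, p) probability of observing count y."""
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_choose + y * math.log(p) + (n - y) * math.log1p(-p)


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def nmixture_site_loglik(counts, lam, p, n_max=200):
    """Log-likelihood of replicate counts at one site.

    The latent abundance N is summed out from max(counts) up to n_max
    (a hypothetical truncation bound; it must be large relative to lam).
    Requires lam > 0 and 0 < p < 1.
    """
    lower = max(counts)
    if lower > n_max:
        raise ValueError("n_max must be at least the largest observed count")
    terms = []
    for n in range(lower, n_max + 1):
        log_term = log_poisson_pmf(n, lam)
        log_term += sum(log_binom_pmf(y, n, p) for y in counts)
        terms.append(log_term)
    return logsumexp(terms)


def aic(loglik, n_params):
    """Akaike information criterion; lower values indicate better support."""
    return 2 * n_params - 2 * loglik
```

With a single visit, this likelihood reduces to a Poisson with mean `lam * p`, so abundance and detection cannot be separated without replicate counts; this is one way to see why the two parameters can be hard to estimate in practice. Summing `nmixture_site_loglik` over sites and passing the total to `aic` gives a value that could be compared against the AIC of a GLMM fitted to the same counts.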