U.S. Geological Survey Patuxent Wildlife Research Center, Laurel, Maryland, 20708, USA.
Ecol Appl. 2020 Sep;30(6):e02137. doi: 10.1002/eap.2137. Epub 2020 Jun 15.
The North American Breeding Bird Survey (BBS) provides data that can be used in complex, multiscale analyses of population change, while controlling for scale-specific nuisance factors. Many alternative models can be fit to the data, but most model selection procedures are not appropriate for hierarchical models. Leave-one-out cross-validation (LOOCV), in which relative model fit is assessed by omitting an observation and evaluating how well a model fit to the remaining data predicts it, provides a reasonable approach for assessing models, but it is time-consuming and not feasible to apply to every observation in large data sets. We report the first large-scale formal model selection for BBS data, applying LOOCV to stratified random samples of observations from BBS data. Our results are for 548 species of North American birds, comparing the fit of four alternative models that differ in year effect structures and in descriptions of extra-Poisson overdispersion. We use a hierarchical model among species to evaluate posterior probabilities that models are best for individual species. Models in which differences in year effects are conditionally independent (D models) were generally favored over models in which year effects are modeled by a slope parameter and a random year effect (S models), and models in which extra-Poisson overdispersion effects are independent and t-distributed (H models) tended to be favored over models in which overdispersion effects are independent and normally distributed. Our conclusions lead us to recommend a change from the conventional S model to D and H models for the vast majority of species (544/548). Comparison of estimated population trends based on the favored model relative to the S model currently used for BBS summaries indicates no consistent differences in estimated trends. Of the 18 species that showed large differences in estimated trends between models, estimated trends from the default S model were more extreme, reflecting the influence of the slope parameter in that model for species that are undergoing large population changes. WAIC, a computationally simpler alternative to LOOCV, does not appear to be a reliable alternative to LOOCV.
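The LOOCV procedure described above, applied to a stratified random sample of held-out observations, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two toy models (one pooled Poisson mean versus one mean per stratum, both fit by plug-in maximum likelihood), the made-up counts, and every function name are assumptions chosen for illustration, and the hierarchical Bayesian S, D, and H models of the paper are not reproduced.

```python
import math
import random
from collections import defaultdict

def poisson_logpmf(y, mu):
    """Log probability of count y under a Poisson distribution with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def fit_pooled(train):
    """Toy model A (assumption): one shared mean for all strata."""
    mean = sum(y for _, y in train) / len(train)
    return lambda stratum: mean

def fit_by_stratum(train):
    """Toy model B (assumption): a separate mean for each stratum."""
    totals, counts = defaultdict(float), defaultdict(int)
    for s, y in train:
        totals[s] += y
        counts[s] += 1
    pooled = sum(totals.values()) / sum(counts.values())
    means = {s: totals[s] / counts[s] for s in totals}
    # Fall back to the pooled mean if a stratum is absent from the training data.
    return lambda stratum: means.get(stratum, pooled)

def stratified_sample(data, per_stratum, rng):
    """Choose up to per_stratum held-out indices at random within each stratum."""
    by_stratum = defaultdict(list)
    for i, (s, _) in enumerate(data):
        by_stratum[s].append(i)
    chosen = []
    for s in sorted(by_stratum):
        idx = by_stratum[s]
        chosen.extend(rng.sample(idx, min(per_stratum, len(idx))))
    return chosen

def loocv_score(data, fit, held_out):
    """Sum of log predictive densities, refitting once per held-out observation."""
    total = 0.0
    for i in held_out:
        train = data[:i] + data[i + 1:]       # omit exactly one observation
        predict = fit(train)                  # refit on the remainder
        s, y = data[i]
        total += poisson_logpmf(y, max(predict(s), 1e-9))  # floor avoids log(0)
    return total

# Made-up counts for two strata; these are not BBS data.
data = [("A", 1), ("A", 2), ("A", 3), ("A", 2),
        ("B", 18), ("B", 22), ("B", 20), ("B", 19)]

rng = random.Random(0)
held_out = stratified_sample(data, per_stratum=2, rng=rng)
score_pooled = loocv_score(data, fit_pooled, held_out)
score_stratum = loocv_score(data, fit_by_stratum, held_out)
print("pooled:", score_pooled, "per-stratum:", score_stratum)
```

A higher summed log predictive density indicates better out-of-sample prediction on the sampled observations. Scoring only a stratified sample keeps the number of refits manageable, which is the cost concern the abstract raises about applying LOOCV to every observation.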