Nicolas Gaëlle, Robinson Timothy P, Wint G R William, Conchedda Giulia, Cinardi Giuseppina, Gilbert Marius
Biological Control and Spatial Ecology, Université Libre de Bruxelles, Brussels, Belgium.
Fonds National de la Recherche Scientifique, Brussels, Belgium.
PLoS One. 2016 Mar 15;11(3):e0150424. doi: 10.1371/journal.pone.0150424. eCollection 2016.
Large-scale, high-resolution global data on farm animal distributions are essential for spatially explicit assessments of the epidemiological, environmental and socio-economic impacts of the livestock sector. This has been the major motivation behind the development of the Gridded Livestock of the World (GLW) database, which has been used extensively since its first publication in 2007. The database relies on a downscaling methodology whereby census counts of animals in sub-national administrative units are redistributed to grid cells as a function of a series of spatial covariates. The recent upgrade of GLW1 to GLW2 involved automating the processing, improving the input data, and downscaling at a spatial resolution of 1 km per cell (5 km per cell in the earlier version). The underlying statistical methodology, however, remained unchanged. In this paper, we evaluate new methods to downscale census data with higher accuracy and increased processing efficiency. Two main factors were evaluated, based on sample census datasets of cattle in Africa and chickens in Asia. First, we implemented and evaluated Random Forest (RF) models instead of stratified regressions. Second, we investigated whether models that predict the number of animals per rural person (per capita) could provide better downscaled estimates than the previous approach, which predicted absolute densities (animals per km²). RF models consistently provided better predictions than the stratified regressions for both continents and species. The benefit of per capita over absolute density models varied according to the species and continent. In addition, different technical options were evaluated to reduce the processing time while maintaining the models' predictive power. Future GLW runs (GLW 3.0) will apply the new RF methodology with optimized modelling options. The potential benefit of per capita models will need to be further investigated with a better distinction between rural and agricultural populations.
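The redistribution step described above can be illustrated with a minimal sketch. It assumes, as the abstract does not spell out, that model predictions serve as per-cell weights and that each administrative unit's census count is split in proportion to those weights, so that the unit total is preserved. The function names and all numbers are hypothetical; the prediction lists stand in for the output of a fitted model such as an RF.

```python
def redistribute(census_total, weights):
    """Split one administrative unit's census count across its grid cells
    in proportion to non-negative per-cell weights (assumed approach)."""
    total_weight = sum(weights)
    if total_weight == 0:
        # No signal from the model: fall back to an even split.
        return [census_total / len(weights)] * len(weights)
    return [census_total * w / total_weight for w in weights]


def density_weights(predicted_density, cell_area_km2):
    """Absolute-density variant: predicted animals per km2 times cell area."""
    return [d * a for d, a in zip(predicted_density, cell_area_km2)]


def per_capita_weights(predicted_per_capita, rural_population):
    """Per capita variant: predicted animals per rural person times rural population."""
    return [r * p for r, p in zip(predicted_per_capita, rural_population)]


# Hypothetical administrative unit: 1000 cattle in the census, four 1 km2 cells.
census_total = 1000

# Hypothetical model outputs (animals per km2) for the four cells.
abs_cells = redistribute(
    census_total,
    density_weights([10.0, 30.0, 40.0, 20.0], [1.0, 1.0, 1.0, 1.0]),
)

# Hypothetical model outputs (animals per rural person) and rural population per cell.
pc_cells = redistribute(
    census_total,
    per_capita_weights([0.5, 0.5, 1.0, 2.0], [100, 300, 50, 0]),
)

print(abs_cells)
print(pc_cells)
```

In this sketch both variants return cell counts that sum to the census total. The per capita variant assigns no animals to a cell with no rural population, whatever the predicted ratio, which is one way the two approaches can diverge.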