Office of Computational and Information Sciences and Technology, NASA Goddard Space Flight Center, Greenbelt, Maryland, United States of America.
PLoS One. 2021 Mar 3;16(3):e0237208. doi: 10.1371/journal.pone.0237208. eCollection 2021.
MaxEnt is an important aid in understanding the influence of climate change on species distributions. There is growing interest in using IPCC-class global climate model outputs as environmental predictors in this work. These models provide realistic, global representations of the climate system and projections for hundreds of variables (including Essential Climate Variables), and they combine observations from an array of satellite, airborne, and in-situ sensors. Unfortunately, direct use of this important class of data in MaxEnt modeling has been limited by the large size of climate model output collections and by the fact that MaxEnt can only operate on a relatively small set of predictors stored in a computer's main memory. In this study, we demonstrate the feasibility of a Monte Carlo method that overcomes this limitation by finding, in a reasonable amount of time, a useful subset of predictors in a larger, externally stored collection of environmental variables. Our proposed solution takes an ensemble approach wherein many MaxEnt runs, each drawing on a small random subset of variables, converge on a global estimate of the top contributing subset of variables in the larger collection. In preliminary tests, the Monte Carlo approach selected a consistent set of top six variables within 540 runs, with the four most contributory of those six accounting for approximately 93% of overall permutation importance in a final model. These results suggest that a Monte Carlo approach could offer a viable means of screening environmental predictors prior to final model construction, one that is amenable to parallelization and scalable to very large data sets. This points to the possibility of near-real-time multiprocessor implementations that could enable broader and more exploratory use of global climate model outputs in environmental niche modeling and aid in the discovery of viable predictors.
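The ensemble idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`monte_carlo_screen`, `toy_score`), the subset size of six per run, and the averaging of per-run importance are all assumptions made for illustration, and the toy scorer stands in for an actual MaxEnt run that would report permutation importance. Only the run count of 540 and the top-six target come from the abstract.

```python
import random
from collections import defaultdict


def monte_carlo_screen(variables, score_subset, subset_size=6,
                       n_runs=540, top_k=6, seed=0):
    """Rank variables by mean importance over many small random-subset runs.

    score_subset is a hypothetical callable: it takes a list of variable
    names and returns {name: importance}. In a real workflow it would wrap
    one MaxEnt run on just those predictors.
    """
    rng = random.Random(seed)
    totals = defaultdict(float)   # summed importance per variable
    counts = defaultdict(int)     # number of runs each variable appeared in
    for _ in range(n_runs):
        subset = rng.sample(variables, subset_size)
        for name, importance in score_subset(subset).items():
            totals[name] += importance
            counts[name] += 1
    mean_importance = {v: totals[v] / counts[v] for v in totals}
    ranked = sorted(mean_importance, key=mean_importance.get, reverse=True)
    return ranked[:top_k]


# Toy stand-in for a MaxEnt run (assumption): six variables carry signal,
# the rest carry almost none. Importance is each variable's share of the
# total weight within the sampled subset.
VARIABLES = [f"var{i}" for i in range(50)]
TRUE_WEIGHT = {v: 0.1 for v in VARIABLES}
TRUE_WEIGHT.update({"var0": 10, "var1": 9, "var2": 8,
                    "var3": 7, "var4": 6, "var5": 5})


def toy_score(subset):
    total = sum(TRUE_WEIGHT[v] for v in subset)
    return {v: TRUE_WEIGHT[v] / total for v in subset}


selected = monte_carlo_screen(VARIABLES, toy_score)
print(sorted(selected))
```

Because each run touches only a handful of variables, the runs are independent of one another and could be distributed across processors, which is the property the abstract highlights when it describes the method as amenable to parallelization.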