The effects of sampling bias and model complexity on the predictive performance of MaxEnt species distribution models.

Author information

Forest Ecology and Conservation Group, Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom.

Publication information

PLoS One. 2013;8(2):e55158. doi: 10.1371/journal.pone.0055158. Epub 2013 Feb 14.

Abstract

Species distribution models (SDMs) trained on presence-only data are frequently used in ecological research and conservation planning. However, users of SDM software are faced with a variety of options, and it is not always obvious how selecting one option over another will affect model performance. Working with MaxEnt software and with tree fern presence data from New Zealand, we assessed whether (a) choosing to correct for geographical sampling bias and (b) using complex environmental response curves have strong effects on goodness of fit. SDMs were trained on tree fern data, obtained from an online biodiversity data portal, with two sources that differed in size and geographical sampling bias: a small, widely-distributed set of herbarium specimens and a large, spatially clustered set of ecological survey records. We attempted to correct for geographical sampling bias by incorporating sampling bias grids in the SDMs, created from all georeferenced vascular plants in the datasets, and explored model complexity issues by fitting a wide variety of environmental response curves (known as "feature types" in MaxEnt). In each case, goodness of fit was assessed by comparing predicted range maps with tree fern presences and absences using an independent national dataset to validate the SDMs. We found that correcting for geographical sampling bias led to major improvements in goodness of fit, but did not entirely resolve the problem: predictions made with clustered ecological data were inferior to those made with the herbarium dataset, even after sampling bias correction. We also found that the choice of feature type had negligible effects on predictive performance, indicating that simple feature types may be sufficient once sampling bias is accounted for. Our study emphasizes the importance of reducing geographical sampling bias, where possible, in datasets used to train SDMs, and the effectiveness and essentialness of sampling bias correction within MaxEnt.
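
The abstract describes sampling bias grids built from all georeferenced vascular plant records. As a rough illustration only, the idea can be sketched by counting records per grid cell and rescaling the counts to relative sampling effort. The function name, the one-degree cell size, the `floor` constant, and the example coordinates are all assumptions made for this sketch; the abstract does not say how the authors constructed their grids.

```python
import numpy as np

def sampling_bias_grid(lons, lats, lon_edges, lat_edges, floor=1.0):
    """Return relative sampling effort per cell (rows = latitude, cols = longitude).

    `floor` is a hypothetical constant added to every cell so that
    unsampled cells keep a small positive weight instead of zero.
    """
    counts, _, _ = np.histogram2d(lats, lons, bins=[lat_edges, lon_edges])
    effort = counts + floor
    return effort / effort.max()  # rescale so the best-sampled cell is 1.0

# Illustrative coordinates only (not data from the study).
lon_edges = np.arange(166.0, 180.0, 1.0)   # 13 longitude cells
lat_edges = np.arange(-48.0, -33.0, 1.0)   # 14 latitude cells
record_lons = np.array([170.5, 170.6, 174.2])
record_lats = np.array([-45.5, -45.4, -41.3])

bias = sampling_bias_grid(record_lons, record_lats, lon_edges, lat_edges)
```

A grid like this would then be exported as a raster and supplied to the modelling software as its bias layer; that export step is omitted here because it depends on the raster format in use.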
