The effects of sampling bias and model complexity on the predictive performance of MaxEnt species distribution models.

Affiliation

Forest Ecology and Conservation Group, Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom.

Publication information

PLoS One. 2013;8(2):e55158. doi: 10.1371/journal.pone.0055158. Epub 2013 Feb 14.

DOI: 10.1371/journal.pone.0055158

PMID: 23457462

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3573023/

Abstract

Species distribution models (SDMs) trained on presence-only data are frequently used in ecological research and conservation planning. However, users of SDM software face a variety of options, and it is not always obvious how selecting one option over another will affect model performance. Working with MaxEnt software and with tree fern presence data from New Zealand, we assessed whether (a) choosing to correct for geographical sampling bias and (b) using complex environmental response curves have strong effects on goodness of fit. SDMs were trained on tree fern data, obtained from an online biodiversity data portal, with two sources that differed in size and geographical sampling bias: a small, widely distributed set of herbarium specimens and a large, spatially clustered set of ecological survey records. We attempted to correct for geographical sampling bias by incorporating sampling bias grids in the SDMs, created from all georeferenced vascular plants in the datasets, and explored model complexity issues by fitting a wide variety of environmental response curves (known as "feature types" in MaxEnt). In each case, goodness of fit was assessed by comparing predicted range maps with tree fern presences and absences using an independent national dataset to validate the SDMs. We found that correcting for geographical sampling bias led to major improvements in goodness of fit, but did not entirely resolve the problem: predictions made with clustered ecological data were inferior to those made with the herbarium dataset, even after sampling bias correction. We also found that the choice of feature type had negligible effects on predictive performance, indicating that simple feature types may be sufficient once sampling bias is accounted for. Our study emphasizes the importance of reducing geographical sampling bias, where possible, in datasets used to train SDMs, and the effectiveness and necessity of sampling bias correction within MaxEnt.
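The two computational steps named in the abstract, building a sampling bias grid from all georeferenced records and scoring predictions against independent presences and absences, can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names `bias_grid` and `auc`, the square-cell gridding, and the use of a rank-based AUC as the goodness-of-fit summary are all assumptions made for illustration (the abstract does not name the metric or the grid construction details).

```python
import math
from collections import Counter


def bias_grid(records, cell_size):
    """Count records per square grid cell as a crude proxy for sampling effort.

    records: iterable of (x, y) coordinates for ALL georeferenced records of
    the wider group (hypothetical input; e.g. every vascular plant record),
    not only the focal species.
    Returns a Counter keyed by (col, row); unsampled cells read as 0.
    """
    return Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size))
        for x, y in records
    )


def auc(scores_presence, scores_absence):
    """Rank-based AUC: chance that a presence site outscores an absence site.

    Ties contribute 0.5. Assumed metric, chosen only for illustration.
    """
    wins = 0.0
    for p in scores_presence:
        for a in scores_absence:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(scores_presence) * len(scores_absence))


# Toy data: three records, two of which fall in the same 1 x 1 cell.
grid = bias_grid([(0.2, 0.3), (0.8, 0.1), (1.5, 0.5)], cell_size=1.0)

# Toy validation: model scores at independent presence and absence sites.
fit = auc([0.9, 0.8], [0.1, 0.8])
```

In this toy case the cell `(0, 0)` holds two records and `(1, 0)` holds one, so a model given this grid would treat the first cell as twice as heavily sampled. The `auc` value summarizes how well predicted suitability separates validation presences from absences, with 0.5 corresponding to random ranking.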

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f009/3573023/f86a31b03a31/pone.0055158.g001.jpg
