Digital point-occurrence records from the Global Biodiversity Information Facility (GBIF) and other data providers enable a wide range of research in macroecology and biogeography. However, data errors may hamper immediate use. Manual data cleaning is time-consuming and often unfeasible, given that the databases may contain thousands or millions of records. Automated data cleaning pipelines are therefore of high importance. Taking North American as a model, we examined how different data cleaning pipelines (using, e.g., the GBIF web application, and four different packages) affect downstream species distribution models (SDMs). We also assessed how data differed from expert data. From 13,889 North American observations in GBIF, the pipelines removed 31.7% to 62.7% false positives, invalid coordinates, and duplicates, leading to datasets between 9484 (GBIF application) and 5196 records (manual-guided filtering). The expert data consisted of 704 records, comparable to data from field studies. Although differences in the absolute numbers of records were relatively large, species richness models based on stacked SDMs (S-SDM) from pipeline and expert data were strongly correlated (mean Pearson's r across the pipelines: .9986, vs. the expert data: .9173). Our results suggest that all package-based pipelines reliably identified invalid coordinates. In contrast, the GBIF-filtered data still contained both spatial and taxonomic errors. Major drawbacks emerge from the fact that no pipeline fully discovered misidentified specimens without the assistance of taxonomic expert knowledge. We conclude that application-filtered GBIF data will still need additional review to achieve higher spatial data quality. Achieving high-quality taxonomic data will require extra effort, probably by thoroughly analyzing the data for misidentified taxa, supported by experts.
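As a rough illustration of the kind of automated filtering the abstract describes (dropping invalid coordinates and duplicates), the following is a minimal sketch in plain Python. It is not the pipeline used in the study: the function name `clean_occurrences`, the record fields (`species`, `lat`, `lon`), the placeholder taxon names, and the four-decimal rounding used to detect duplicates are all assumptions made for this example.

```python
def clean_occurrences(records):
    """Return records with usable coordinates, without duplicates.

    Hypothetical helper: each record is a dict with the assumed keys
    'species', 'lat' and 'lon' (decimal degrees).
    """
    seen = set()
    cleaned = []
    for rec in records:
        lat, lon = rec.get("lat"), rec.get("lon")
        # Missing coordinates cannot be mapped at all.
        if lat is None or lon is None:
            continue
        # Coordinates outside the valid latitude/longitude ranges are invalid.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue
        # (0, 0) is commonly a placeholder for an unknown location.
        if lat == 0 and lon == 0:
            continue
        # Treat the same taxon at the same rounded position as a duplicate
        # (the rounding precision is an assumption of this sketch).
        key = (rec.get("species"), round(lat, 4), round(lon, 4))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Made-up example records; the taxon names are placeholders.
sample = [
    {"species": "Taxon A", "lat": 36.1, "lon": -115.2},
    {"species": "Taxon A", "lat": 36.1, "lon": -115.2},   # duplicate
    {"species": "Taxon B", "lat": 95.0, "lon": -110.0},   # latitude out of range
    {"species": "Taxon B", "lat": 0.0, "lon": 0.0},       # placeholder position
    {"species": "Taxon C", "lat": None, "lon": -100.0},   # missing latitude
    {"species": "Taxon C", "lat": 31.5, "lon": -106.4},
]

cleaned = clean_occurrences(sample)
print(f"kept {len(cleaned)} of {len(sample)} records")
```

A filter of this kind addresses only spatial and duplicate errors. As the abstract notes, misidentified specimens are not caught by such rules and still call for review by taxonomic experts.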


Influence of different data cleaning solutions of point-occurrence records on downstream macroecological diversity models.
