An audit of some processing effects in aggregated occurrence records.

Author information

Mesibov Robert

Affiliation

West Ulverstone, Tasmania, Australia 7315.

Publication information

Zookeys. 2018 Apr 20;(751):129-146. doi: 10.3897/zookeys.751.24791. eCollection 2018.

Abstract

A total of ca. 800,000 occurrence records from the Australian Museum (AM), Museums Victoria (MV) and the New Zealand Arthropod Collection (NZAC) were audited for changes in selected Darwin Core fields after processing by the Atlas of Living Australia (ALA; for AM and MV records) and the Global Biodiversity Information Facility (GBIF; for AM, MV and NZAC records). Formal taxon names in the genus- and species-groups were changed in 13-21% of AM and MV records, depending on dataset and aggregator. There was little agreement between the two aggregators on processed names: two to three times as many records had names changed by one aggregator alone as had names changed by both aggregators. The type status of specimen records did not change with name changes, resulting in confusion as to the name with which a type was associated. Data losses of up to 100% were found after processing in some fields, apparently due to programming errors. The taxonomic usefulness of occurrence records could be improved if aggregators included both the original and the processed taxonomic data items for each record. It is recommended that end-users check original and processed records for data loss and name replacements after processing by aggregators.
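The recommended end-user check can be sketched roughly as follows. This is a minimal illustration, not the method used in the audit: the function name `audit_fields`, the choice of `occurrenceID` as the matching key, and the sample records are all assumptions made for the example. It counts, per Darwin Core field, how many records are unchanged, changed, or lost (non-empty in the original, empty after processing).

```python
from collections import Counter

def audit_fields(original, processed, fields, key="occurrenceID"):
    """Compare original and processed records field by field.

    Returns {field: Counter} with counts of 'unchanged', 'changed' and
    'lost' (non-empty in the original, empty or missing after processing).
    """
    processed_by_id = {rec[key]: rec for rec in processed}
    report = {field: Counter() for field in fields}
    for rec in original:
        after = processed_by_id.get(rec[key])
        if after is None:
            continue  # record absent from the processed set; not counted here
        for field in fields:
            before_val = (rec.get(field) or "").strip()
            after_val = (after.get(field) or "").strip()
            if before_val == after_val:
                report[field]["unchanged"] += 1
            elif before_val and not after_val:
                report[field]["lost"] += 1
            else:
                report[field]["changed"] += 1
    return report

# Invented sample data, for illustration only (not taken from the audit).
original = [
    {"occurrenceID": "a1", "scientificName": "Aus bus", "typeStatus": "holotype"},
    {"occurrenceID": "a2", "scientificName": "Cus dus", "typeStatus": ""},
    {"occurrenceID": "a3", "scientificName": "Eus fus", "typeStatus": "paratype"},
]
processed = [
    {"occurrenceID": "a1", "scientificName": "Xus bus", "typeStatus": "holotype"},
    {"occurrenceID": "a2", "scientificName": "Cus dus", "typeStatus": ""},
    {"occurrenceID": "a3", "scientificName": "Eus fus", "typeStatus": ""},
]

report = audit_fields(original, processed, ["scientificName", "typeStatus"])
for field, counts in report.items():
    print(field, dict(counts))
```

In the sample data, record `a1` keeps its `typeStatus` while its name is replaced, which is the kind of mismatch the abstract describes for type specimens. Real aggregator downloads would first need to be parsed into dictionaries, for example with `csv.DictReader`.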
