Pytaxon: A Python software for resolving and correcting taxonomic names in biodiversity data.

Authors

Proença Neto Marco A, De Sousa Marcos P A

Affiliations

Centro Universitário do Estado do Pará, Belém, Brazil.

Laboratório de Computação Avançada para Biodiversidade (COMBIO), Museu Paraense Emílio Goeldi, Belém, Brazil.

Publication information

Biodivers Data J. 2025 Jan 8;13:e138257. doi: 10.3897/BDJ.13.e138257. eCollection 2025.

DOI: 10.3897/BDJ.13.e138257
PMID: 39822261
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11736304/
Abstract

BACKGROUND

The standardisation and correction of taxonomic names in large biodiversity databases remain persistent challenges for researchers, as errors in species names can compromise ecological analyses, land-use planning and conservation efforts, particularly when inaccurate data are shared on global biodiversity portals.

NEW INFORMATION

We present pytaxon, a Python software designed to resolve and correct taxonomic names in biodiversity data by leveraging the Global Names Verifier (GNV) API and employing fuzzy matching techniques to suggest corrections for discrepancies and nomenclatural inconsistencies. Pytaxon offers both a Command Line Interface (CLI) and a Graphical User Interface (GUI), ensuring accessibility to users with different levels of computing expertise. Tests on spreadsheets derived from datasets published in the Global Biodiversity Information Facility (GBIF) demonstrated its effectiveness in identifying and resolving taxonomic errors. By mitigating the propagation of inaccuracies from researchers' datasets to global biodiversity databases, pytaxon supports more reliable conservation decisions and robust scientific investigations. Its contributions enhance data integrity and promote informed biodiversity management in a rapidly evolving global environment.
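
The fuzzy matching idea mentioned in the abstract can be illustrated with a small offline sketch. This is not pytaxon's implementation and does not call the Global Names Verifier API. It assumes a hypothetical local list, REFERENCE_NAMES, in place of a remote name service, and uses Python's standard difflib module to propose the closest accepted spelling for each input name. The function names suggest_correction and check_column and the 0.8 similarity cutoff are illustrative choices only.

```python
import difflib

# Hypothetical reference list standing in for a remote name service.
# A real workflow would obtain accepted names from a taxonomic backbone.
REFERENCE_NAMES = [
    "Panthera onca",
    "Puma concolor",
    "Tapirus terrestris",
    "Bothrops atrox",
]


def suggest_correction(name, reference=REFERENCE_NAMES, cutoff=0.8):
    """Return a (status, suggestion) pair for one taxonomic name.

    status is "exact", "fuzzy" or "unmatched"; suggestion is the matched
    reference name, or None when nothing is similar enough.
    """
    # Collapse repeated or surrounding whitespace before comparing.
    cleaned = " ".join(name.split())
    if cleaned in reference:
        return ("exact", cleaned)
    # get_close_matches ranks candidates by SequenceMatcher similarity
    # and drops any candidate scoring below the cutoff.
    matches = difflib.get_close_matches(cleaned, reference, n=1, cutoff=cutoff)
    if matches:
        return ("fuzzy", matches[0])
    return ("unmatched", None)


def check_column(names):
    """Check every name in a spreadsheet column (given as a list of strings)."""
    return {name: suggest_correction(name) for name in names}


if __name__ == "__main__":
    column = ["Panthera onca", "Pantera onca", "Bothrops atrax", "Homo sapiens"]
    for original, (status, suggestion) in check_column(column).items():
        print(f"{original!r}: {status} -> {suggestion}")
```

In this sketch a misspelling such as "Pantera onca" is reported as a fuzzy match with a suggested replacement, while a name absent from the reference list is left unmatched for manual review rather than silently changed.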

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fe1/11736304/890068b8aa0b/bdj-13-e138257-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fe1/11736304/292e5418634d/bdj-13-e138257-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fe1/11736304/863c904b1672/bdj-13-e138257-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fe1/11736304/cb6344092b10/bdj-13-e138257-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fe1/11736304/b948d3883680/bdj-13-e138257-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fe1/11736304/ddda9567405c/bdj-13-e138257-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fe1/11736304/3a708384197d/bdj-13-e138257-g007.jpg

Similar articles

1
Pytaxon: A Python software for resolving and correcting taxonomic names in biodiversity data.
Biodivers Data J. 2025 Jan 8;13:e138257. doi: 10.3897/BDJ.13.e138257. eCollection 2025.
2
The taxonomic name resolution service: an online tool for automated standardization of plant names.
BMC Bioinformatics. 2013 Jan 16;14:16. doi: 10.1186/1471-2105-14-16.
3
Treemendous: an R package for integrating taxonomic information across backbones.
PeerJ. 2024 Feb 28;12:e16896. doi: 10.7717/peerj.16896. eCollection 2024.
4
PhyloNext: a pipeline for phylogenetic diversity analysis of GBIF-mediated data.
BMC Ecol Evol. 2024 Jun 11;24(1):76. doi: 10.1186/s12862-024-02256-9.
5
WorldFlora: An R package for exact and fuzzy matching of plant names against the World Flora Online taxonomic backbone data.
Appl Plant Sci. 2020 Sep 25;8(9):e11388. doi: 10.1002/aps3.11388. eCollection 2020 Sep.
6
Geographic name resolution service: A tool for the standardization and indexing of world political division names, with applications to species distribution modeling.
PLoS One. 2022 Nov 14;17(11):e0268162. doi: 10.1371/journal.pone.0268162. eCollection 2022.
7
DFAST_QC: quality assessment and taxonomic identification tool for prokaryotic Genomes.
BMC Bioinformatics. 2025 Jan 7;26(1):3. doi: 10.1186/s12859-024-06030-y.
8
Solr-Plant: efficient extraction of plant names from text.
BMC Bioinformatics. 2019 May 22;20(1):263. doi: 10.1186/s12859-019-2874-6.
9
An expert curated global legume checklist improves the accuracy of occurrence, biodiversity and taxonomic data.
Sci Data. 2022 Nov 17;9(1):708. doi: 10.1038/s41597-022-01812-6.
10
PNSS: An online plant name service system.
Biodivers Data J. 2025 Mar 31;13:e142973. doi: 10.3897/BDJ.13.e142973. eCollection 2025.
