

The taxonomic name resolution service: an online tool for automated standardization of plant names.

Affiliation

Department of Ecology and Evolutionary Biology, University of Arizona, P.O. Box 210088, Tucson, AZ 85721, USA.

Publication information

BMC Bioinformatics. 2013 Jan 16;14:16. doi: 10.1186/1471-2105-14-16.

Abstract

BACKGROUND

The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this 'names problem' has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science.

RESULTS

The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets.
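The matching steps described above (stripping annotations, correcting misspellings by fuzzy matching, and converting synonyms to accepted names) can be illustrated with a small sketch. This is not the TNRS implementation: it swaps in Python's standard-library `difflib` for the fuzzy-matching step, and the reference names, the `resolve` function, the annotation pattern and the 0.8 cutoff are all hypothetical choices made for illustration.

```python
import difflib
import re

# Hypothetical miniature reference taxonomy (assumption for illustration):
# a set of accepted names plus synonyms mapped to their accepted names.
ACCEPTED = {"Quercus alba", "Pinus ponderosa", "Acer saccharum"}
SYNONYMS = {"Acer saccharophorum": "Acer saccharum"}

# Qualifiers such as "cf." and "aff." are annotations, not part of the name.
ANNOTATION = re.compile(r"\b(cf|aff)\.\s*", re.IGNORECASE)


def resolve(raw, cutoff=0.8):
    """Return (matched_name, accepted_name, score), or None if nothing
    in the reference list is similar enough to the submitted name."""
    # Strip annotations and collapse repeated whitespace.
    cleaned = " ".join(ANNOTATION.sub("", raw).split())
    # Normalize case: capitalized genus, lowercase remainder.
    cleaned = cleaned[:1].upper() + cleaned[1:].lower()

    # Fuzzy-match against every known name, accepted or synonym.
    candidates = sorted(ACCEPTED | set(SYNONYMS))
    hits = difflib.get_close_matches(cleaned, candidates, n=1, cutoff=cutoff)
    if not hits:
        return None

    matched = hits[0]
    score = difflib.SequenceMatcher(None, cleaned, matched).ratio()
    # Convert a synonym to its accepted name; accepted names map to themselves.
    return matched, SYNONYMS.get(matched, matched), round(score, 2)


if __name__ == "__main__":
    for name in ["Quercus albba", "Acer cf. saccharophorum", "Zea mays"]:
        print(name, "->", resolve(name))
```

In this sketch a misspelled name is matched to its closest reference name, a synonym is reported together with its accepted name, and a name with no sufficiently similar reference entry returns `None` so that it can be flagged for user review rather than silently changed.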

CONCLUSIONS

We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ and as a RESTful web service and application programming interface. Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/.


