Biodiversity informatics: the challenge of linking data and the role of shared identifiers.

Author information

Page Roderic D M

Affiliation

Division of Environmental and Evolutionary Biology, Institute of Biomedical and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK.

Publication information

Brief Bioinform. 2008 Sep;9(5):345-54. doi: 10.1093/bib/bbn022. Epub 2008 Apr 29.

Abstract

A major challenge facing biodiversity informatics is integrating data stored in widely distributed databases. Initial efforts have relied on taxonomic names as the shared identifier linking records in different databases. However, taxonomic names have limitations as identifiers, being neither stable nor globally unique, and the pace of molecular taxonomic and phylogenetic research means that a lot of information in public sequence databases is not linked to formal taxonomic names. This review explores the use of other identifiers, such as specimen codes and GenBank accession numbers, to link otherwise disconnected facts in different databases. The structure of these links can also be exploited using the PageRank algorithm to rank the results of searches on biodiversity databases. The key to rich integration is a commitment to deploy and reuse globally unique, shared identifiers [such as Digital Object Identifiers (DOIs) and Life Science Identifiers (LSIDs)], and the implementation of services that link those identifiers.
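
The PageRank idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the implementation used in the review; it is a plain power-iteration PageRank over a small, made-up link graph. The identifiers (`sequence:S1`, `specimen:V1`, `publication:P1`, and so on), the function name `pagerank`, and the damping value of 0.85 are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    `links` maps each identifier to the list of identifiers it links to.
    Returns a dict mapping every identifier to its rank (ranks sum to 1).
    """
    # Collect every identifier, including ones that only appear as targets.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by records with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n
            for node in nodes
        }
        # Each record passes an equal share of its rank to every record it cites.
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical records: two sequences and a publication all cite one
# voucher specimen; a second specimen is cited by nothing.
links = {
    "sequence:S1": ["specimen:V1", "publication:P1"],
    "sequence:S2": ["specimen:V1", "publication:P1"],
    "publication:P1": ["specimen:V1"],
    "specimen:V1": [],
    "specimen:V2": [],
}

ranks = pagerank(links)
for identifier, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{identifier}\t{score:.3f}")
```

In this toy graph the specimen cited by both sequences and the publication ranks highest, while the uncited specimen ranks lowest, which is the behaviour one would want when ordering search results by how well connected a record is. The ranking is only meaningful if different databases refer to the same object by the same identifier, which is the abstract's central point about shared identifiers.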

