Making species checklists understandable to machines - a shift from relational databases to ontologies.

Authors

Laurenne Nina, Tuominen Jouni, Saarenmaa Hannu, Hyvönen Eero

Affiliations

Semantic Computing Research Group (SeCo), Department of Media Technology, Aalto University, P.O. Box 15500, 00076 Aalto, Espoo, Finland.

Digitarium, University of Eastern Finland, P.O. Box 111, 80101 Joensuu, Finland.

Publication information

J Biomed Semantics. 2014 Sep 8;5:40. doi: 10.1186/2041-1480-5-40. eCollection 2014.

DOI:10.1186/2041-1480-5-40
PMID:25937880
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4417522/
Abstract

BACKGROUND

The scientific names of plants and animals play a major role in Life Sciences as information is indexed, integrated, and searched using scientific names. The main problem with names is their ambiguous nature, because more than one name may point to the same taxon and multiple taxa may share the same name. In addition, scientific names change over time, which makes them open to various interpretations. Applying machine-understandable semantics to these names enables efficient processing of biological content in information systems. The first step is to use unique persistent identifiers instead of name strings when referring to taxa. The most commonly used identifiers are Life Science Identifiers (LSID), which are traditionally used in relational databases, and more recently HTTP URIs, which are applied on the Semantic Web by Linked Data applications.
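
The ambiguity described above can be sketched as a many-to-many mapping between names and taxa. The following is a minimal illustration only: the identifiers (`taxon:1` and so on) and the helper functions are hypothetical and are not taken from the article; the names are well-known examples of a synonym pair and a homonym.

```python
from collections import defaultdict

# Hypothetical (identifier, name) assertions. The identifiers are made up
# for illustration; they are not LSIDs or URIs from any real checklist.
ASSERTIONS = [
    ("taxon:1", "Puma concolor"),
    ("taxon:1", "Felis concolor"),  # older synonym for the same taxon
    ("taxon:2", "Morus"),           # plant genus (mulberries)
    ("taxon:3", "Morus"),           # bird genus (gannets)
]


def index_by_name(assertions):
    """Map each name string to the set of taxon identifiers it may denote."""
    by_name = defaultdict(set)
    for taxon_id, name in assertions:
        by_name[name].add(taxon_id)
    return by_name


def index_by_taxon(assertions):
    """Map each taxon identifier to the set of names used for it."""
    by_taxon = defaultdict(set)
    for taxon_id, name in assertions:
        by_taxon[taxon_id].add(name)
    return by_taxon


def ambiguous_names(assertions):
    """Names shared by more than one taxon (homonyms)."""
    by_name = index_by_name(assertions)
    return sorted(name for name, ids in by_name.items() if len(ids) > 1)


def names_for(assertions, taxon_id):
    """All names pointing to one taxon (synonyms)."""
    return sorted(index_by_taxon(assertions)[taxon_id])
```

Looking up a name string can return several taxa, whereas looking up an identifier always returns exactly one taxon, which is the motivation for referring to taxa by identifier rather than by name.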

RESULTS

We introduce two models for expressing taxonomic information in the form of species checklists. First, we show how species checklists are presented in a relational database system using LSIDs. Then, in order to gain a more detailed representation of taxonomic information, we introduce the meta-ontology TaxMeOn to model the same content as Semantic Web ontologies in which taxa are identified using HTTP URIs. We also explore how changes in scientific names can be managed over time.
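
One way to picture managing name changes over time is to attach a checklist version to each statement about a taxon and to look up the name that was accepted at a given point. The sketch below is an assumption-laden illustration, not the TaxMeOn model: the `http://example.org/` namespace, the `ex:acceptedName` predicate, and the version years are all hypothetical.

```python
EX = "http://example.org/taxon/"  # placeholder namespace, not a real service

# Hypothetical statements: (taxon URI, predicate, name, checklist version).
# The predicate and the version years are invented for illustration only.
STATEMENTS = [
    (EX + "t1", "ex:acceptedName", "Felis concolor", 2000),
    (EX + "t1", "ex:acceptedName", "Puma concolor", 2010),
]


def accepted_name(statements, taxon_uri, year):
    """Return the name accepted for taxon_uri in the most recent
    checklist version published in or before `year`, or None."""
    candidates = [
        (version, name)
        for subject, predicate, name, version in statements
        if subject == taxon_uri
        and predicate == "ex:acceptedName"
        and version <= year
    ]
    if not candidates:
        return None
    # max() compares the version first, so the latest version wins.
    return max(candidates)[1]
```

Because the taxon URI stays fixed while the attached names change, older records that used the earlier name can still be linked to the same taxon.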

CONCLUSIONS

The use of HTTP URIs is preferable for presenting the taxonomic information of species checklists. Unlike an LSID, an HTTP URI both identifies a taxon and operates as a web address from which additional information about the taxon can be located. This enables the integration of biological data from different sources on the web using Linked Data principles and prevents the formation of information silos. The Linked Data approach allows a user to assemble information and evaluate the complexity of taxonomical data based on conflicting views of taxonomic classifications. Using HTTP URIs and Semantic Web technologies also facilitates the representation of the semantics of biological data and, in this way, the creation of more "intelligent" biological applications and services.
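
The practical difference between the two identifier types can be sketched as follows. An LSID is a URN of the form `urn:lsid:authority:namespace:object[:revision]` and needs a separate resolution service, while an HTTP URI can be requested directly. The example identifiers and function names below are hypothetical and serve only as an illustration.

```python
from urllib.parse import urlparse


def parse_lsid(identifier):
    """Split an LSID into its components.

    Expects urn:lsid:authority:namespace:object with an optional
    trailing revision; raises ValueError for anything else.
    """
    parts = identifier.split(":")
    if (
        len(parts) not in (5, 6)
        or parts[0].lower() != "urn"
        or parts[1].lower() != "lsid"
    ):
        raise ValueError(f"not an LSID: {identifier!r}")
    keys = ("authority", "namespace", "object", "revision")
    # zip() stops at the shorter sequence, so 'revision' is only
    # present when the identifier actually carries one.
    return dict(zip(keys, parts[2:]))


def is_web_address(identifier):
    """True if the identifier can be requested directly over HTTP(S)."""
    parsed = urlparse(identifier)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Hypothetical identifiers for the same taxon, for illustration only.
EXAMPLE_LSID = "urn:lsid:example.org:taxa:12345"
EXAMPLE_URI = "http://example.org/taxon/12345"
```

A client holding `EXAMPLE_URI` can fetch it with any HTTP library, whereas a client holding `EXAMPLE_LSID` first has to find a resolver for the `example.org` authority.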

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/031b/4417522/7233c37b428f/13326_2013_211_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/031b/4417522/afc607514b47/13326_2013_211_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/031b/4417522/c02ecc1de728/13326_2013_211_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/031b/4417522/22dad19c9d2c/13326_2013_211_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/031b/4417522/80562fb42ac2/13326_2013_211_Fig4_HTML.jpg
