A semantic-based workflow for biomedical literature annotation.

Affiliation

DETI/IEETA, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal.

Publication information

Database (Oxford). 2017 Jan 1;2017. doi: 10.1093/database/bax088.

DOI:10.1093/database/bax088

PMID:29220478

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5691355/

Abstract

Computational annotation of textual information has taken on an important role in knowledge extraction from the biomedical literature, since most of the relevant information from scientific findings is still maintained in text format. In this endeavour, annotation tools can assist in the identification of biomedical concepts and their relationships, providing faster reading and curation processes, with reduced costs. However, the separate usage of distinct annotation systems results in highly heterogeneous data, as it is difficult to efficiently combine and exchange this valuable asset. Moreover, despite the existence of several annotation formats, there is no unified way to integrate miscellaneous annotation outcomes into a reusable, sharable and searchable structure. Taking up this challenge, we present a modular architecture for textual information integration using semantic web features and services. The solution described allows the migration of curation data into a common model, providing a suitable transition process in which multiple annotation data can be integrated and enriched, with the possibility of being shared, compared and reused across semantic knowledge bases.
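The idea of migrating heterogeneous annotation output into a common model can be illustrated with a minimal sketch. This is not the authors' implementation: the two tool formats, the `Annotation` class, the `EX` namespace and the predicate names are all hypothetical, chosen only to show how records from different tools might be normalized into subject-predicate-object triples and then compared.

```python
from dataclasses import dataclass

# Hypothetical namespace; a real system would use an established vocabulary.
EX = "http://example.org/annotation/"


@dataclass(frozen=True)
class Annotation:
    """Common model shared by all tools (an assumption for this sketch)."""
    doc_id: str
    start: int
    end: int
    text: str
    concept: str
    source: str


def from_tool_a(doc_id, record):
    """Hypothetical tool A emits dicts with begin/end character offsets."""
    return Annotation(doc_id, record["begin"], record["end"],
                      record["mention"], record["id"], "toolA")


def from_tool_b(doc_id, record):
    """Hypothetical tool B emits (offset, length, text, concept) tuples."""
    offset, length, text, concept = record
    return Annotation(doc_id, offset, offset + length, text, concept, "toolB")


def to_triples(ann):
    """Express one annotation as subject-predicate-object triples."""
    subject = f"{EX}{ann.doc_id}/{ann.start}-{ann.end}/{ann.source}"
    return [
        (subject, EX + "inDocument", ann.doc_id),
        (subject, EX + "start", ann.start),
        (subject, EX + "end", ann.end),
        (subject, EX + "text", ann.text),
        (subject, EX + "concept", ann.concept),
        (subject, EX + "source", ann.source),
    ]


def agreeing_spans(annotations):
    """Return spans that more than one tool tagged with the same concept."""
    sources_by_span = {}
    for ann in annotations:
        key = (ann.doc_id, ann.start, ann.end, ann.concept)
        sources_by_span.setdefault(key, set()).add(ann.source)
    return {key: srcs for key, srcs in sources_by_span.items() if len(srcs) > 1}


if __name__ == "__main__":
    a = from_tool_a("doc1", {"begin": 0, "end": 5,
                             "mention": "BRCA1", "id": "HGNC:1100"})
    b = from_tool_b("doc1", (0, 5, "BRCA1", "HGNC:1100"))
    for triple in to_triples(a) + to_triples(b):
        print(triple)
    print(agreeing_spans([a, b]))
```

Once every tool's output is in the same shape, comparing or merging results reduces to simple grouping, which is the kind of reuse the common model is meant to enable.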
