Learning ontological rules to extract multiple relations of genic interactions from text.

Affiliation

LIPN, Université Paris 13/CNRS UMR7030, Laboratoire d'Informatique Paris-Nord, Institut Galilée, Université Paris 13, Villetaneuse, France.

Publication information

Int J Med Inform. 2009 Dec;78(12):e31-8. doi: 10.1016/j.ijmedinf.2009.03.005. Epub 2009 Apr 23.

DOI:10.1016/j.ijmedinf.2009.03.005

PMID:19398370

Abstract

INTRODUCTION

Information extraction (IE) systems have been proposed in recent years to extract genic interactions from bibliographical resources. They are limited to single interaction relations, and have to face a trade-off between recall and precision, by focusing either on specific interactions (for precision), or general and unspecified interactions of biological entities (for recall). Yet, biologists need to process more complex data from literature, in order to study biological pathways. An ontology is an adequate formal representation to model this sophisticated knowledge. However, the tight integration of IE systems and ontologies is still a current research issue, a fortiori with complex ones that go beyond hierarchies.

METHOD

We propose a rich modeling of genic interactions with an ontology, and show how it can be used within an IE system. The ontology is seen as a language specifying a normalized representation of text. First, IE is performed by extracting instances from natural language processing (NLP) modules. Then, deductive inferences on the ontology language are completed, and new instances are derived from previously extracted ones. Inference rules are learnt with an inductive logic programming (ILP) algorithm, using the ontology as the hypothesis language, and its instantiation on an annotated corpus as the example language. Learning is set in a multi-class setting to deal with the multiple ontological relations.
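The deductive step described above, in which new instances are derived from previously extracted ones, can be illustrated with a small forward-chaining loop. This is only a sketch under stated assumptions: the relation names (`binds_to`, `promoter_of`, `regulates`), the entity names, and the single rule are hypothetical and are not taken from the paper's ontology, and the rule is written by hand here, whereas the paper learns its rules with ILP.

```python
from itertools import product

# Hypothetical instances that an NLP stage might have extracted:
# (relation, arg1, arg2). None of these names come from the paper.
facts = {
    ("binds_to", "proteinA", "promoterB"),
    ("promoter_of", "promoterB", "geneC"),
}

# One hypothetical inference rule (hand-written here, learnt by ILP in the paper):
# binds_to(?p, ?s) AND promoter_of(?s, ?g)  =>  regulates(?p, ?g)
rules = [
    (
        (("binds_to", "?p", "?s"), ("promoter_of", "?s", "?g")),
        ("regulates", "?p", "?g"),
    ),
]


def match(atom, fact, bindings):
    """Unify one rule atom with one fact; return extended bindings or None."""
    if atom[0] != fact[0]:
        return None
    new = dict(bindings)
    for term, value in zip(atom[1:], fact[1:]):
        if term.startswith("?"):
            # A variable must keep the same value everywhere in the rule body.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new instance can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            snapshot = tuple(derived)
            for combo in product(snapshot, repeat=len(body)):
                bindings = {}
                for atom, fact in zip(body, combo):
                    bindings = match(atom, fact, bindings)
                    if bindings is None:
                        break
                else:
                    new_fact = (head[0],) + tuple(
                        bindings.get(t, t) for t in head[1:]
                    )
                    if new_fact not in derived:
                        derived.add(new_fact)
                        changed = True
    return derived


all_instances = forward_chain(facts, rules)
```

With the two extracted instances above, the loop adds one derived instance, `("regulates", "proteinA", "geneC")`, and then stops because nothing further can be inferred.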

RESULTS

We validated our approach on an annotated corpus of gene transcription regulations in the Bacillus subtilis bacterium. We reach a global recall of 89.3% and a precision of 89.6%, with high scores for the ten semantic relations defined in the ontology.
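As a rough illustration of how global scores over several relations can be computed, the sketch below pools per-relation counts (micro-averaging). The abstract does not state which averaging scheme was used, so micro-averaging is an assumption, and the relation names and counts are invented for the example rather than taken from the paper's evaluation.

```python
def micro_scores(counts):
    """Pool true/false positives and false negatives over all relations."""
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical per-relation counts, for illustration only.
example_counts = {
    "relation_a": {"tp": 8, "fp": 1, "fn": 2},
    "relation_b": {"tp": 9, "fp": 1, "fn": 0},
}

precision, recall = micro_scores(example_counts)
```

Pooling counts weights each relation by its number of instances; averaging per-relation scores instead (macro-averaging) would give rare relations the same weight as frequent ones.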

Similar articles

Extracting phenotypic information from the literature via natural language processing.

Stud Health Technol Inform. 2004;107(Pt 2):758-62.

Formal ontology for natural language processing and the integration of biomedical databases.

Int J Med Inform. 2006 Mar-Apr;75(3-4):224-31. doi: 10.1016/j.ijmedinf.2005.07.015. Epub 2005 Sep 8.

Enhancing knowledge representations by ontological relations.

Stud Health Technol Inform. 2008;136:791-6.

Wnt pathway curation using automated natural language processing: combining statistical methods with partial and full parse for knowledge extraction.

Bioinformatics. 2005 Apr 15;21(8):1653-8. doi: 10.1093/bioinformatics/bti165. Epub 2004 Nov 25.

Gene Regulation Ontology (GRO): design principles and use cases.

Stud Health Technol Inform. 2008;136:9-14.

Extraction of regulatory gene/protein networks from Medline.

Bioinformatics. 2006 Mar 15;22(6):645-50. doi: 10.1093/bioinformatics/bti597. Epub 2005 Jul 26.

The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text.

J Biomed Inform. 2003 Dec;36(6):462-77. doi: 10.1016/j.jbi.2003.11.003.

Negation of protein-protein interactions: analysis and extraction.

Bioinformatics. 2007 Jul 1;23(13):i424-32. doi: 10.1093/bioinformatics/btm184.

RelEx--relation extraction using dependency parse trees.

Bioinformatics. 2007 Feb 1;23(3):365-71. doi: 10.1093/bioinformatics/btl616. Epub 2006 Dec 1.

Cited by

Overview of the gene regulation network and the bacteria biotope tasks in BioNLP'13 shared task.

BMC Bioinformatics. 2015;16 Suppl 10(Suppl 10):S1. doi: 10.1186/1471-2105-16-S10-S1. Epub 2015 Jul 13.

A semantic-based method for extracting concept definitions from scientific publications: evaluation in the autism phenotype domain.

J Biomed Semantics. 2013 Aug 12;4(1):14. doi: 10.1186/2041-1480-4-14.

Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach.

BMC Bioinformatics. 2012 Jun 26;13 Suppl 11(Suppl 11):S8. doi: 10.1186/1471-2105-13-S11-S8.

BioNLP Shared Task--The Bacteria Track.

BMC Bioinformatics. 2012 Jun 26;13 Suppl 11(Suppl 11):S3. doi: 10.1186/1471-2105-13-S11-S3.
