Learning anchor verbs for biological interaction patterns from published text articles.

Authors

Hatzivassiloglou Vasileios, Weng Wubin

Affiliation

Department of Computer Science, Columbia University, 1214 Amsterdam Avenue, New York, NY 10027 USA.

Publication information

Int J Med Inform. 2002 Dec 4;67(1-3):19-32. doi: 10.1016/s1386-5056(02)00054-0.

DOI:10.1016/s1386-5056(02)00054-0

PMID:12460629

Abstract

Much of knowledge modeling in the molecular biology domain involves interactions between proteins, genes, various forms of RNA, small molecules, etc. Interactions between these substances are typically extracted and codified manually, increasing the cost and time for modeling and substantially limiting the coverage of the resulting knowledge base. In this paper, we describe an automatic system that learns from text interaction verbs; these verbs can then form the core of automatically retrieved patterns which model classes of biological interactions. We investigate text features relating verbs with genes and proteins, and apply statistical tests and a logistic regression statistical model to determine whether a given verb belongs to the class of interaction verbs. Our system, AVAD, achieves over 87% precision and 82% recall when tested on an 11 million word corpus of journal articles. In addition, we compare the automatically obtained results with a manually constructed database of interaction verbs and show that the automatic approach can significantly enrich the manual list by detecting rarer interaction verbs that were omitted from the database.
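The classification step described in the abstract can be illustrated with a minimal sketch. This is not the AVAD implementation: the two features, their values, the verb labels, and all function names below are hypothetical, chosen only to show how a logistic regression model could separate interaction verbs from other verbs and how precision and recall would then be computed.

```python
import math

# Hypothetical toy data (not from the paper): each verb gets two invented
# features, e.g. the fraction of its occurrences with a gene/protein name
# to its left and to its right, plus a label (1 = interaction verb).
TOY_VERBS = {
    "bind":          ([0.80, 0.70], 1),
    "phosphorylate": ([0.90, 0.80], 1),
    "inhibit":       ([0.70, 0.60], 1),
    "show":          ([0.20, 0.10], 0),
    "report":        ([0.10, 0.20], 0),
    "suggest":       ([0.15, 0.10], 0),
}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(w, x):
    """Probability that a verb with features x is an interaction verb."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def train_logistic(X, y, lr=0.5, epochs=5000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = score(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, x, threshold=0.5):
    """True if the model assigns the verb to the interaction class."""
    return score(w, x) >= threshold


def precision_recall(predicted, actual):
    """Precision and recall of the positive (interaction verb) class."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    X = [features for features, _ in TOY_VERBS.values()]
    y = [label for _, label in TOY_VERBS.values()]
    w = train_logistic(X, y)
    predicted = [predict(w, x) for x in X]
    for verb, is_interaction in zip(TOY_VERBS, predicted):
        print(verb, is_interaction)
    print(precision_recall(predicted, y))
```

Evaluating on the training verbs, as the usage block does, only demonstrates the mechanics. A figure comparable to the precision and recall reported in the abstract would require held-out, manually labeled verbs.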

Similar articles

Discovering patterns to extract protein-protein interactions from full texts.

Bioinformatics. 2004 Dec 12;20(18):3604-12. doi: 10.1093/bioinformatics/bth451. Epub 2004 Jul 29.

Recognizing names in biomedical texts: a machine learning approach.

Bioinformatics. 2004 May 1;20(7):1178-90. doi: 10.1093/bioinformatics/bth060. Epub 2004 Feb 10.

Emergent behavior of growing knowledge about molecular interactions.

Nat Biotechnol. 2005 Oct;23(10):1243-7. doi: 10.1038/nbt1005-1243.

Wnt pathway curation using automated natural language processing: combining statistical methods with partial and full parse for knowledge extraction.

Bioinformatics. 2005 Apr 15;21(8):1653-8. doi: 10.1093/bioinformatics/bti165. Epub 2004 Nov 25.

Of truth and pathways: chasing bits of information through myriads of articles.

Bioinformatics. 2002;18 Suppl 1:S249-57. doi: 10.1093/bioinformatics/18.suppl_1.s249.

Discovering patterns to extract protein-protein interactions from the literature: Part II.

Bioinformatics. 2005 Aug 1;21(15):3294-300. doi: 10.1093/bioinformatics/bti493. Epub 2005 May 12.

Extracting Protein-Protein Interactions from MEDLINE using the Hidden Vector State model.

Int J Bioinform Res Appl. 2008;4(1):64-80. doi: 10.1504/IJBRA.2008.017164.

BIOSMILE: a semantic role labeling system for biomedical verbs using a maximum-entropy model with automatically generated template features.

BMC Bioinformatics. 2007 Sep 1;8:325. doi: 10.1186/1471-2105-8-325.

Finding the evidence for protein-protein interactions from PubMed abstracts.

Bioinformatics. 2006 Jul 15;22(14):e220-6. doi: 10.1093/bioinformatics/btl203.

Cited by

Automatic extraction of protein-protein interactions using grammatical relationship graph.

BMC Med Inform Decis Mak. 2018 Jul 23;18(Suppl 2):42. doi: 10.1186/s12911-018-0628-4.

Connecting the dots between PubMed abstracts.

PLoS One. 2012;7(1):e29509. doi: 10.1371/journal.pone.0029509. Epub 2012 Jan 3.

What the papers say: text mining for genomics and systems biology.

Hum Genomics. 2010 Oct;5(1):17-29. doi: 10.1186/1479-7364-5-1-17.

Bayesian inference of protein-protein interactions from biological literature.

Bioinformatics. 2009 Jun 15;25(12):1536-42. doi: 10.1093/bioinformatics/btp245. Epub 2009 Apr 15.

Identification of transcription factor contexts in literature using machine learning approaches.

BMC Bioinformatics. 2008 Apr 11;9 Suppl 3(Suppl 3):S11. doi: 10.1186/1471-2105-9-S3-S11.
