BioContrasts: extracting and exploiting protein-protein contrastive relations from biomedical literature.

Authors

Kim Jung-Jae, Zhang Zhuo, Park Jong C, Ng See-Kiong

Affiliation

Computer Science Division & AITrc, Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 305-701, South Korea.

Publication information

Bioinformatics. 2006 Mar 1;22(5):597-605. doi: 10.1093/bioinformatics/btk016. Epub 2005 Dec 20.

Abstract

MOTIVATION

Contrasts are useful conceptual vehicles for learning and for exploratory research into the unknown. For example, contrastive information about two proteins can reveal the similarities, divergences and relations between them, yielding valuable insight into both proteins. Such contrastive information is reported in the biomedical literature. However, no reported biomedical text mining work has systematically extracted and presented this contrastive information from the literature for exploitation.

RESULTS

Our BioContrasts system extracts protein-protein contrastive information from MEDLINE abstracts and presents it to biologists in a web application for exploitation. Contrastive information is identified in the abstracts with contrastive negation patterns such as 'A but not B'. A total of 799 169 pairs of contrastive expressions were extracted from 2.5 million MEDLINE abstracts. By grounding contrastive protein names to Swiss-Prot entries, we produced 41 471 contrasts between Swiss-Prot protein entries. These contrasts are presented via a user-friendly interactive web portal and can be exploited for applications such as the refinement of biological pathways.
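
To make the idea of a contrastive negation pattern concrete, the following is a minimal sketch of matching 'A but not B' in a sentence. It is an illustration only, not the BioContrasts implementation: the abstract does not describe the system's actual patterns or its protein name recognition, so the regular expression, the function name `extract_contrasts` and the example sentence are all assumptions made here.

```python
import re

# Hypothetical, heavily simplified pattern: two single-token names joined
# by the literal phrase "but not". Real protein names may span several
# tokens, and the actual system's patterns are not given in the abstract.
CONTRAST = re.compile(r"\b([A-Za-z0-9-]+) but not ([A-Za-z0-9-]+)\b")


def extract_contrasts(sentence):
    """Return a list of (A, B) tuples, one per 'A but not B' match."""
    return CONTRAST.findall(sentence)


# Made-up example sentence, used only to exercise the pattern.
pairs = extract_contrasts("The kinase binds to Raf-1 but not B-Raf in vitro.")
print(pairs)
```

A sketch like this only finds candidate expression pairs; mapping each name to a Swiss-Prot entry, as the abstract describes, would be a separate grounding step.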

AVAILABILITY

BioContrasts can be accessed at http://biocontrasts.i2r.a-star.edu.sg. It is also mirrored at http://biocontrasts.biopathway.org.

SUPPLEMENTARY INFORMATION

Supplementary materials are available at Bioinformatics online.
