Chagoyen Monica, Carmona-Saez Pedro, Gil Concha, Carazo Jose M, Pascual-Montano Alberto
Biocomputing Unit, Centro Nacional de Biotecnologia--CSIC, Madrid, Spain.
BMC Bioinformatics. 2006 Jul 26;7:363. doi: 10.1186/1471-2105-7-363.
Recent analyses in systems biology pursue the discovery of functional modules within the cell. Recognizing such modules requires the integrative analysis of genome-wide experimental data together with available functional schemes. Accordingly, methods are required to bridge the gap between the abstract definitions of cellular processes in current schemes and the interlinked nature of biological networks.
This work explores the use of the scientific literature to establish potential relationships among cellular processes. To this end, we used a document-based similarity method to compute pairwise similarities between the biological processes described in the Gene Ontology (GO). The method was applied to the biological processes annotated for the Saccharomyces cerevisiae genome. We compared our results with similarities obtained with two ontology-based metrics, as well as with gene product annotation relationships. We show that the literature-based metric conserves most direct ontological relationships while revealing biologically sound similarities that are not obtained using ontology-based metrics and/or genome annotation.
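To make the idea of a document-based similarity concrete, here is a minimal sketch. The abstract does not state the term weighting or similarity measure used, so the pooled bag-of-words counts and cosine similarity below are assumptions chosen for illustration, not the authors' method. The toy texts in `process_docs` and the helper names `term_vector`, `cosine`, and `pairwise_similarities` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus (not real data): each GO biological process is
# mapped to short texts standing in for its associated literature.
process_docs = {
    "DNA replication": [
        "polymerase copies dna strand origin",
        "dna helicase unwinds origin",
    ],
    "DNA repair": [
        "damaged dna strand excision polymerase",
        "dna lesion repair ligase",
    ],
    "translation": [
        "ribosome reads mrna codon",
        "trna delivers amino acid ribosome",
    ],
}


def term_vector(docs):
    """Pool all documents of one process into a single term-count vector."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_similarities(docs_by_process):
    """Return {(process_a, process_b): similarity} for every unordered pair."""
    vectors = {name: term_vector(docs) for name, docs in docs_by_process.items()}
    names = sorted(vectors)
    return {
        (a, b): cosine(vectors[a], vectors[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


if __name__ == "__main__":
    # Print process pairs from most to least similar.
    ranked = sorted(pairwise_similarities(process_docs).items(),
                    key=lambda kv: kv[1], reverse=True)
    for (a, b), sim in ranked:
        print(f"{a} ~ {b}: {sim:.2f}")
```

In this toy setting, processes whose texts share vocabulary score higher than processes with no terms in common. A realistic pipeline would likely need real literature associations per GO term and some form of term weighting, neither of which is specified in the abstract.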
The scientific literature is a valuable source of information from which to compute similarities among biological processes. The associations discovered by literature analysis usefully complement those encoded in existing functional schemes and those that arise from genome annotation. These similarities can be used to map the interlinked structure of cellular processes in a particular organism.