Bhardwaj Nitin, Lu Hui
Bioinformatics Program, Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607, USA.
Bioinformatics. 2005 Jun 1;21(11):2730-8. doi: 10.1093/bioinformatics/bti398. Epub 2005 Mar 29.
Function annotation of an unclassified protein on the basis of its interaction partners is well documented in the literature. Reliable predictions of interactions from other data sources such as gene expression measurements would provide a useful route to function annotation. We investigate the global relationship of protein-protein interactions with gene expression. This relationship is studied in four evolutionarily diverse species, for which substantial information regarding their interactions and expression is available: human, mouse, yeast and Escherichia coli.
In E. coli, the expression of interacting pairs is highly correlated compared with that of random pairs, while in the other three species the correlation of expression of interacting pairs is only slightly stronger than that of random pairs. To strengthen the correlation, we developed a protocol to integrate ortholog information into the interaction and expression datasets. In all four genomes, the likelihood of predicting protein interactions from highly correlated expression data is increased using our protocol. In yeast, for example, the likelihood of predicting a true interaction when the correlation is > 0.9 increases from 1.4 to 9.4. The improvement demonstrates that protein interactions are reflected in gene expression and that the correlation between the two is strengthened by evolutionary information. The results establish that co-expression of interacting protein pairs is more conserved than that of random pairs.
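The comparison of interacting and random pairs can be illustrated with a minimal sketch. It assumes Pearson correlation over expression profiles and treats the "likelihood" as an enrichment ratio: the fraction of interacting pairs above a correlation threshold divided by the fraction of random pairs above it. The abstract does not specify either choice, and the function names and toy values are hypothetical, not taken from the paper.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation of two expression profiles (assumed similarity measure)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    # A constant profile has no defined correlation; return 0.0 by convention here.
    return float((xc * yc).sum() / denom) if denom > 0 else 0.0

def pair_correlations(expression, pairs):
    """Correlation for each (gene_a, gene_b) pair; `expression` maps gene -> profile."""
    return np.array([pearson(expression[a], expression[b]) for a, b in pairs])

def enrichment_above(interacting_r, random_r, threshold=0.9):
    """Assumed 'likelihood': share of interacting pairs with r > threshold,
    divided by the share of random pairs with r > threshold."""
    frac_interacting = np.mean(np.asarray(interacting_r) > threshold)
    frac_random = np.mean(np.asarray(random_r) > threshold)
    return float(frac_interacting / frac_random) if frac_random > 0 else float("inf")

# Toy data only (hypothetical genes and values).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # perfectly co-expressed with geneA
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
interacting_r = pair_correlations(expression, [("geneA", "geneB")])
random_r = pair_correlations(expression, [("geneA", "geneC"), ("geneB", "geneC")])
ratio = enrichment_above(interacting_r, random_r, threshold=0.9)
```

A ratio above 1 means that highly correlated pairs are enriched for true interactions. In this reading, the ortholog-integration protocol would be judged by how much it raises the ratio at a given threshold.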