Liu Yin, Liu Nianjun, Zhao Hongyu
Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA.
Bioinformatics. 2005 Aug 1;21(15):3279-85. doi: 10.1093/bioinformatics/bti492. Epub 2005 May 19.
Identifying protein-protein interactions is critical for understanding cellular processes. Because protein domains represent binding modules and are responsible for the interactions between proteins, computational approaches have been proposed to predict protein interactions at the domain level. The fact that protein domains are likely evolutionarily conserved allows us to pool information from data across multiple organisms for the inference of domain-domain and protein-protein interaction probabilities.
We use a likelihood approach to estimate domain-domain interaction probabilities by integrating large-scale protein interaction data from three organisms: Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster. The estimated domain-domain interaction probabilities are then used to predict protein-protein interactions in S. cerevisiae. Based on a thorough comparison of sensitivity and specificity, Gene Ontology term enrichment and gene expression profiles, we show that predicting protein-protein interactions from data pooled across diverse organisms may be far more informative than predicting them from a single organism.
The program for computing the protein-protein interaction probabilities and supplementary material are available at http://bioinformatics.med.yale.edu/interaction.
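To make the domain-to-protein step concrete, the sketch below shows one common way domain-level models turn domain-domain interaction probabilities into a protein-protein interaction probability: two proteins are taken to interact if at least one of their domain pairs interacts, with domain pairs treated as independent. The abstract does not state the exact formula, so this independence assumption, the function name protein_interaction_probability, the domain labels and the numeric values are all illustrative assumptions rather than the authors' implementation. Because domains are assumed to be conserved, the same table of domain-domain probabilities would be shared across organisms in such a model.

```python
from itertools import product


def protein_interaction_probability(domains_a, domains_b, lam):
    """Probability that two proteins interact under an assumed
    independent-domain-pair model: 1 - prod(1 - lambda_mn).

    domains_a, domains_b: domain identifiers of each protein.
    lam: dict mapping a sorted (domain, domain) tuple to its
         interaction probability; missing pairs count as 0.
    """
    p_no_pair_interacts = 1.0
    for m, n in product(domains_a, domains_b):
        key = tuple(sorted((m, n)))  # domain pairs are unordered
        p_no_pair_interacts *= 1.0 - lam.get(key, 0.0)
    return 1.0 - p_no_pair_interacts


# Hypothetical domain-domain interaction probabilities (not from the paper).
lam = {("D1", "D2"): 0.5, ("D2", "D3"): 0.2}

# Protein A carries domains D1 and D3; protein B carries domain D2.
p = protein_interaction_probability(["D1", "D3"], ["D2"], lam)
print(round(p, 3))  # 1 - (0.5 * 0.8), i.e. about 0.6
```

In a likelihood approach of the kind described, the entries of lam would be the parameters estimated from the observed interaction data rather than fixed by hand as they are here.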