Singh Rohit, Xu Jinbo, Berger Bonnie
Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
Pac Symp Biocomput. 2006:403-14.
This paper presents a framework for predicting protein-protein interactions (PPIs) that integrates structure-based information with other functional annotations, such as Gene Ontology (GO) terms, co-expression, and co-localization. Given two protein sequences, the structure-based prediction technique threads both sequences onto every protein complex in the PDB and selects the best potential match. Structural information derived from this match is then fed into a logistic regression model to estimate the probability that the two proteins interact. The paper also describes a random forest classifier that combines the structure-based predictions with the other functional annotations to predict protein interactions. Experimental results indicate that the structure-based method has greater predictive power than many other information sources, and that combining it with those sources yields better performance than omitting structural information. We also tested the method on a set of approximately 1000 yeast genes; interestingly, the predicted interaction network is scale-free. The method also predicted some potential interactions involving yeast homologs of human disease-related proteins.
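The structure-based step described in the abstract (choose the best template match, then map it to an interaction probability with logistic regression) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the `TemplateMatch` fields, the selection rule in `best_match`, and the weights and bias are all hypothetical, since the abstract does not say which structural features or coefficients are used.

```python
import math
from dataclasses import dataclass


@dataclass
class TemplateMatch:
    """One threading result for a protein pair against one PDB complex.

    All fields are hypothetical stand-ins for whatever structural
    features the real method extracts.
    """
    complex_id: str         # placeholder identifier of the template complex
    score_a: float          # threading score of protein A on one chain
    score_b: float          # threading score of protein B on the partner chain
    interface_score: float  # score of the modelled interface


def best_match(matches):
    """Pick the template with the highest combined score (assumed rule)."""
    return max(matches, key=lambda m: m.score_a + m.score_b + m.interface_score)


def interaction_probability(match, weights, bias):
    """Logistic regression: sigmoid of a weighted sum of the match features."""
    features = (match.score_a, match.score_b, match.interface_score)
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Made-up numbers purely for illustration; real coefficients would be
# fitted to known interacting and non-interacting protein pairs.
candidates = [
    TemplateMatch("template_1", 1.2, 0.8, 0.5),
    TemplateMatch("template_2", 2.0, 1.5, 1.1),
    TemplateMatch("template_3", 0.3, 0.4, 0.1),
]
WEIGHTS = (0.6, 0.6, 1.0)
BIAS = -2.5

chosen = best_match(candidates)
probability = interaction_probability(chosen, WEIGHTS, BIAS)
print(chosen.complex_id, round(probability, 3))
```

In a pipeline like the one the abstract outlines, a probability of this kind would presumably become one input feature, alongside GO, co-expression, and co-localization evidence, to the random forest classifier.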