Chou Kuo-Chen, Cai Yu-Dong
Gordon Life Science Institute, 13784 Torrey Del Mar, San Diego, California 92130, USA.
J Proteome Res. 2006 Feb;5(2):316-22. doi: 10.1021/pr050331g.
To understand the networks in living cells, it is essential to identify protein-protein interactions on a genomic scale. Unfortunately, doing so by experiment alone is both time-consuming and expensive, because the complexity of the problem is overwhelming; as the saying goes, "life is complicated". Computational techniques for predicting protein-protein interactions would therefore be of significant value. By fusing the gene ontology approach with the pseudo-amino acid composition approach, a predictor called "GO-PseAA" was established to deal with this problem. As a showcase, predictions were made for 6323 protein pairs from yeast. To avoid redundancy and homology bias, none of the protein pairs investigated has ≥40% sequence identity with any other. The overall success rate obtained by jackknife cross-validation was 81.6%, indicating that the GO-PseAA predictor is very promising for predicting protein-protein interactions from protein sequences and might become a useful vehicle for studying network biology in the postgenomic era.
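As a rough illustration of the jackknife (leave-one-out) cross-validation used to report the overall success rate, the sketch below holds out each protein pair in turn, predicts its label from the remaining pairs, and counts the fraction predicted correctly. Everything beyond the jackknife loop itself is an assumption made for illustration: the abstract does not specify the classifier or the exact feature encoding, so a nearest-neighbor rule stands in for the predictor, plain 20-dimensional amino acid composition stands in for the full pseudo-amino acid composition (which also carries sequence-order terms), the gene ontology component is omitted, and the sequences and the names `composition`, `pair_features`, `nearest_label`, and `jackknife_success_rate` are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-D amino acid frequency vector of a sequence.

    Simplification (assumption): plain composition only, without the
    sequence-order terms of full pseudo-amino acid composition.
    """
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def pair_features(seq_a, seq_b):
    """Encode a protein pair by concatenating both compositions (40-D)."""
    return composition(seq_a) + composition(seq_b)


def nearest_label(query, samples):
    """Stand-in classifier (assumption): label of the closest sample."""
    best = min(samples, key=lambda s: math.dist(query, s[0]))
    return best[1]


def jackknife_success_rate(samples):
    """Leave each sample out once, predict it from the rest, count hits."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_label(features, rest) == label:
            hits += 1
    return hits / len(samples)


# Hypothetical toy data: (sequence A, sequence B, 1 = interacting, 0 = not).
toy_pairs = [
    ("AAAAKK", "AAKKAA", 1),
    ("AAKAKA", "KAAAAK", 1),
    ("WWYYFF", "GGGPPP", 0),
    ("WYWFYF", "GPGPGP", 0),
]
samples = [(pair_features(a, b), label) for a, b, label in toy_pairs]
rate = jackknife_success_rate(samples)
print(f"jackknife success rate: {rate:.3f}")
```

Because every sample is tested exactly once on a model that never saw it, the jackknife rate has no dependence on a random train/test split, which makes it a natural choice for reporting a single overall figure on a fixed benchmark set.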