Rao Raghuraj, Tun Kyaw, Lakshminarayanan Samavedham, Dhar Pawan K
Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore.
In Silico Biol. 2009;9(4):179-94.
The computational prediction of protein-protein interactions (PPI) is an essential complement to direct experimental evidence. Traditional approaches rely on surface properties that are scarcely available or themselves computationally predicted, show database-specific performance, and are computationally expensive for large-scale datasets. Several sensitivity and specificity issues remain. Here, we report a novel method based on 'Amino-acid Residue Associations' (ARA) among interacting proteins, which utilizes the accurate and easily available primary sequence. Large-scale PPI datasets for six model species (from E. coli to human) were studied. The ARA method shows up to 73% sensitivity and 78% specificity. Furthermore, the method performs remarkably well in terms of stability and generalizability. Benchmarked against existing prediction techniques, the ARA method shows a performance improvement of up to 25%. The ability of the ARA method to predict PPI across species and across databases is also demonstrated. Overall, the ARA method provides a significant improvement over existing ones in correctly identifying large-scale protein-protein interactions, irrespective of the data resource, network size or organism.
The MATLAB code for the ARA approach will be made available upon request.