Kiel Christina, Serrano Luis
EMBL-CRG Systems Biology Unit, CRG-Centre de Regulacio Genomica, Dr Aiguader 88, 08003 Barcelona, Spain.
Bioinformatics. 2007 Sep 1;23(17):2226-30. doi: 10.1093/bioinformatics/btm336. Epub 2007 Jun 28.
One of the more challenging problems in biology is determining the cellular protein interaction network. Progress has been made in predicting protein-protein interactions from structural information, on the assumption that structurally similar proteins interact in a similar way. In a previous publication, we determined a genome-wide Ras-effector interaction network based on homology models, which predicted binding and non-binding domains with high accuracy. However, for predictions on a genome-wide scale, homology modelling is time-consuming. We therefore developed a faster method using position energy matrices: starting from different Ras-effector X-ray template structures, each amino acid in the effector binding domain is mutated in turn to every other amino acid residue, and the effect on binding energy is calculated. These pre-calculated matrices can then be used to score any Ras or effector sequence for binding.
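The scoring step can be illustrated with a minimal sketch. It assumes a position energy matrix is stored as one table per domain position, mapping each residue to a pre-calculated change in binding energy relative to the template. The names `toy_matrix` and `energy_sum`, the three-residue template and all energy values are hypothetical and chosen only for illustration; they are not taken from the ADAN matrices.

```python
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed layout: one dict per domain position, residue -> energy change.
PositionEnergyMatrix = List[Dict[str, float]]


def toy_matrix(template: str, default_penalty: float = 1.0) -> PositionEnergyMatrix:
    """Build a placeholder matrix: template residue scores 0.0, all others a flat penalty.

    Real matrices would hold energies computed from the X-ray template structures.
    """
    matrix = []
    for wild_type in template:
        column = {aa: default_penalty for aa in AMINO_ACIDS}
        column[wild_type] = 0.0
        matrix.append(column)
    return matrix


def energy_sum(sequence: str, matrix: PositionEnergyMatrix) -> float:
    """Sum the per-position energies for a sequence aligned to the template."""
    if len(sequence) != len(matrix):
        raise ValueError("sequence must be aligned to the matrix (same length)")
    return sum(column[residue] for residue, column in zip(sequence, matrix))


# Hypothetical three-residue template with one position-specific override.
matrix = toy_matrix("RKD")
matrix[1]["E"] = 2.5  # illustrative value for K -> E at position 2

template_score = energy_sum("RKD", matrix)
variant_score = energy_sum("AED", matrix)
```

Because the matrices are computed once per template, scoring a new sequence reduces to table lookups and a sum, which is why this is much faster than building a homology model for every candidate.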
Using position energy matrices, the sequences of putative Ras-binding domains can be scanned quickly to calculate an energy sum value. By calibrating energy sum values against quantitative experimental binding data, thresholds can be defined and non-binding domains can thus be excluded quickly. Sequences with energy sum values above this threshold are considered potential binding domains and can be analysed further using homology modelling. This prediction method could be applied to other protein families sharing conserved interaction types, to rapidly determine large-scale cellular protein interaction networks. It could therefore have an important impact on future in silico structural genomics approaches, particularly in view of the growing structural proteomics efforts that aim to determine all possible domain folds and interaction types.
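The calibration and filtering step might be sketched as follows. The abstract does not state how the threshold is derived from the binding data, so the rule used here (lowest energy sum among experimentally confirmed binders, minus a safety margin) is an assumption, as are the function names and all numbers. The sketch follows the abstract's wording that sequences scoring above the threshold are kept as potential binders.

```python
from typing import Iterable, List, Tuple


def calibrate_threshold(binder_scores: Iterable[float], margin: float = 0.5) -> float:
    """Assumed rule: place the threshold just below the weakest confirmed binder."""
    scores = list(binder_scores)
    if not scores:
        raise ValueError("at least one experimentally confirmed binder is required")
    return min(scores) - margin


def potential_binders(scored: Iterable[Tuple[str, float]], threshold: float) -> List[str]:
    """Keep domains whose energy sum lies above the threshold; the rest are excluded."""
    return [name for name, score in scored if score > threshold]


# Hypothetical energy sums for domains with known experimental binding.
threshold = calibrate_threshold([-1.0, 2.0, 3.5], margin=0.5)

# Hypothetical energy sums for uncharacterised putative Ras-binding domains.
candidates = potential_binders(
    [("domain_A", 0.0), ("domain_B", -2.0), ("domain_C", -1.5)],
    threshold,
)
```

Only the domains returned by `potential_binders` would then be passed on to the slower homology modelling step.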
All matrices are deposited in the ADAN database (http://adan-embl.ibmc.umh.es/).
Supplementary data are available at Bioinformatics online.