Frenkel-Morgenstern Milana, Magid Rachel, Eyal Eran, Pietrokovski Shmuel
Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.
BMC Bioinformatics. 2007 May 24;8 Suppl 5(Suppl 5):S6. doi: 10.1186/1471-2105-8-S5-S6.
Accurate prediction of intra-protein residue contacts from sequence information will allow the prediction of protein structures. Basic predictions of such specific contacts can be further refined by jointly analyzing predicted contacts, and by adding information on the relative positions of contacts in the protein primary sequence.
We introduce a method for graph analysis refinement of intra-protein contacts, termed GARP. Our previously presented intra-protein contact prediction method, which uses a pair-to-pair substitution matrix (P2PConPred), was used to test the GARP method. In our approach, the top contact predictions obtained by a basic prediction method were used as edges to create a weighted graph. The edges were scored by a mutual clustering coefficient that identifies highly connected graph regions, and by the density of edges between the sequence regions of the edge nodes. A test set of 57 proteins with known structures was used to determine contacts. GARP improves the accuracy of the P2PConPred basic prediction method in whole proteins from 12% to 18%.
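The graph rescoring idea described above can be illustrated with a minimal Python sketch. It is not the published GARP implementation: the Jaccard form of the mutual clustering coefficient, the `window` size, the way the two terms are combined with the base score, and all function names are assumptions chosen for illustration only.

```python
from collections import defaultdict


def build_graph(contacts):
    """Build neighbor sets and an edge set from (i, j, score) predictions."""
    neighbors = defaultdict(set)
    edges = set()
    for i, j, _ in contacts:
        neighbors[i].add(j)
        neighbors[j].add(i)
        edges.add((min(i, j), max(i, j)))
    return neighbors, edges


def mutual_clustering(neighbors, i, j):
    """Jaccard-style coefficient (assumed form): shared / total neighbors of i and j."""
    ni = neighbors.get(i, set()) - {j}
    nj = neighbors.get(j, set()) - {i}
    union = ni | nj
    return len(ni & nj) / len(union) if union else 0.0


def region_density(edges, i, j, window=1):
    """Fraction of residue pairs near (i, j) in sequence that are predicted contacts."""
    count = 0
    for a in range(i - window, i + window + 1):
        for b in range(j - window, j + window + 1):
            if (min(a, b), max(a, b)) in edges:
                count += 1
    return count / (2 * window + 1) ** 2


def rescore(contacts, window=1):
    """Re-rank predictions; the combination rule here is a hypothetical choice."""
    neighbors, edges = build_graph(contacts)
    rescored = []
    for i, j, score in contacts:
        mcc = mutual_clustering(neighbors, i, j)
        dens = region_density(edges, i, j, window)
        rescored.append((i, j, score * (1.0 + mcc + dens)))
    return sorted(rescored, key=lambda edge: edge[2], reverse=True)


# Toy input: three mutually supporting contacts and one isolated contact.
toy_contacts = [(1, 10, 0.5), (1, 11, 0.5), (10, 11, 0.5), (30, 50, 0.9)]
ranked = rescore(toy_contacts)
```

In this toy input the isolated prediction starts with the highest base score, but the three mutually supporting predictions gain from both graph terms, which mirrors the intuition that jointly consistent contacts are more credible than isolated ones.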
Using a simple approach, we increased the contact prediction accuracy of a basic method 1.5-fold. Our graph approach is simple to implement, can be used with various basic prediction methods, and can provide input for further downstream analyses.