Iwadate Mitsuo, Kanou Kazuhiko, Terashi Genki, Umeyama Hideaki, Takeda-Shitaka Mayuko
Department of Biological Sciences, Chuo University, Japan.
Chem Pharm Bull (Tokyo). 2010 Jan;58(1):1-10. doi: 10.1248/cpb.58.1.
We have devised a power function (PF) that predicts the accuracy of a three-dimensional (3D) structure model of a protein from amino acid sequence alignments alone. The PF has three components: (1) the length of the model, (2) the percent sequence identity (homology), and (3) the rate of agreement between the PSI-PRED secondary structure prediction and the secondary structure assignment of the reference protein. The PF value is computed from the amino acid sequence alignments produced by homology search tools such as FASTA or the various BLAST programs. The global distance test-total score (GDT_TS) of the protein model selected by PF score correlates highly with GDT_TS(MAX), the value used as an index of protein modeling accuracy in the international contest Critical Assessment of Techniques for Protein Structure Prediction (CASP). The PF method is therefore valuable for constructing a highly accurate model without the wasteful calculations of homology modeling, which normally proceeds iteratively by moving the main chain and side chains. Moreover, a more accurate model can be obtained by combining the models ranked by PF score with the models ranked by CIRCLE score. CIRCLE is a 3D-1D program that estimates energetic stabilization from the experimentally observed environment of each amino acid residue in protein solution or protein crystals.
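The three PF inputs named in the abstract can each be read off an alignment, which the following minimal sketch illustrates. The abstract does not state the actual functional form or coefficients of the PF, so `toy_pf_score`, its product-of-powers shape, and the exponents `a`, `b`, `c` are purely hypothetical placeholders, not the authors' formula. The helper names and the convention of marking gaps with `-` are likewise assumptions made for illustration.

```python
def aligned_pairs(seq_a, seq_b, gap="-"):
    """Return the column pairs in which neither sequence has a gap."""
    return [(x, y) for x, y in zip(seq_a, seq_b) if x != gap and y != gap]


def percent_identity(query_aln, template_aln):
    """Percent of gap-free aligned columns with identical residues."""
    pairs = aligned_pairs(query_aln, template_aln)
    if not pairs:
        return 0.0
    return 100.0 * sum(q == t for q, t in pairs) / len(pairs)


def ss_agreement(predicted_ss, reference_ss):
    """Fraction of gap-free aligned columns whose secondary structure
    states (e.g. H, E, C) agree between prediction and reference."""
    pairs = aligned_pairs(predicted_ss, reference_ss)
    if not pairs:
        return 0.0
    return sum(p == r for p, r in pairs) / len(pairs)


def toy_pf_score(length, identity_pct, ss_rate, a=1.0, b=1.0, c=1.0):
    """HYPOTHETICAL power-law combination of the three inputs.

    The product form and the exponents a, b, c are placeholders chosen
    for illustration; the published PF is not given in the abstract.
    """
    return (length ** a) * ((identity_pct / 100.0) ** b) * (ss_rate ** c)


# Made-up alignment rows and secondary structure strings, for illustration only.
query_aln = "ACDE-FGH"
template_aln = "ACDQWFG-"
predicted_ss = "HHHHCCEE"   # stands in for a PSI-PRED prediction
reference_ss = "HHHCCCEE"   # stands in for the reference protein's assignment

length = len(aligned_pairs(query_aln, template_aln))
identity = percent_identity(query_aln, template_aln)
agreement = ss_agreement(predicted_ss, reference_ss)
print(length, identity, agreement, toy_pf_score(length, identity, agreement))
```

In practice the alignment rows would come from FASTA or BLAST output and the predicted states from PSI-PRED; candidate templates could then be ranked by the resulting score before any model is built.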