Lee Seung Yup, Skolnick Jeffrey
Center for the Study of Systems Biology, Georgia Institute of Technology, Atlanta, Georgia 30318, USA.
Biophys J. 2008 Aug;95(4):1956-64. doi: 10.1529/biophysj.108.129759. Epub 2008 May 16.
To improve tertiary structure predictions of more difficult targets, the next generation of TASSER, TASSER_2.0, has been developed. TASSER_2.0 incorporates more accurate side-chain contact restraint predictions from a new approach, the composite-sequence method, based on consensus restraints generated by an improved threading algorithm, PROSPECTOR_3.5, which uses computationally evolved and wild-type template sequences as input. TASSER_2.0 was tested on a large-scale benchmark set of 2591 nonhomologous, single-domain proteins of ≤200 residues that cover the Protein Data Bank at 35% pairwise sequence identity. Compared with the average fraction of accurately predicted side-chain contacts of 0.37 obtained using PROSPECTOR_3.5 with wild-type template sequences, the composite-sequence method raises the average accuracy to 0.60. The resulting TASSER_2.0 models are closer to their native structures, with an average root mean-square deviation (RMSD) of 4.99 Å compared with 5.31 Å for TASSER. When a successful prediction is defined as a model with an RMSD to native <6.5 Å, the success rate of TASSER_2.0 is 74.3% for Medium targets (good templates/poor alignments), versus 64.7% for TASSER, and 40.8% for Hard targets (incorrect templates/alignments), versus 35.5% for TASSER. For Easy targets (good templates/alignments), the success rate increases slightly, from 86.3% to 88.4%.
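The two summary metrics quoted in the abstract can be illustrated with a minimal Python sketch. It assumes that the success rate is the fraction of models whose RMSD to native is strictly below 6.5 Å, and that contact accuracy is the fraction of predicted residue-pair contacts that also occur in the native structure; the abstract does not spell out the latter definition, so this reading is an assumption. The function names and the toy values in the usage note are hypothetical and are not data from the paper.

```python
from typing import Iterable, Set, Tuple

RMSD_CUTOFF = 6.5  # Å, the success threshold stated in the abstract

Contact = Tuple[int, int]  # a pair of residue indices (hypothetical representation)


def success_rate(rmsds: Iterable[float], cutoff: float = RMSD_CUTOFF) -> float:
    """Fraction of models whose RMSD to native is strictly below the cutoff."""
    values = list(rmsds)
    if not values:
        raise ValueError("need at least one RMSD value")
    return sum(1 for r in values if r < cutoff) / len(values)


def contact_accuracy(predicted: Set[Contact], native: Set[Contact]) -> float:
    """Fraction of predicted contacts also present in the native structure.

    Assumption: accuracy is treated as precision over predicted pairs,
    and a contact (i, j) is considered identical to (j, i).
    """
    def normalize(contact: Contact) -> Contact:
        return (min(contact), max(contact))

    predicted_pairs = {normalize(c) for c in predicted}
    native_pairs = {normalize(c) for c in native}
    if not predicted_pairs:
        return 0.0
    return len(predicted_pairs & native_pairs) / len(predicted_pairs)
```

For example, `success_rate([3.2, 5.9, 6.5, 8.1])` counts only the first two toy models as successes, because 6.5 Å is not strictly below the cutoff. `contact_accuracy({(1, 5), (2, 8), (3, 9)}, {(5, 1), (2, 8)})` counts two of the three predicted pairs as correct.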
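To improve tertiary structure prediction for more challenging targets, the next generation of TASSER, TASSER_2.0, has been developed. TASSER_2.0 uses a new approach, the composite-sequence method, to incorporate more accurate side-chain contact restraint predictions; the method is based on consensus restraints generated by an improved threading algorithm, PROSPECTOR_3.5, which takes computationally evolved and wild-type template sequences as input. TASSER_2.0 was tested on a large-scale benchmark set of 2591 nonhomologous, single-domain proteins of ≤200 residues that cover the Protein Data Bank at 35% pairwise sequence identity. Compared with the average fraction of accurately predicted side-chain contacts of 0.37 for PROSPECTOR_3.5 with wild-type template sequences, the composite-sequence method raises the average accuracy to 0.60. The resulting TASSER_2.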