Kihara Daisuke, Zhang Yang, Lu Hui, Kolinski Andrzej, Skolnick Jeffrey
Laboratory of Computational Genomics, Donald Danforth Plant Science Center, 975 North Warson Road, St. Louis, MO 63132, USA.
Proc Natl Acad Sci U S A. 2002 Apr 30;99(9):5993-8. doi: 10.1073/pnas.092135699. Epub 2002 Apr 16.
An ab initio protein structure prediction procedure, TOUCHSTONE, was applied to all 85 small proteins of the Mycoplasma genitalium genome. TOUCHSTONE is based on Monte Carlo refinement of a lattice model of proteins that uses threading-based tertiary restraints. These restraints are derived by extracting consensus contacts and local secondary structure from at least weakly scoring structures that, in some cases, may lack any global similarity to the sequence of interest. The native fold was selected using two criteria: convergence of the simulations from two different conformational search schemes, and the lowest-energy structure under a knowledge-based, atomic-detail potential. For 34 of the 85 proteins, those with significant threading hits, the template structures were reasonably well reproduced. Of the remaining 51 proteins, 29 converged to five or fewer clusters. In the test set, 84.8% of the proteins that converged to five or fewer clusters had a correct fold among the clusters. If this statistic is simply applied, 24 proteins (84.8% of the 29 proteins) may have correct folds. Thus, the topology of a total of 58 proteins (34 + 24) has probably been predicted correctly. These results suggest that ab initio protein structure prediction is becoming a practical approach.
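The counts in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of TOUCHSTONE. The variable and function names are hypothetical, and rounding down is an assumption made here because 84.8% of 29 is about 24.6, while the abstract reports 24.

```python
import math

# Figures quoted in the abstract.
TOTAL_PROTEINS = 85          # small proteins in the M. genitalium set
WITH_THREADING_HITS = 34     # templates reasonably well reproduced
CONVERGED = 29               # of the rest, converged to five or fewer clusters
TEST_SET_SUCCESS_RATE = 0.848  # fraction with a correct fold in the test set


def estimate_correct_folds():
    """Reproduce the abstract's estimate of correctly predicted topologies."""
    remaining = TOTAL_PROTEINS - WITH_THREADING_HITS
    # Expected number of correct folds among the converged proteins.
    expected = TEST_SET_SUCCESS_RATE * CONVERGED
    # Assumption: round down, which matches the 24 reported in the abstract.
    estimated_correct = math.floor(expected)
    total_correct = WITH_THREADING_HITS + estimated_correct
    return {
        "remaining": remaining,
        "expected": expected,
        "estimated_correct": estimated_correct,
        "total_correct": total_correct,
        "fraction_of_genome_set": total_correct / TOTAL_PROTEINS,
    }


if __name__ == "__main__":
    for name, value in estimate_correct_folds().items():
        print(f"{name}: {value}")
```

Under these assumptions the estimate covers 58 of the 85 proteins, roughly two thirds of the set. The 24 extrapolated proteins are a statistical expectation, not individually verified folds.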